Research:Rolling recurring old active editor
- = 5 edits
- = 30 days
SET @n = 5; /* edits threshold */
SET @u = 30; /* activity unit in days */
SET @T = "20140102"; /* February 1st, 2014 before midnight */
/* Build a table of active editors from last month */
CREATE TEMPORARY TABLE staging.last_active
SELECT
rev_user
FROM (
SELECT
rev_user
FROM revision
WHERE rev_timestamp BETWEEN
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
UNION ALL
SELECT
ar_user AS rev_user
FROM archive
WHERE ar_timestamp BETWEEN
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
) AS split_revisions
GROUP BY rev_user
HAVING COUNT(*) >= @n;
/* Build a table of active editors from current month */
CREATE TEMPORARY TABLE staging.current_active
SELECT
rev_user
FROM (
SELECT
rev_user
FROM revision
WHERE rev_timestamp BETWEEN
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
UNION ALL
SELECT
ar_user AS rev_user
FROM archive
WHERE ar_timestamp BETWEEN
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
) AS split_revisions
GROUP BY rev_user
HAVING COUNT(*) >= @n;
/* Find the intersection and filter the newbies & bots out */
SELECT
user_id,
user_name,
user_registration
FROM staging.current_active
LEFT JOIN staging.last_active USING (rev_user)
INNER JOIN user ON rev_user = user_id
WHERE
user_registration <
DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
rev_user NOT IN (SELECT ug_user FROM user_groups WHERE ug_group = "bot");
Rolling recurring old active editors are non-bot editors who registered more than 2 time periods ago and completed a minimum number of edits in the last two time periods.
Discussion
[edit]The n edits threshold
[edit]The edits necessary to cross this threshold, the fewer editors will meet the criteria. Historically, "active editors" were considered to be editors who made 5 edits to specific types of pages over the course of a calendar month. So for historical purposes, we recommend .
The u activity unit
[edit]The wider this unit is set, the more casual editors will be considered "active". Historically, "active editors" were considered during calendar months. So for historical purposes, we recommend .
Time lag
[edit]This metric has a built-in time lag of for the time period in question.
Bot filtering
[edit]Bots are filtered using the bot flag method.
Discussion
[edit]The n edits threshold
[edit]The u activity unit
[edit]Time lag
[edit]Plain word definition
[edit]In plain words (using the default values) a '
Analysis
[edit]Discussion
[edit]Notes
[edit]In plain words (using the default values) a 'Rolling recurrent old active editor' is an editor that registered before 2 months ago and completed 5 edits in the last month and another 5 in the month prior to that one. Month is defined not as a calendar month but rather 30 calendar days.