Research talk:Autoconfirmed article creation trial/Work log/2018-01-15
Monday, January 15, 2018
Today I aim to briefly revisit our patroller statistics, look at our AfC and Draft hypotheses, and crunch data on the quality of recent article creations.
Patroller statistics
There has lately been a New Page Patrol backlog drive, so I wanted to quickly revisit some of our statistics to understand how the drive has affected them. First, a plot of the size of the NPP backlog:
The dotted line on the graph shows the start of ACTRIAL. As we can see, there was a large drop before the start of the trial, then another fairly large drop a couple of weeks into ACTRIAL as patrollers worked through the most recent creations. The backlog then stayed fairly stable, slowly increasing until about mid-December, after which it has decreased consistently. This can be attributed partly to the recruitment of new patrollers, partly to activity by existing patrollers, and partly to a backlog drive that started on January 1, 2018.
We are interested in understanding how this recent activity is reflected in our statistics on the number of active patrollers and the proportion of work done. First, the number of active patrollers:
There are several inflection points in this graph. A key one is the introduction of the patroller user right in November 2016, where the number of active patrollers drops from about 125 to below 100. The number later appears to stabilize around 75. Another key point is the drop after the introduction of ACTRIAL, where it gets down to almost 50. Lastly, we see an increase in the final months of 2017, likely due to the recruitment of new patrollers.
The graph above shows the proportion of all patrol actions performed by the most active quartile of patrollers on a given day. It is similar to a plot of the Gini coefficient, an often-used measure of inequality in distributions. In our case, it suggests that shortly after ACTRIAL started, the distribution of effort amongst patrollers evened out a little, but that the vast majority of the work was still done by a small number of participants. We can also see an upward trend towards the end of the graph, suggesting that the backlog drive and the recent influx of patrollers have not changed how work is distributed. Instead, patrolling continues to be work where a small group of people are the key contributors.
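The daily top-quartile share can be sketched as follows (a minimal sketch; the input format and helper name are assumptions, not the actual analysis code):

```python
from collections import Counter

def top_quartile_share(patrol_actions):
    """Proportion of patrol actions performed by the most active
    quartile of patrollers, given one username per patrol action
    (hypothetical input format)."""
    counts = sorted(Counter(patrol_actions).values(), reverse=True)
    # Top 25% of patrollers, rounded up so at least one is included.
    k = max(1, -(-len(counts) // 4))
    return sum(counts[:k]) / len(patrol_actions)

# Four patrollers; "alice" alone forms the top quartile and does 6 of 10 actions.
actions = ["alice"] * 6 + ["bob"] * 2 + ["carol"] + ["dave"]
print(top_quartile_share(actions))  # 0.6
```

Computing this per day and plotting it over time yields the inequality curve discussed above.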
Article quality in second half of 2017
Our quality analysis from January 8 only looked at data up until July 1, 2017. We are interested in understanding how content quality has developed since ACTRIAL started, and therefore used our article creation data in the "log" database (which also feeds the page creation dashboard) to make predictions for more recent articles. As before, we grabbed creations of non-redirect pages in the Main namespace and used ORES' draft quality and article quality models to make predictions. We also created a dataset of all non-redirect, non-autopatrolled Main namespace creations by accounts less than 30 days old, then combined the two datasets as before in order to understand more about the content created by recently registered accounts.
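The scoring step can be sketched as a batch request to the ORES v3 API; the helper names below are assumptions, and only the response parsing is shown executing (the actual pipeline code is not part of this log):

```python
from urllib.parse import urlencode

ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"

def ores_request_url(rev_ids, models="draftquality|articlequality"):
    """Build an ORES v3 batch-scoring URL for a set of revision IDs."""
    query = urlencode({"models": models,
                       "revids": "|".join(map(str, rev_ids))})
    return ORES_URL + "?" + query

def is_ok(score):
    """True if the draft quality model predicts 'OK' for one revision's
    score dict (i.e. not spam, vandalism, or attack)."""
    return score["draftquality"]["score"]["prediction"] == "OK"

# Fragment shaped like one revision's entry in an ORES v3 response:
sample = {"draftquality": {"score": {
    "prediction": "OK",
    "probability": {"OK": 0.92, "spam": 0.03, "vandalism": 0.02, "attack": 0.03}}}}
print(ores_request_url([123, 456]))
print(is_ok(sample))  # True
```

Fetching the built URL and walking the `enwiki.scores` section of the JSON response gives one such score dict per revision.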
First of all, we look at the proportion of revisions that were retrievable through the API, meaning the page was not deleted in a way that makes the revision inaccessible:
The dotted line shows the start of ACTRIAL, and as we can see it marks a shift upwards in the proportion of retrievable revisions. At the same time, we note that variance increases as fewer articles are created per day, and there are days during ACTRIAL where the proportion dips down to previous levels. This suggests that raising the bar for article creation does, to some extent, improve the articles that get created, but does not solve the problem entirely.
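The retrievability check can be sketched against the MediaWiki action API, which lists revisions it cannot return under `badrevids`; the helper names here are hypothetical, and only the response parsing is shown executing:

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

def retrievability_query_url(rev_ids):
    """Build an API query for a batch of revision IDs; deleted or
    suppressed revisions come back under 'badrevids'."""
    params = {"action": "query", "prop": "revisions",
              "revids": "|".join(map(str, rev_ids)),
              "rvprop": "ids", "format": "json"}
    return API_URL + "?" + urlencode(params)

def retrievable_proportion(response, n_requested):
    """Share of the requested revisions the API could return."""
    bad = len(response.get("query", {}).get("badrevids", {}))
    return (n_requested - bad) / n_requested

# One of four requested revisions was deleted:
response = {"query": {"badrevids": {"123": {"revid": 123, "missing": ""}}}}
print(retrievable_proportion(response, 4))  # 0.75
```

Aggregating this proportion per creation day produces the time series plotted above.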
The graph above shows the proportion of retrievable revisions that are predicted to be "OK" by the ORES draft quality model. Here we can again clearly see a shift with the introduction of ACTRIAL, and again some dips down to previous levels. Overall, this measurement appears more stable than the previous graph of retrievable revisions, making ACTRIAL's impact on it stand out more clearly.
Lastly, we have the graph above showing the average weighted quality sum based on the ORES article quality model's predictions. We only predict the quality of revisions that are also predicted as "OK" by the ORES draft quality model, meaning they would likely not be up for speedy deletion. Here, ACTRIAL appears to have no effect. The variance is higher, likely due to the lower volume of content created, but the quality of content created by newly registered accounts appears to be largely the same. This suggests that ACTRIAL's effect on the produced content relates largely to characteristics other than general content quality; it is instead a question of whether the content meets the bar for inclusion in the encyclopedia.
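The weighted quality sum can be sketched as the probability-weighted average over the ORES article quality classes; mapping Stub=0 through FA=5 is an assumption about the exact weights used in the analysis:

```python
# Assumed weights for the six enwiki article quality classes (Stub=0 .. FA=5);
# the precise weighting used in the log's analysis is not stated.
QUALITY_WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}

def weighted_quality_sum(probabilities):
    """Expected quality level: the sum of each class weight times the
    probability the ORES article quality model assigns to that class."""
    return sum(QUALITY_WEIGHTS[cls] * p for cls, p in probabilities.items())

# A revision the model thinks is most likely Start class:
probs = {"Stub": 0.2, "Start": 0.5, "C": 0.2, "B": 0.05, "GA": 0.03, "FA": 0.02}
print(weighted_quality_sum(probs))  # 1.27
```

Averaging this value over all "OK" revisions created on a given day yields the daily series plotted above.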