Research talk:Autoconfirmed article creation trial/Work log/2018-02-12
Tuesday, February 13, 2018
Today I'll wrap up the related measure for H5 so that it's completed.
H5: Further segmentation
We wish to investigate how ACTRIAL affects newly registered users who create articles, in comparison to those who start out by editing existing content. In our December 18 work log, we started diving into this data by looking at survival for newly registered accounts that created articles and/or drafts. Because a "surviving editor" is defined as someone who edits in both weeks 1 and 5 after registering, we limited our dataset to accounts that created an article or draft in the first week. In our December 11 work log, we looked at how quickly after registration article and draft creations occur for accounts that create one during their first 30 days, and found that creation generally happens within the first 24 hours. Focusing on accounts that create an article or draft in the first week therefore captures most of that creation, in addition to fitting the definition of "surviving editor" noted above.
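To make the "surviving editor" definition concrete, here is a minimal sketch in Python. It is an illustration rather than the code used for this analysis: the function name `is_surviving_editor` is hypothetical, and it assumes that week 1 covers days 0 to 6 and week 5 covers days 28 to 34 after registration, since the exact boundaries are not spelled out here.

```python
from datetime import datetime, timedelta


def is_surviving_editor(registration, edit_timestamps):
    """Return True if the account edited in both week 1 and week 5.

    Assumption: week 1 is the interval [0, 7) days and week 5 is the
    interval [28, 35) days after registration.
    """
    week1_end = registration + timedelta(days=7)
    week5_start = registration + timedelta(days=28)
    week5_end = registration + timedelta(days=35)

    edited_week1 = any(registration <= ts < week1_end for ts in edit_timestamps)
    edited_week5 = any(week5_start <= ts < week5_end for ts in edit_timestamps)
    return edited_week1 and edited_week5


# Hypothetical account registered on the first day of ACTRIAL-era data.
registered = datetime(2017, 9, 15)
edits = [registered + timedelta(days=1), registered + timedelta(days=30)]
print(is_surviving_editor(registered, edits))
```

Under these assumptions, an account that only edits in its first week, or only in its fifth, does not count as surviving.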
Because ACTRIAL limits article creations, we focused our analysis on the Draft namespace. In our initial analysis, we found a significant decrease in survival rate during the first 1.5 months of ACTRIAL. We updated that result on January 17 based on the hypothesis that accounts creating pages in the Draft namespace during ACTRIAL could be seen as a combination of the accounts creating drafts prior to ACTRIAL plus a random sample of accounts creating articles prior to ACTRIAL. With that adjustment, we found that the survival rate of those who create articles/drafts during their first week is unchanged during ACTRIAL.
Now we are interested in understanding how ACTRIAL affects those that do not create an article or draft. We examine both accounts that do not create an article/draft in their first week, and those that do not create an article/draft in their first five weeks, thus spanning both weeks found in the definition of a "surviving editor". We first look at a historical plot of those who do not create an article/draft during their first week:
Generally, the plot above looks fairly similar to the survival plot from our analysis of overall survival on February 7. The survival rate of autocreated accounts appears to be fairly stable over time, with no clear pattern emerging. For non-autocreated accounts, the pattern of increased survival during the fall months is worth noting, as it is likely to affect our results. There also appears to be an increase in the survival rate shortly after the new year, although it is perhaps less pronounced.
Focusing in on the two most recent years and adding a line that shows the start of ACTRIAL can provide some further insight:
We can again see a large amount of variation and no apparent pattern in the data for autocreated accounts. For non-autocreated accounts, we again see the patterns of increased survival rate in the fall and right after the new year. We can also see a strong increase in survival for accounts registered shortly before ACTRIAL starts. The pattern during ACTRIAL appears to largely echo that of 2016, though.
Limiting the data to only accounts that do not create an article/draft during the first five weeks after registration does not noticeably change the picture. The patterns found in the graph for non-autocreated accounts are the same, and the lack of a pattern in the graph for autocreated accounts remains.
We next create four 2x2 contingency matrices, one for each combination of time span without article/draft creations (first week and first five weeks) and type of account creation (autocreated and non-autocreated). The matrices are as follows:
Autocreated accounts, no draft/article creations during the first week:
Period | Non-survivor | % | Survivor | % | Row total | % |
---|---|---|---|---|---|---|
Pre-ACTRIAL | 18,990 | 97.1% | 563 | 2.9% | 19,553 | 100.0% |
ACTRIAL | 3,866 | 96.5% | 139 | 3.5% | 4,005 | 100.0% |
Total | 22,856 | 97.0% | 702 | 3.0% | 23,558 | 100.0% |
Non-autocreated accounts, no draft/article creations during the first week:
Period | Non-survivor | % | Survivor | % | Row total | % |
---|---|---|---|---|---|---|
Pre-ACTRIAL | 424,059 | 97.7% | 10,009 | 2.3% | 434,068 | 100.0% |
ACTRIAL | 84,376 | 97.0% | 2,645 | 3.0% | 87,021 | 100.0% |
Total | 508,435 | 97.6% | 12,654 | 2.4% | 521,089 | 100.0% |
Autocreated accounts, no draft/article creations during the first five weeks:
Period | Non-survivor | % | Survivor | % | Row total | % |
---|---|---|---|---|---|---|
Pre-ACTRIAL | 20,281 | 97.0% | 628 | 3.0% | 20,909 | 100.0% |
ACTRIAL | 4,021 | 96.4% | 149 | 3.6% | 4,170 | 100.0% |
Total | 24,302 | 96.9% | 777 | 3.1% | 25,079 | 100.0% |
Non-autocreated accounts, no draft/article creations during the first five weeks:
Period | Non-survivor | % | Survivor | % | Row total | % |
---|---|---|---|---|---|---|
Pre-ACTRIAL | 444,137 | 97.6% | 10,783 | 2.4% | 454,920 | 100.0% |
ACTRIAL | 86,333 | 96.9% | 2,772 | 3.1% | 89,105 | 100.0% |
Total | 530,470 | 97.5% | 13,555 | 2.5% | 544,025 | 100.0% |
As we did for draft creators, we use a Chi-square goodness-of-fit test to compare the ACTRIAL survival rate against the pre-ACTRIAL survival rate, using the latter as the "expected rate". The results are as follows:
User group | X2 | P-value |
---|---|---|
Autocreated accounts, first week | 5.01 | 0.025 |
Non-autocreated accounts, first week | 207.91 | << 0.001 |
Autocreated accounts, first five weeks | 4.64 | 0.031 |
Non-autocreated accounts, first five weeks | 211.21 | << 0.001 |
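The test statistics above can be recomputed from the contingency matrices with a short sketch. This is an illustration using only the Python standard library, not necessarily the code used to produce the table; the function name `chi_square_gof` and the `TABLES` layout are hypothetical. It assumes one degree of freedom (two outcome categories), in which case the p-value of a Chi-square statistic x equals erfc(sqrt(x / 2)).

```python
import math


def chi_square_gof(survivors, total, expected_rate):
    """Chi-square goodness-of-fit test with two categories.

    Compares observed survivor/non-survivor counts against the counts
    expected if the survival rate equalled `expected_rate`.
    Returns (statistic, p_value) for one degree of freedom.
    """
    expected_surv = total * expected_rate
    expected_non = total - expected_surv
    observed_non = total - survivors
    x2 = ((survivors - expected_surv) ** 2 / expected_surv
          + (observed_non - expected_non) ** 2 / expected_non)
    # Survival function of the Chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(x2 / 2))
    return x2, p_value


# (pre-ACTRIAL survivors, pre-ACTRIAL total), (ACTRIAL survivors, ACTRIAL total),
# copied from the four contingency matrices above.
TABLES = {
    "Autocreated accounts, first week": ((563, 19553), (139, 4005)),
    "Non-autocreated accounts, first week": ((10009, 434068), (2645, 87021)),
    "Autocreated accounts, first five weeks": ((628, 20909), (149, 4170)),
    "Non-autocreated accounts, first five weeks": ((10783, 454920), (2772, 89105)),
}

results = {}
for label, ((pre_surv, pre_total), (act_surv, act_total)) in TABLES.items():
    # The pre-ACTRIAL survival rate serves as the expected rate.
    results[label] = chi_square_gof(act_surv, act_total, pre_surv / pre_total)
    x2, p = results[label]
    print(f"{label}: X2 = {x2:.2f}, p = {p:.3g}")
```

Run against the matrices above, this sketch should agree with the tabulated X2 values up to rounding.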
For autocreated accounts, the survival rate is about 0.6 percentage points higher than in similar periods of the five years prior to ACTRIAL, and this is a statistically significant increase. For non-autocreated accounts, we find an increase of 0.7 percentage points in both cases, which is also statistically significant. This suggests that survival is higher during ACTRIAL. As we discussed in our February 7 work log, further analysis is needed to understand to what extent ACTRIAL is causing this change, or whether other factors are at play, given that we have seen increased retention in the fall of previous years.