Research talk:Onboarding new Wikipedians/Rollout/Work log/2014-03-13
Thursday, March 13th
I have two goals for today.
- Extend my analysis to other metrics.
- Increase sample sizes in order to see significant effects around a 2% change in proportions.
I plan to do these in parallel, but #2 is going to take a lot of processing time so I'll start with it. So, first, I need to know how many observations I'll need in order to get significance. Time for a power analysis. I expect around a 2% change based on Research:OB#Overview_of_results.
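A closed-form, normal-approximation version of this kind of power analysis can be sketched in Python. This is not the code that produced the results below; the `required_n` helper, α = 0.05, and 80% power are my assumptions, and an 80%-power requirement will generally demand larger samples than a single simulated draw happening to reach p < 0.05.

```python
from math import sqrt
from statistics import NormalDist

def required_n(baseline, change, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-proportion
    z-test (normal approximation) -- a sketch, not the original simulation."""
    p1, p2 = baseline, baseline + change
    p_bar = (p1 + p2) / 2
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_b = NormalDist().inv_cdf(power)          # power requirement
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return int(num / (p2 - p1) ** 2) + 1

# Sample sizes needed to detect a 2% change at various baselines:
for baseline in (0.05, 0.15, 0.25):
    print(baseline, required_n(baseline, 0.02))
```

Note that the required n grows with the baseline proportion (variance of a proportion increases toward 0.5), which matches the pattern in the table below.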
It looks like 2000 observations is a sweet spot where we can detect significant differences for most changes.
tests[n == 2000, list(baseline, change, p.value = round(p.value, 2), signif = p.value < 0.05),]

     baseline change p.value signif
  1:     0.05   0.01    0.19  FALSE
  2:     0.05   0.02    0.01   TRUE
  3:     0.05   0.03    0.00   TRUE
  4:     0.05   0.04    0.00   TRUE
  5:     0.05   0.05    0.00   TRUE
  6:     0.15   0.01    0.41  FALSE
  7:     0.15   0.02    0.09  FALSE
  8:     0.15   0.03    0.01   TRUE
  9:     0.15   0.04    0.00   TRUE
 10:     0.15   0.05    0.00   TRUE
 11:     0.25   0.01    0.49  FALSE
 12:     0.25   0.02    0.16  FALSE
 13:     0.25   0.03    0.03   TRUE
 14:     0.25   0.04    0.00   TRUE
 15:     0.25   0.05    0.00   TRUE
It looks like we'll need another 500 observations to identify significant effects at 2% for a baseline of about 15%. We'd have to push another 1500 (4k total) observations to have significance at a baseline of 25%.
I've bumped the sample up to 2k per wiki and kicked off the stats generation process again. --Halfak (WMF) (talk) 19:00, 13 March 2014 (UTC)
OK. Based on the old sample, here's the proportion of activated editors:
Same story here as before except that no wikis saw a significantly lower activation rate while ukwiki and itwiki saw significantly higher activation rates post deployment. --Halfak (WMF) (talk) 19:12, 13 March 2014 (UTC)
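The pre/post comparison of activation rates can be sketched as a pooled two-proportion z-test. The counts in the example are hypothetical, not real ukwiki or itwiki numbers, and this is a stand-in for whatever test actually produced the figures:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test with pooled variance -- a sketch
    of the kind of pre/post-deployment comparison described here."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 300/1500 activated pre-deployment, 380/1500 post.
z, p = two_proportion_z(300, 1500, 380, 1500)
print(round(z, 2), round(p, 4))
```

A positive z with p < 0.05 would correspond to a "significantly higher activation rate post deployment" in the sense used above.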
Now, I'd like to perform a similar test for my log-normally distributed data -- e.g. # of edits.
Now to look at the differences and compare significance.
The plot above is the result of t.tests performed on the log data, so the y axis reports differences in log space. The story here is similar to that of activation rates. Trends seem to be dominated by the proportion of new editors who make at least one edit. --Halfak (WMF) (talk) 19:26, 13 March 2014 (UTC)
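The actual analysis used R's t.test on log-transformed data; a rough Python equivalent of a Welch t-statistic in log space looks like the following. The log(x + 1) transform (to keep zero-edit users) and the large-sample normal approximation for the p-value are my assumptions, not necessarily what was done here:

```python
from math import log, sqrt
from statistics import NormalDist, mean, variance

def log_welch_t(edits_a, edits_b):
    """Welch t-statistic on log-transformed edit counts, with a p-value
    from the normal approximation (reasonable for large samples) --
    a sketch, not the original R t.test call."""
    a = [log(x + 1) for x in edits_a]  # log(x + 1) keeps zero-edit users
    b = [log(x + 1) for x in edits_b]
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(b) - mean(a)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p_value
```

Because the test runs on log counts, the resulting differences are in log space, which is why the y axis of the plot reports log differences.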
I just realized that I forgot to check what happens if I limit activation to editing in the main namespace only. Here we go:
Here we see a lot more wikis showing positive trends, but again we lack significance with this limited dataset. Looks like I should hold off until I have that larger sample to work with. --Halfak (WMF) (talk) 20:18, 13 March 2014 (UTC)
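The main-namespace restriction above amounts to filtering revisions to namespace 0 (the article namespace in MediaWiki) before counting a user as activated. A minimal sketch with hypothetical revision records:

```python
# Hypothetical revision records; namespace 0 is the main (article)
# namespace in MediaWiki. "Activated" here means >= 1 main-namespace edit.
revisions = [
    {"user": "A", "namespace": 0},
    {"user": "A", "namespace": 3},  # a user-talk edit does not count
    {"user": "B", "namespace": 3},
]

activated = {r["user"] for r in revisions if r["namespace"] == 0}
print(sorted(activated))  # → ['A']
```

Under this stricter definition, user B (who only edited a talk page) no longer counts as activated, which is why the activation proportions shift.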