Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-04-03
Thursday, April 3rd
Today I need to estimate the number of anon editors that we see in a 1-2 week period so that we can figure out how long our experiment will need to run. I'm going to assume boolean metrics and a 2-3% change. Luckily, I've already done this analysis for a previous study. See File:Onboarding.rollout.proportion_test.pvalue_by_observations.svg. It looks like we should have no problem identifying significant effects with 5000 users per condition. That means I need 15k users to cover the three conditions (control, pre-edit CTA & post-edit CTA).
Until I can make use of tokens, my best way to estimate the # of anon editors is to count unique IP address and user agent pairs over time using the cu_changes table (see mw:Extension:CheckUser). --Halfak (WMF) (talk) 15:52, 3 April 2014 (UTC)
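The sample-size reasoning above (boolean metrics, a 2-3% change, roughly 5000 users per condition) can be sanity-checked with the standard normal-approximation formula for comparing two proportions. This is a minimal sketch, not the analysis behind the linked figure: the 10% baseline rate, the reading of "2-3%" as absolute percentage points, and the 0.05 significance / 0.8 power settings are all assumptions, and `users_per_condition` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def users_per_condition(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate users needed per condition for a two-sided
    two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    # Standard deviation terms under the null and the alternative.
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(
        p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    )
    effect = abs(p_treatment - p_control)
    return math.ceil(((z_alpha * null_sd + z_power * alt_sd) / effect) ** 2)


BASELINE = 0.10  # hypothetical baseline rate, not taken from the study

for change in (0.02, 0.03):  # assumed absolute change in the boolean metric
    n = users_per_condition(BASELINE, BASELINE + change)
    print(f"+{change:.0%} change: ~{n} users per condition, "
          f"~{3 * n} across three conditions")
```

Under these assumed inputs the required counts should come out below 5000 per condition. Baseline rates closer to 50% raise the requirement, so the baseline is worth replacing with a measured value before relying on the result.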
Created project repo: https://github.com/halfak/Anonymous-phenomena
I'm hoping to use this for all anon studies -- not just the current one. --Halfak (WMF) (talk) 15:55, 3 April 2014 (UTC)
I made the following figure to make sure that the timeline of the experiment was clear (assuming a two-week observation period):
I'll have to modify it when I'm done with the power analysis. --Halfak (WMF) (talk) 15:55, 3 April 2014 (UTC)
> SELECT
->     COUNT(*) AS ip_agent_count,
->     COUNT(DISTINCT cuc_ip) AS ip_count
-> FROM (
->     SELECT
->         cuc_ip,
->         cuc_agent
->     FROM cu_changes
->     WHERE
->         cuc_user = 0 AND
->         cuc_timestamp BETWEEN "20140325" AND "20140401"
->     GROUP BY 1,2
-> ) AS unique_ip_agent;
+----------------+----------+
| ip_agent_count | ip_count |
+----------------+----------+
|          97847 |    92969 |
+----------------+----------+
1 row in set (5.15 sec)
So, if IP or IP/Agent is a good way of estimating the # of users, it looks like we'll have about 92k-97k anons to work with in one week of experimentation. We might even get twice that many attempting to edit. We don't want to do less than a week due to the periodic nature of weekends and weekdays. Time to update the experiment duration to reflect a week. --Halfak (WMF) (talk) 21:03, 3 April 2014 (UTC)
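As a rough check on the duration, the counts from the query above can be compared against the 15k-user target. This is a minimal sketch, assuming that anons arrive at a uniform rate across the week and are split evenly across the three conditions (neither is established here). `weekly_capacity` and the constant names are illustrative.

```python
# Counts from the cu_changes query above (one week starting 2014-03-25).
IP_AGENT_COUNT = 97847
IP_COUNT = 92969

CONDITIONS = 3               # control, pre-edit CTA, post-edit CTA
TARGET_PER_CONDITION = 5000  # from the earlier power estimate
DAYS_OBSERVED = 7


def weekly_capacity(users):
    """Return (users per condition in one week, days needed to reach
    the overall target), assuming a uniform arrival rate and an even
    split across conditions."""
    per_condition = users // CONDITIONS
    daily_rate = users / DAYS_OBSERVED
    days_to_target = CONDITIONS * TARGET_PER_CONDITION / daily_rate
    return per_condition, days_to_target


for label, users in (("IP", IP_COUNT), ("IP/agent", IP_AGENT_COUNT)):
    per_condition, days = weekly_capacity(users)
    print(f"{label}: ~{per_condition} users per condition per week, "
          f"15k target reached in ~{days:.1f} days")
```

Even if this estimate says the target is reached in well under a week, running the full week still covers the weekday/weekend cycle.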