Research talk:Revision scoring as a service/Work log/2015-07-22
Wednesday, July 22, 2015
White Cat asked me to check how many tasks lack any labels in our ongoing Wiki labels campaigns.
wikilabels=> SELECT campaign.id, wiki, COUNT(*)
             FROM campaign
             INNER JOIN task ON campaign_id = campaign.id
             LEFT JOIN label ON task_id = task.id
             WHERE task_id IS NULL
             GROUP BY campaign.id, wiki;
 id |  wiki  | count
----+--------+-------
  4 | enwiki |   602
  9 | frwiki | 19949
  6 | fawiki |   261
  3 | ptwiki |     4
  5 | trwiki |  1546
  8 | azwiki | 20000
  7 | ptwiki |  1058
(7 rows)
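For anyone who wants to see how the LEFT JOIN ... IS NULL pattern in that query picks out unlabeled tasks, here is a minimal sketch on a toy in-memory SQLite database. Only the table and column names that appear in the query (campaign.id, wiki, campaign_id, task.id, task_id) come from this log; the column types, the user_id column, and the sample rows are assumptions made up for illustration, and SQLite stands in for the real PostgreSQL database.

```python
import sqlite3

# Toy schema. Column types, label.user_id, and all rows are assumed for
# illustration; they are not taken from the real wikilabels database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign (id INTEGER PRIMARY KEY, wiki TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, campaign_id INTEGER);
CREATE TABLE label (task_id INTEGER, user_id INTEGER);

INSERT INTO campaign VALUES (4, 'enwiki'), (6, 'fawiki');
-- Campaign 4 has three tasks, campaign 6 has one.
INSERT INTO task VALUES (1, 4), (2, 4), (3, 4), (4, 6);
-- Task 1 is labeled twice, task 4 once; tasks 2 and 3 have no labels.
INSERT INTO label VALUES (1, 100), (1, 101), (4, 100);
""")

# Same shape as the query in the log, with column references qualified.
# The LEFT JOIN keeps every task; tasks with no matching label row get
# NULLs in the label columns, and the WHERE clause keeps only those.
UNLABELED_SQL = """
SELECT campaign.id, campaign.wiki, COUNT(*)
FROM campaign
INNER JOIN task ON task.campaign_id = campaign.id
LEFT JOIN label ON label.task_id = task.id
WHERE label.task_id IS NULL
GROUP BY campaign.id, campaign.wiki
ORDER BY campaign.id
"""

unlabeled = conn.execute(UNLABELED_SQL).fetchall()
print(unlabeled)
```

Note that a campaign whose tasks all have at least one label (campaign 6 in the toy data) does not appear in the result at all, which is presumably why a fully labeled campaign would be absent from the output above rather than listed with a count of 0.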
It looks like we have a lot of duplicate labels for enwiki due to running the auto-labeling after people got started with labeling.
Let's check how much energy we wasted (we'll also be able to check the autolabeling).
wikilabels=> SELECT wiki, SUM(CAST(labels > 1 AS INT))
             FROM (
                 SELECT wiki, task_id, COUNT(label.*) AS labels
                 FROM campaign
                 INNER JOIN task ON campaign_id = campaign.id
                 LEFT JOIN label ON task_id = task.id
                 WHERE task_id IS NOT NULL
                 GROUP BY wiki, task_id
             ) AS foo
             GROUP BY wiki;
  wiki  | sum
--------+-----
 enwiki | 702
 fawiki | 187
 frwiki |   0
 ptwiki | 285
 trwiki |  57
(5 rows)
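A small sketch of what that query counts, again on a toy SQLite database with assumed types and made-up rows. The inner query counts labels per task, and the outer SUM(CAST(labels > 1 AS INT)) counts the tasks that ended up with more than one label. The surplus_labels column is my own addition, not part of the query in this log: it counts the labels beyond the first on each task, which can be larger than the number of multiply labeled tasks. The sketch also uses an INNER JOIN on label, which should be equivalent to the LEFT JOIN plus WHERE task_id IS NOT NULL used above.

```python
import sqlite3

# Toy schema. Column types, label.user_id, and all rows are assumed for
# illustration; they are not taken from the real wikilabels database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign (id INTEGER PRIMARY KEY, wiki TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, campaign_id INTEGER);
CREATE TABLE label (task_id INTEGER, user_id INTEGER);

INSERT INTO campaign VALUES (4, 'enwiki'), (9, 'frwiki');
INSERT INTO task VALUES (1, 4), (2, 4), (3, 4), (4, 9), (5, 9);
-- enwiki: task 1 has 2 labels, task 2 has 3, task 3 has 1.
-- frwiki: task 4 has 1 label, task 5 has none.
INSERT INTO label VALUES
    (1, 100), (1, 101),
    (2, 100), (2, 101), (2, 102),
    (3, 100),
    (4, 100);
""")

DUPLICATES_SQL = """
SELECT wiki,
       SUM(CAST(labels > 1 AS INT)) AS multi_labeled_tasks,
       SUM(labels - 1) AS surplus_labels  -- hypothetical extra column
FROM (
    SELECT campaign.wiki AS wiki, task.id AS task_id, COUNT(*) AS labels
    FROM campaign
    INNER JOIN task ON task.campaign_id = campaign.id
    INNER JOIN label ON label.task_id = task.id
    GROUP BY campaign.wiki, task.id
) AS per_task
GROUP BY wiki
ORDER BY wiki
"""

duplicates = conn.execute(DUPLICATES_SQL).fetchall()
print(duplicates)
```

In the toy data, enwiki has two multiply labeled tasks but three surplus labels, because one task was labeled three times. The two numbers only coincide when no task has more than two labels.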
So in enwiki, we got 702 human labels that we didn't need to finish the campaign due to autolabeling. --EpochFail (talk) 14:52, 22 July 2015 (UTC)