Learning and Evaluation/Wikiresearch webinars
This page is kept for historical interest. Any policies mentioned may be obsolete. If you want to revive the topic, you can use the talk page or start a discussion on the community forum.
Wikiresearch Webinars
Sessions
- Part 1: Beyond Wikimetrics - July 16, 2014. An overview of what you can do with the Wikimedia Cloud Services MySQL databases and the MediaWiki API
- Part 2: Becoming Wikimetrics - August 20, 2014. An in-depth tutorial on how to use MySQL to gather and manipulate data
- Part 3: Building on Wikimetrics - August 27, 2014. How to use the MediaWiki API and the Python programming language to gather and manipulate data
For more details on upcoming webinars and other related events, check the Evaluation portal news page.
Part 1: Beyond Wikimetrics
- date: July 16, 2014
MySQL overview
MySQL query examples
- select page_title from enwiki_p.page where page_touched > 20140716000000
  translation: "computer, give me the names of all the Wikipedia pages that have been edited today"
- select distinct rev_user_text
  translation: "computer, give me a list of 10 users who have played The Wikipedia Adventure"
- select distinct rev_user_text, user_editcount
  translation: "computer, give me a list of 10 users who have played The Wikipedia Adventure and their current edit counts"
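As a rough illustration of the first query above, the sketch below runs the same kind of SELECT with a WHERE filter from Python. It uses a small in-memory SQLite table as a stand-in, because the real replica databases are MySQL and require Cloud Services access. The table contents and the helper name `pages_touched_since` are assumptions made up for this example.

```python
import sqlite3

# Toy stand-in for enwiki_p.page: only the two columns the query uses.
# The titles and timestamps below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_title TEXT, page_touched TEXT)")
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [
        ("Example_A", "20140715235959"),
        ("Example_B", "20140716000001"),
        ("Example_C", "20140716120000"),
    ],
)


def pages_touched_since(connection, timestamp):
    """Return titles whose page_touched is later than a YYYYMMDDHHMMSS string.

    Timestamps are fixed-width digit strings, so comparing them as text
    gives the same order as comparing them chronologically.
    """
    rows = connection.execute(
        "SELECT page_title FROM page WHERE page_touched > ? ORDER BY page_title",
        (timestamp,),
    )
    return [title for (title,) in rows]


recent = pages_touched_since(conn, "20140716000000")
print(recent)
```

Passing the timestamp as a query parameter (the `?` placeholder) rather than pasting it into the SQL string keeps the query reusable for any day.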
API overview
API request examples
Part 2: Becoming Wikimetrics
- date: August 20, 2014. Join the live conversation on Google, or on IRC at #wikimedia-researchconnect (web link)
- time: 1500 UTC
- Slides
- Video
In the second part of our series, we will learn how to write and run MySQL queries against the Wikimedia MySQL databases. The replica databases hosted on Wikimedia Cloud VPS allow you to ask questions about what is happening on Wikimedia projects such as "how many people have edited articles that I created?" and get back a precise answer in the form of data. A new tool called Quarry will help us ask and answer these questions together.
This session will be more interactive than the first session: you will have the opportunity to write and run your own queries, and collect your own data! Veteran query-wranglers will be standing by in #wikimedia-researchconnect ready to answer any questions you might have, and to help you if you get stuck.
- Explanatory queries
- http://quarry.wmflabs.org/query/278 - Queryable tables
- http://quarry.wmflabs.org/query/279 - All tables in the enwiki_p database
- http://quarry.wmflabs.org/query/276 - WikiLove sent during Wikimania 2014
- http://quarry.wmflabs.org/query/275 - description of the fields in the user, page, and revision tables on enwiki
- Example queries
- http://quarry.wmflabs.org/query/280 - Join example #1: Get first edit to a page
- http://quarry.wmflabs.org/query/281 - Join example #2: current edit count of some 2002 editors
- http://quarry.wmflabs.org/query/282 - Join example #3: pages in a category that need citations
- http://quarry.wmflabs.org/query/293 - 5 Anon editors who have recently asked questions at the Teahouse
- http://quarry.wmflabs.org/query/287 - Getting number of editors to a page and its talk page
- http://quarry.wmflabs.org/query/290 - Getting total edits to a page or pages by a list of users
- http://quarry.wmflabs.org/query/310 - Highly active editors who joined recently
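To give a feel for what a join such as "get first edit to a page" involves, here is a minimal sketch in Python. It is not a copy of any of the Quarry queries listed above. It assumes the `page` and `revision` tables can be linked through `revision.rev_page = page.page_id`, and it runs against an in-memory SQLite database filled with invented rows rather than the real MySQL replicas. The helper name `first_edit` is hypothetical.

```python
import sqlite3

# Toy versions of the page and revision tables with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER, rev_page INTEGER, rev_user_text TEXT, rev_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [(1, "Example_A"), (2, "Example_B")],
)
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?, ?)",
    [
        (10, 1, "Alice", "20140102030405"),
        (11, 1, "Bob", "20140301000000"),
        (12, 2, "Carol", "20131231235959"),
    ],
)


def first_edit(connection, title):
    """Return (page_title, rev_user_text, rev_timestamp) for the earliest
    revision of the named page, or None if the page has no revisions."""
    query = (
        "SELECT p.page_title, r.rev_user_text, r.rev_timestamp "
        "FROM page AS p "
        "JOIN revision AS r ON r.rev_page = p.page_id "
        "WHERE p.page_title = ? "
        "ORDER BY r.rev_timestamp ASC "
        "LIMIT 1"
    )
    return connection.execute(query, (title,)).fetchone()


result = first_edit(conn, "Example_A")
print(result)
```

The join matches each revision to its page, the WHERE clause narrows to one title, and sorting by timestamp with a limit of one keeps only the earliest edit.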
Part 3: Building on Wikimetrics
- date: August 27, 2014. Join the live conversation on Google, or on IRC at #wikimedia-researchconnect (web link)
- time: 1500 UTC
- Slides
- feedback survey
- Video
In our third session, we will talk about the Python programming language, and the JSON and CSV data storage formats. We will walk step by step through several example research "projects" that involve downloading MySQL results from Quarry and using Python to combine the data with other results available through the MediaWiki API and the stats.grok.se pageview API.
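The workflow described above (download query results as CSV, then combine them with data from another source) might look roughly like the sketch below. The CSV column names, the sample numbers, and the helper names `combine` and `to_csv` are all assumptions invented for this example. They are not taken from the session's actual scripts.

```python
import csv
import io

# Invented stand-in for a CSV file downloaded from Quarry.
QUARRY_CSV = """month,edits
201201,40
201202,25
"""

# Invented stand-in for monthly pageview totals gathered from a pageview API.
PAGEVIEWS = {"201201": 1000, "201202": 500}


def combine(csv_text, pageviews):
    """Attach a pageview total to each month row read from the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    combined = []
    for row in reader:
        month = row["month"]
        combined.append(
            {
                "month": month,
                "edits": int(row["edits"]),
                # None when no pageview figure is available for the month.
                "pageviews": pageviews.get(month),
            }
        )
    return combined


def to_csv(rows):
    """Serialize the combined rows back to CSV text for use in a spreadsheet."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["month", "edits", "pageviews"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = combine(QUARRY_CSV, PAGEVIEWS)
print(to_csv(rows))
```

With a real download, `io.StringIO(csv_text)` would be replaced by an opened file, and the resulting CSV could be imported into a spreadsheet for graphing.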
- Python scripts and sample CSVs for today's exercises
- Folder of Python scripts: download this file, unzip it, and place it on your desktop.
- Setting up and running Python scripts
- setting up Python on your computer
- navigating to a folder from the command line
- executing a Python file
- Another command line tutorial
- Challenge queries
- http://quarry.wmflabs.org/query/361 - Recent edits to articles in w:Category:Secret_societies
- http://quarry.wmflabs.org/query/364 - The 100 people who have sent the most WikiLove
- http://quarry.wmflabs.org/query/363 - People who have sent Stroopwafels via WikiLove between August 27, 2013 and August 27, 2014, and the number of wafels sent per user
- Python research project queries
- http://quarry.wmflabs.org/query/365 - Revision ids of the first 100 questions asked at the enwiki Teahouse in 2014
- http://quarry.wmflabs.org/query/371 - Total monthly edits to articles in w:Category:The_Hobbit_(film_series)
- Sample API requests
- MediaWiki API: JSON file for the text of the Teahouse question "Request PR?" - http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Wikipedia:Teahouse/Questions&rvprop=content&rvstartid=590057276&rvendid=590057276&rvlimit=1&rvsection=1&format=json
- Pageview API (stats.grok.se): JSON file containing daily pageviews for An Unexpected Journey in February 2012 - http://stats.grok.se/json/en/201202/The_Hobbit:_An_Unexpected_Journey
- Data visualization
- Google sheet with Hobbit edit + pageview data, and graphs - https://docs.google.com/a/wikimedia.org/spreadsheets/d/1M2zDVL3DiM4Is9QwLppAz7VCzcYcgp2pdQPgEchr_LM/edit#gid=0
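The sample API requests listed above can also be assembled and read from Python. The sketch below builds the same MediaWiki revisions request from its parameters and pulls the text out of a response. It makes no network calls: `SAMPLE_RESPONSE` is an invented response, and its layout (pages keyed by page id, revision text under `"*"`) is an assumption about the `format=json` output. The `daily_views` key used for the pageview data is likewise an assumption about the stats.grok.se JSON, and all helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Endpoint as it appears in the sample request above.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_revision_url(title, rev_id, section=1):
    """Build a request URL for the content of one section of one revision."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content",
        "rvstartid": rev_id,
        "rvendid": rev_id,
        "rvlimit": 1,
        "rvsection": section,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_revision_text(response_text):
    """Collect the revision text from every page in a JSON response.

    Assumes pages are keyed by page id and the text sits under "*".
    """
    data = json.loads(response_text)
    texts = []
    for page in data["query"]["pages"].values():
        for revision in page.get("revisions", []):
            texts.append(revision.get("*", ""))
    return texts


def total_views(pageview_data):
    """Sum daily counts, assuming a {"daily_views": {date: count}} layout."""
    return sum(pageview_data.get("daily_views", {}).values())


# Invented response used only to exercise the parsing code.
SAMPLE_RESPONSE = json.dumps(
    {
        "query": {
            "pages": {
                "12345": {
                    "title": "Wikipedia:Teahouse/Questions",
                    "revisions": [{"*": "== Request PR? ==\nMade-up question text."}],
                }
            }
        }
    }
)

url = build_revision_url("Wikipedia:Teahouse/Questions", 590057276)
texts = extract_revision_text(SAMPLE_RESPONSE)
views = total_views({"daily_views": {"2012-02-01": 10, "2012-02-02": 15}})
print(url)
print(texts)
print(views)
```

`urlencode` percent-encodes the colon and slash in the page title, so the generated URL looks slightly different from the hand-written one while carrying the same parameters.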
Getting help
- wiki-research-l mailing list
- #wikimedia-researchconnect IRC channel
- Evaluation portal question forum
Resources
Please add new resources that you think would be helpful to others.
Research tools
Data
Software
- Sequel Pro (Mac only; J-MO uses this one for demos)
- MySQL Workbench (Windows, Linux, OS X)
Learning
- Setup and practice
- Etherpad of notes for a Wikimania 2015 Hackathon presentation on Quarry
- Creating an account on Toolforge. Good step-by-step tutorial by User:EpochFail
- More Toolforge and Cloud Services new user help
- Toolforge database access: good info on how to access the Cloud Services MySQL databases
- UW Community Data Science Workshops syllabus
- Community Data Science Workshops, Friday April 4th setup and tutorial: setting up Python on your computer
- MySQL
- W3Schools MySQL tutorial
- SQLZoo MySQL tutorial
- Khan Academy Intro to SQL course
- Khan Academy 1-hour SQL tutorial
- Python