WikiCred/2022 CFP/Iffy.news
| Iffy.news and Wikidata | |
| --- | --- |
| A WikiCred 2022 Grant Proposal | |
| Project Type | Technology |
| Author | (hearvox) |
| Contact | Iffy.news |
| Requested amount | US$6,700 |
| Award amount | Unknown |
- What is your idea?
This project will make Wikidata/Wikipedia a more effective indicator of news-site credibility. The barrier is inconsistent methods of categorizing news outlets, causing unreliable search results and incomplete information. We can fix that. By standardizing the connection between a news website and its domain name, we can then programmatically insert other data (when missing), such as:
- Circulation
- First year in print
- Owner
- Global Traffic Rank
- Year online
- Credibility ratings
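As a sketch of the standardization step described above, a link in any of its inconsistent forms can be reduced to two canonical values: an https URL for the "official website" and a bare domain name for the Alias. The function name below is illustrative, not from an existing codebase:

```javascript
// Sketch: normalize a news site's link (in any of its inconsistent
// forms) to the two canonical values this proposal describes —
// an https "official website" URL and a bare domain-name Alias.
// normalizeSite() is an illustrative name, not real project code.
function normalizeSite(rawUrl) {
  // Accept bare domains as well as full URLs.
  const url = new URL(rawUrl.startsWith('http') ? rawUrl : `https://${rawUrl}`);
  const domain = url.hostname.replace(/^www\./, ''); // Alias: bare domain
  return {
    alias: domain,
    officialWebsite: `https://${domain}/`,           // always https
  };
}

// normalizeSite('http://www.bostonglobe.com')
// → { alias: 'bostonglobe.com', officialWebsite: 'https://bostonglobe.com/' }
```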
Implementing structured-data standards will improve search accuracy for both machines and humans. The screenshots below show how: (1) A search for bostonglobe.com failed (“No match was found”), even though the Boston Globe's Wikidata page lists its website. But (2 & 3) adding the domain name as an Alias (4) puts it in the top result:
- Why is it important?
Right now, Wikipedia articles and Wikidata items list domain names in many different ways. A news outlet might have an Infobox, which might list a “Website”, which might be either its domain name or its full URL. Below that there might be an External Links section, with the site listed as either “Official Website” or the domain name. And all those links often use the old, now-incorrect http URL instead of the site's new https protocol.
All that unstructured data prevents machines from reliably identifying news outlets by their domain name, which in turn prevents the data from being read or written programmatically.
Using APIs (and in consultation with Wikidata editors), we will give news outlets a standardized association with their website — e.g., the correct URL as the “official website” and the domain name as an Alias. Once a news outlet is machine-identifiable by its unique domain name, we will insert additional data via API (already stored in Iffy.news and other databases and ready to use).
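One possible shape of such an API edit, sketched below: the Wikibase API's `wbsetaliases` action can add a domain name as an Alias on an Item. The Q-id and token here are placeholders; a real edit would POST these parameters to `https://www.wikidata.org/w/api.php` with a valid CSRF token:

```javascript
// Sketch: build the request parameters for a Wikibase `wbsetaliases`
// API call that adds a domain name as an English Alias on a news
// outlet's Item. The itemId and csrfToken arguments are placeholders;
// a real edit would POST this to https://www.wikidata.org/w/api.php.
function buildAliasEdit(itemId, domain, csrfToken) {
  return new URLSearchParams({
    action: 'wbsetaliases',
    id: itemId,      // placeholder Q-id; look up the outlet's real Item
    add: domain,     // e.g. 'bostonglobe.com'
    language: 'en',
    format: 'json',
    token: csrfToken,
  });
}
```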
For the 2,000 outlets that already have a “Media Bias Fact Check ID”, we can add their MBFC credibility and factual-reporting ratings. Then we could experiment:
- At Wikipedia, making a User Script that identifies the source credibility of an article's References.
- At an external site, pulling in Wikidata on news sources, using Iffy.news' Fact-check Search tool to demonstrate.
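The user-script experiment above could work along these lines: given an article's reference URLs and a domain-to-rating lookup (populated from the credibility data added to Wikidata), annotate each reference with its source's rating. The ratings object here is invented sample data, not real MBFC output:

```javascript
// Sketch of the user-script idea: annotate an article's References
// with source-credibility ratings, keyed by domain name. The
// ratingsByDomain lookup is hypothetical sample data, not real
// MBFC output; rateReferences() is an illustrative name.
function rateReferences(refUrls, ratingsByDomain) {
  return refUrls.map((ref) => {
    const domain = new URL(ref).hostname.replace(/^www\./, '');
    return { url: ref, rating: ratingsByDomain[domain] || 'unrated' };
  });
}
```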
- Link(s) to your resume or anything else (CV, GitHub, etc.) that may be relevant
- Is your project already in progress?
This project builds on the knowledge gained in our previous WikiCred grant, documented at MisinfoCon, Iffy.news, and DataJournalism.com. Among the results was a SPARQL query for all news outlets with English Wikipedia articles. We already have databases of USA news-outlet domains and scripts that pull and update site-related data.
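A query of that kind might look like the sketch below (one possible shape, not the exact query from the earlier grant). It selects items that are instances of newspaper (Q11032) or a subclass, requires an English Wikipedia sitelink, and pulls the official website (P856) where present:

```javascript
// Sketch of a SPARQL query like the one from the earlier grant
// (one possible shape, not the exact original): news outlets —
// here, newspapers (Q11032) and subclasses — that have an English
// Wikipedia article, with their official website (P856) if set.
const query = `
SELECT ?outlet ?outletLabel ?website WHERE {
  ?outlet wdt:P31/wdt:P279* wd:Q11032 .           # newspaper (or subclass)
  ?article schema:about ?outlet ;
           schema:isPartOf <https://en.wikipedia.org/> .
  OPTIONAL { ?outlet wdt:P856 ?website . }         # official website
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}`;
// Send this to https://query.wikidata.org/sparql with format=json.
```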
- How is this project relevant to credibility and Wikipedia?
Our work will strengthen a core Wikipedia/Wikidata principle: identifying and categorizing credible sources.
“All articles must strive for verifiable accuracy, citing reliable, authoritative sources.” — Wikipedia, “Five Pillars”
- What is the ultimate impact of this project?
This project will significantly improve Wikidata/Wikipedia as a tool for evaluating news-source reliability, helping journalists, researchers, and the public.
- Can your project scale?
The datasets we have cover USA English-language news outlets. We will document our process and publish our SPARQL, JavaScript, and API code in a GitHub repo. Others can then follow our instructions to auto-gather news-outlet data for other countries and languages and programmatically insert it into Wikidata/Wikipedia.
- Why are you the people to do it?
It needs to be done. Iffy.news has the databases, the API skills, the Wikidata familiarity, and the journalism/documentation experience to do it.
- What is the impact of your idea on diversity and inclusiveness of the Wikimedia movement?
Mis/disinformation is the enemy of diversity, inclusion, and accessibility. This project helps stop mis/disinfo at its source.
- What are the challenges associated with this project, and how will you overcome them?
- Establish a machine-readable, unique relationship between news outlets in Wikidata/Wikipedia and their domain name.
- With the help of experienced Wikidata editors, determine what statements should be added or updated for news-outlet Items (then script API calls to safely and accurately insert that data).
- With the help of experienced Wikipedia editors, determine what information should be added or updated for news-outlet articles (then script safe, accurate API calls).
- Demonstrate how external sites can use all this new Wikidata/Wikipedia data.
- How will you spend your funds?
- Research and Programming $4,000
- Technical Writing $2,000
- Wiki Editors (Consultants) $700
- How long will your project take?
6 months
- Have you worked on projects for previous grants before?
- Executive Producer of Hearing Voices from NPR series: Awarded seven National Endowment for the Arts grants and two Corporation for Public Broadcasting grants.
- Developer for News Netrics (an experiment in tracking news website performance): Funded by University of Missouri: J-School.
- Iffy.news: WikiCred Project
- Media Fellow: United States Artists
- Residential Fellow: Reynolds Journalism Institute