Meta:Wikimedia CH/Grant apply/CitationWatchlist

From Meta, a Wikimedia project coordination wiki

Infodata

  • Name of the project: Citation Watchlist
  • Amount requested: CHF 20,000
  • Type of grantee: Group (Jake Orlowitz + James Hare with Hacks/Hackers)
  • Name of the contact: Jake Orlowitz (User:Ocaasi)
  • Contact: jorlowitz(_AT_)gmail.com
In case of questions, please write to grant(_AT_)wikimedia.ch

The problem and the context


What is the problem you're trying to solve?


Wikipedia's Citation Watchlist tool is designed to help editors monitor and uphold citation integrity, specifically by watching for the use of unreliable or problematic citations. The main problems it aims to solve include:

Unreliable Information


Wikipedia's reliability depends heavily on properly cited and verifiable sources. The Citation Watchlist helps editors identify when content is added or modified without reliable citations, ensuring that information meets Wikipedia's standards.

Decreasing Vandalism


It also serves as a defense against vandalism and misinformation. By tracking pages where problematic citations have been added, the tool helps catch disruptive edits.

Maintaining Article Quality


The tool supports ongoing efforts to maintain the quality of articles by making it easier for editors to locate and correct citation issues, ensuring Wikipedia remains a trustworthy source of knowledge.

In essence, the Citation Watchlist tool focuses on improving the accuracy and verifiability of Wikipedia content by drawing attention to citation misuse and encouraging timely editorial intervention.

What is your solution to this problem (please explain the context and the solution)?


The Citation Watchlist addresses the problems of unreliable sources and information, vandalism, and declining article quality by providing targeted, real-time monitoring of citation-related changes. Here's how it works:

Tracking Citation Changes in Real Time


The tool allows editors to enhance their watchlist so that they receive alerts whenever problematic citation-related edits occur. This enables editors to quickly intervene if an edit violates Wikipedia's verifiability standards or introduces unreliable sources.

Tracking and Highlighting Unreliable Sources


The Citation Watchlist enhances monitoring by adding visual indicators (in the style of Cite Unseen) to the watchlist, recent changes, and page history pages. These indicators, such as ❗ for severe warnings and ✋ for less severe cautions, make it easy for editors to quickly identify when unreliable URLs are added to articles. This helps solve the problem of unverified information by drawing immediate attention to potentially problematic sources.
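The core of this indicator logic can be sketched roughly as follows. This is a minimal illustration only: the domain names and severity mapping below are hypothetical placeholders, not the actual lists or code the Citation Watchlist uses.

```javascript
// Hypothetical severity map: domain -> indicator character.
// ❗ marks a severe warning, ✋ a less severe caution.
const WARN = new Map([
  ["unreliable.example", "❗"], // e.g. a deprecated source
  ["caution.example", "✋"],    // e.g. a use-with-caution source
]);

// Extract the host from a cited URL, dropping a leading "www.".
// Returns null for strings that are not parsable URLs
// (e.g. a book or journal citation with no link).
function hostOf(url) {
  try {
    return new URL(url).hostname.replace(/^www\./, "");
  } catch (e) {
    return null;
  }
}

// Return the indicator for a cited URL, or null if no list matches.
function indicatorFor(url) {
  const host = hostOf(url);
  return host !== null && WARN.has(host) ? WARN.get(host) : null;
}
```

In the real script, an indicator like this would be rendered next to the matching entry in the watchlist, recent changes, or page history view.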

Highlighting Articles with Citation Issues


The Citation Watchlist helps editors prioritize work by flagging articles that need attention due to citation problems. By highlighting articles with citation issues, it streamlines the workflow for editors concerned with maintaining Wikipedia's reliability.

Reducing the Spread of Misinformation


By focusing on articles where poor citations have been used, the tool helps to combat the spread of misinformation. Editors can act swiftly to challenge or revert edits that might otherwise go unchecked.

Streamlining the Monitoring of Citation Issues


The tool ensures that editors can track reference changes just as easily as other article edits. By showing these visual cues directly in the areas where editors already look, like their watchlist or recent changes, the tool improves efficiency and makes it easier to spot vandalism or unsourced additions quickly. This reduces the time it takes for editors to act on citation-related issues.

Providing Context and Encouraging Collaboration


The tool enables editors to view detailed information with minimal effort, such as by hovering over indicators to get more context on the source’s reliability. This integration allows editors to make informed decisions and collaborate more effectively to maintain article quality by addressing citation gaps or removing unreliable references.

Supporting Collaboration


The Citation Watchlist tool encourages editors across the community to work together by allowing multiple users to monitor the same articles for citation issues. This collaborative monitoring can enhance Wikipedia's capacity to deal with citation-related problems in high-traffic articles or pages of particular interest.

Customizable and Scalable


The Citation Watchlist supports the use of community-generated lists, such as the Perennial Sources List, which tracks frequently discussed and debated sources. It also draws on the lists of other tools such as Cite Unseen. Editors can even create and add their own lists, such as those focused on predatory journals or pseudoscience. This flexibility allows for customized monitoring based on the types of sources editors are most concerned about, further enhancing citation accuracy.
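The list-based design described above might be organized along these lines. Again, this is a hedged sketch: the list names (`perennial`, `predatory`, `custom`) and domains are invented examples standing in for the real community-generated lists such as the Perennial Sources List.

```javascript
// Hypothetical registry of source lists. Each list carries its own
// set of flagged domains and an enabled/disabled toggle, so editors
// could turn individual lists on or off (similar to ad-block filters).
const lists = {
  perennial: { enabled: true,  domains: new Set(["deprecated.example"]) },
  predatory: { enabled: true,  domains: new Set(["paymill.example"]) },
  custom:    { enabled: false, domains: new Set(["mytrack.example"]) },
};

// Return the names of all enabled lists that flag the given domain.
// A domain on a disabled list is not reported.
function flaggedBy(domain) {
  return Object.entries(lists)
    .filter(([, list]) => list.enabled && list.domains.has(domain))
    .map(([name]) => name);
}
```

Keeping each list as an independent, toggleable set is what would allow editors to mix community lists with their own custom ones without the lists interfering with each other.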

Project goals


The known problems with the Citation Watchlist tool, and the development goals this grant would address, are outlined below:

Known Problems

  1. False Negatives: The tool may miss bad source additions, particularly when problematic sources aren't flagged by its predefined lists. This reduces its effectiveness in identifying unreliable citations.
  2. Speed and Efficiency: The tool's performance can be slow, making it less efficient for editors when analyzing changes. It doesn't always provide instant feedback, which could hinder fast-paced editing.
  3. User Interface (UI) Limitations:
    • The tool lacks sophisticated UI features that could improve user experience, such as enhanced tooltips providing more information about flagged sources or the provenance of domains.
    • Tooltip design could include links to the lists being invoked, for easier access to context.
  4. Customization Limitations:
    • Users currently have limited control over what lists and features are turned on or off. The tool does not yet support fine-grained control or customized source lists for individual editors.
  5. Session Persistence: Data is not persisted between sessions. This means that every time a page is refreshed, the tool needs to be re-run manually, limiting its long-term usability.
  6. Partial Source Coverage: The tool only screens URLs and doesn't identify problematic sources such as books or journal articles that aren’t domain-based, which limits its scope.

Development Goals

  1. Improving Detection Accuracy: The tool should be enhanced to reduce false negatives and improve its ability to catch problematic sources that it currently misses. This could include better integration of community consensus on unreliable sources and additional screening techniques.
  2. Performance Enhancements: The speed and responsiveness of the tool should be improved, making it more instantaneous and efficient. This would enhance user productivity and ensure real-time citation analysis during the editing process.
  3. UI Sophistication: Developing a more sophisticated user interface, including better tooltips with additional details like source context, domain history, and links to invoked source lists. This would provide users with more information at a glance and facilitate better editorial decisions.
  4. Customization and Personalization:
    • Introduce user-generated lists, allowing editors to develop and manage their own lists of problematic sources, like predatory journals or pseudoscience. This would enable personalized and more targeted monitoring.
    • Allow users to toggle lists and features on and off, making the tool adaptable to individual workflows, similar to ad-blocking tools.
  5. Data Persistence Between Sessions: Implement features that allow the tool to retain data across sessions so users don’t have to manually restart it after each page reload. This would make the tool more user-friendly and reliable.
  6. Expanded Source Coverage: Extend the tool’s capabilities beyond URLs to also include books, journal articles, and other non-domain-based sources. This would broaden the scope of the tool, making it applicable to a wider range of citation types.
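Goal 6 (expanded source coverage) implies matching citations by fields other than a URL. One way this might look, purely as an illustration: the journal name below is invented, and the real tool would draw such names from community-maintained lists rather than a hard-coded set.

```javascript
// Hypothetical set of flagged journal names (lowercased for matching).
// A real implementation would load these from a community list.
const flaggedJournals = new Set(["journal of questionable results"]);

// Flag a citation object that has no URL by inspecting its
// journal field instead. URL-based citations are assumed to be
// handled separately by the existing domain lists.
function flagNonUrlCitation(cite) {
  if (cite.url) return null; // covered by domain screening
  if (cite.journal && flaggedJournals.has(cite.journal.toLowerCase())) {
    return "❗";
  }
  return null; // no match: book/journal not on any flagged list
}
```

The same pattern could extend to ISBN prefixes or publisher names, which is what would let the tool cover books and journal articles rather than only domain-based sources.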

Project impact


How will you know if you have met your goals?

  • Enhanced detection algorithms to catch more bad source additions.
  • Speed optimization to ensure the tool provides real-time feedback.
  • UI improvements for better usability, context, and editor productivity.
  • Support for custom lists and more flexibility, allowing editors to curate which sources are flagged.
  • Persistent data handling to retain information across editing sessions.
  • Broader source screening capabilities to include non-URL sources.

By focusing on these problems and goals, the grant can support the development of a more effective and user-friendly Citation Watchlist tool.

Do you have any goals or metrics around participation or content?


We aim for 500 editors to have the tool installed, with 300 active users, across 2 languages.

Project plan


Activities


To address the issues with the Citation Watchlist tool and achieve the development goals, we will structure the project plan around key phases that align with the desired improvements.

1. Research & Requirements Gathering (Months 1-2)

  • Community Input: Begin by gathering feedback from Wikipedia editors, both experienced and new, to identify pain points in their workflows related to citation monitoring. Given our deep involvement in Wikipedia projects, we can leverage our existing network, including input from WikiCred participants, to get comprehensive feedback on tool usage.
  • Technical Evaluation: Analyze the current codebase to identify technical debt and assess the effort required for improvements like performance optimization, UI upgrades, and support for persistent sessions.
  • Define Custom Lists: Engage with editors to define requirements for user-generated lists, such as predatory journals or pseudoscience, to ensure that the functionality aligns with their real-world needs.

2. Prototype Improvements (Months 3-4)

  • False Negative Reduction: Implement enhanced detection algorithms aimed at reducing false negatives. This might involve incorporating more sophisticated pattern recognition for citations, such as cross-referencing against multiple citation lists.
  • Performance Optimization: Work on improving the tool’s efficiency, reducing the delay in detecting citation changes. Performance will be a priority, as editors need instantaneous feedback during fast-paced editing sessions.
  • UI Upgrades: Start developing a more intuitive UI, incorporating detailed tooltips that provide information on source reliability, domain provenance, and links to community discussions on the flagged sources.

3. Pilot & User Testing (Months 4-5)

  • Small-Scale Launch: Release a pilot version of the improved tool to a small group of experienced editors and testers.
  • Iterative Feedback: Conduct surveys and open forums to gather feedback on the new features, such as the enhanced UI, source screening lists, and the persistence between sessions. This is where customization options—like user-controlled source lists—can be tested to see how well they integrate into different editorial workflows.

4. Full Implementation (Month 6)

  • Broader Rollout: After refining the tool based on pilot feedback, expand its availability to the entire Wikipedia community. This phase will focus on refining features like session persistence, ensuring that data remains consistent across page reloads, and making the tool available to a wider set of editors.
  • User-Generated Lists: Enable the creation and management of custom source lists, with functionality for editors to toggle which lists are active. This would give the tool more flexibility, aligning with specific editing focuses, such as flagging pseudoscientific sources or deprecated references.
  • Internationalization: Move towards deploying the script in multiple languages by internationalizing the interface and working with editors from non-English Wikipedia communities to adapt the tool with their own source lists.
  • AI Integration: Work with the developer of CiteCheck (Salim Nouspo) to explore more intelligent automation and detection methods for the next generation of this tool, which could potentially handle all citations, even those without links.

Budget


Budget Allocation:

  • Developer: CHF 9,000
  • Product Manager: CHF 9,000
  • Hacks/Hackers: CHF 2,000

Total: CHF 20,000

This tool has received support from private funders to develop the alpha (version 1). We have also sought funds from the WMF, but it has recently suspended all technology grants. As such, we are looking to movement funds to make the tool robust, build version 2, and deploy it broadly across the projects.

Community engagement



  • Presentation at Wikimania 2024: "Introducing the Citation Watchlist!" Slides
  • Presentation at CredCo monthly meeting: "Introducing the Citation Watchlist!"

Wikimedia CH response


The project was presented in person; the proposal was initially saved in the wrong location on the wiki.

We are pleased to approve the grant request under the Innovation programme of Wikimedia CH for a total of 20,000 CHF.

The project can begin already in 2024, with an initial payment of 10,000 CHF. The remaining balance will be provided halfway through the project, after submission of the mid-term report.

Once the project is completed, a final report will be required for inclusion in Wikimedia CH's reports and newsletters.