User:Frostly/Fortuna
This page is kept for historical interest. Any policies mentioned may be obsolete. If you want to revive the topic, you can use the talk page or start a discussion on the community forum.
Fortuna
The all-in-one file management tool
Fortuna is an upcoming tool designed to help editors who patrol image copyright detect violations of copyright policy in a more streamlined way. Fortuna is similar to CopyPatrol in that it analyses files for copyright violations. The tool will also include other features, such as duplicate file detection, a colour-based visual search engine, a notification system that tells file contributors when their files are being used on other sites, and a reverse image search engine. Fortuna will also provide an API for other tools to use.
Motivation
Wikimedia Commons currently has many copyright violations that go undetected. Copyright patrollers are often underrepresented in terms of the number of tools supporting them, and existing solutions such as OgreBot's new user upload log have been deprecated or removed.
Movement Strategy
Fortuna is aligned with two of the Movement Strategy goals:
Increase the Sustainability of Our Movement
Systematic approach to improve satisfaction and productivity
Assessing the needs of groups and volunteers, taking into account their local contexts for effective support and recognition of efforts.
I've already reached out to some contributors working in copyright, and will continue to do so. I also work in copyright myself!
Continuously engaging and supporting publicly diverse types of online and offline contributors.
People working in copyright are often underrepresented in terms of technology to support them; there are definitely fewer tools for copyright work than for other areas.
Increased Wikimedia Awareness (prioritized initiative)
to secure the attention, trust, and interest of knowledge consumers
I think that my tool would help remove many copyright violations and increase trust in the licensing of the Movement's content.
Identify Topics for Impact
Identify Wikimedia's Impact (prioritized initiative)
Understand how our projects can be misused or abused by detecting threats with significant potential for harm
I certainly believe that copyright violations are a way that the projects can be abused!
Previous feature requests
2015 and 2022 Community Wishlist Survey, Phabricator task.
Features
Patrol
Web app (all features) and gadget (one-by-one info)
Fortuna Patrol is an interface similar to CopyPatrol for reviewing detected copyright violations. Each detected file will be given a score on a point scale indicating how likely Fortuna believes it is to be a copyright violation.
Related solutions
- Earwig's Copyvio Detector
- CopyPatrol
Implementation notes
- Use copyviobot-esque system, with an extension?
Curio
Extension
Fortuna Curio is an interface to review images that may be duplicated across Commons and other projects. The name comes from the word "curate".
Implementation notes
- Image-Match
- Match
- Image-Match paper
- Hook into Special:DuplicateImages
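To make the duplicate-detection idea concrete, here is a minimal sketch of a perceptual hash comparison. It uses a simple average hash, which is a stand-in and not the signature algorithm from the Image-Match paper. The names `shrink`, `average_hash`, `hamming_distance` and `likely_duplicates`, the 8x8 grid and the threshold of 5 bits are all assumptions for illustration, and images are assumed to be already decoded into rows of 0-255 grayscale values.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed pre-decoded)


def shrink(pixels: Gray, size: int = 8) -> List[float]:
    """Downsample to size x size cells by averaging rectangular blocks."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for gy in range(size):
        y0 = gy * h // size
        y1 = max((gy + 1) * h // size, y0 + 1)
        for gx in range(size):
            x0 = gx * w // size
            x1 = max((gx + 1) * w // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Pack one bit per cell: 1 if the cell is brighter than the mean cell."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Hypothetical threshold: treat hashes within max_distance bits as duplicates."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# A horizontal gradient and a uniformly brightened copy hash identically.
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[v + 10 for v in row] for row in gradient]
print(likely_duplicates(gradient, brighter))
```

A hash like this tolerates uniform brightness changes and rescaling but not cropping or rotation, which is one reason a more robust signature such as Image-Match's might be preferred.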
Colorways
Extension
Fortuna Colorways allows for analysis of the colors in an image and searching images by color.
- Finding optimal values for k
- k-means clustering to find the dominant colors in an image; 10 should probably be the maximum number of dominant colors, similar to TinEye's recommendation
- Use Elasticsearch or other database (e.g. Redis) to store images and retrieve similar results; inspiration from TinEye
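The k-means note above can be sketched roughly as follows. This is an illustrative pure-Python version, not Fortuna's implementation: the names `dominant_colors`, `assign` and `sq_dist`, the default `k=3`, the iteration limit and the seeded initialisation are all assumptions. The cap `MAX_COLORS = 10` mirrors the maximum suggested in the notes. Pixels are assumed to be already decoded into RGB tuples.

```python
import random
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]
MAX_COLORS = 10  # upper bound on dominant colors, per the note above


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels: List[Color], centroids: List[Color]) -> List[List[Color]]:
    """Group each pixel under its nearest centroid."""
    clusters: List[List[Color]] = [[] for _ in centroids]
    for p in pixels:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def dominant_colors(pixels: List[Color], k: int = 3,
                    iterations: int = 20, seed: int = 0) -> List[Color]:
    """Return up to k cluster centres, largest cluster first."""
    k = min(k, MAX_COLORS)
    distinct = sorted(set(pixels))
    # Start from k distinct pixel colors, chosen reproducibly.
    centroids = random.Random(seed).sample(distinct, min(k, len(distinct)))
    for _ in range(iterations):
        clusters = assign(pixels, centroids)
        updated = [
            tuple(sum(channel) / len(members) for channel in zip(*members))
            if members else centroids[i]  # keep an empty cluster's centre
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    sizes = [len(c) for c in assign(pixels, centroids)]
    ranked = sorted(zip(sizes, centroids), key=lambda pair: -pair[0])
    return [centre for _, centre in ranked]


# Mostly red, some blue, a little green.
sample = [(255, 0, 0)] * 60 + [(0, 0, 255)] * 30 + [(0, 255, 0)] * 10
print(dominant_colors(sample, k=3))
```

The resulting centres could then be stored per file in whichever index is chosen (Elasticsearch, Redis, or another store) and compared by color distance at query time. Choosing k per image remains the open question noted in the list.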
Discover
Mobile app integration and extension (with web interface and API in MW)
Fortuna Discover acts as a search engine, allowing users to find ("discover") content on Wikimedia projects by taking or uploading photos.
- Needs an open-source implementation; the only current solutions are TinEye, Cloud Vision, and similar services
Charisto
Extension
Fortuna Charisto allows contributors to subscribe to notifications of when their content is used online. The name Charisto comes from the Greek word for "thank you", "ef̱charistó̱".
- Would it be cheaper to self-monitor using TinEye's general API?
Merido
Extension and FastAPI server
Fortuna will have an API named Merido that allows other tools to search for copyright violations without having to combine a variety of APIs and without needing additional funding. The name comes from the Greek word for "share", "merídio".
- Plug in to GraphQL extension for GraphQL API support
Grafeas
Extension integration with Wikisource OR gadget OR separate extension
Identify text in images through OCR, and link to Earwig's copyright tool to check if they are copyvios.
- mw:Help:Extension:Wikisource/Wikimedia OCR
- Potentially rename Wikisource extension
- Scribe in Greek
Aspis
Gadget, extension and mobile integration
Hide potentially explicit images on pages and file description pages.
- Google API currently; any other solutions?
- Safe in Greek
Syntomo
Extension integration
Generate file captions/descriptions/depicts statements/categories.
- mw:Extension:MachineVision
- Sort in Greek
Implementation
Technology
Fortuna will utilize a variety of APIs for detection in order to maximize results.
Services used:
Fortuna will be written in Elm for safety and quality assurance. The frontend will use OOUI to match with the interfaces of Wikimedia projects and to allow for easier user onboarding. The tool will be hosted on Toolforge and use ToolsDB for database storage and caching.