User:Frostly/Fortuna

From Meta, a Wikimedia project coordination wiki

Fortuna

The all-in-one file management tool

Fortuna is an upcoming tool designed to help editors who patrol image copyright detect violations of copyright policy in a more streamlined way. Fortuna is similar to CopyPatrol, but analyses files for copyright violations. The tool will also include other features: duplicate file detection, a colour-based visual search engine, a reverse image search engine, and a notification system that tells file contributors when their files are used on other sites. Fortuna will also provide an API for other tools to use.

Motivation

Wikimedia Commons currently has many copyright violations that go undetected. Copyright patrollers have few tools supporting them, and existing solutions such as OgreBot's new user upload log have been deprecated or removed.

Movement Strategy

Fortuna is aligned with two of the Movement Strategy goals:

Increase the Sustainability of Our Movement

Systematic approach to improve satisfaction and productivity

Assessing the needs of groups and volunteers, taking into account their local contexts for effective support and recognition of efforts.

I've already reached out to some contributors working in copyright, and will continue to do so. I also work in copyright myself!

Continuously engaging and supporting publicly diverse types of online and offline contributors.

People working in copyright are often underserved by technology; there are definitely fewer tools for copyright than for other areas.

Increased Wikimedia Awareness (prioritized initiative)

to secure the attention, trust, and interest of knowledge consumers

I think that my tool would help remove many copyright violations and increase trust in the licensing of the Movement's content.

Identify Topics for Impact

Identify Wikimedia's Impact (prioritized initiative)

Understand how our projects can be misused or abused by detecting threats with significant potential for harm

I certainly believe that copyright violations are a way that the projects can be abused!

Previous feature requests

2015 and 2022 Community Wishlist Survey, Phabricator task.

Features

Patrol

Web app (all features) and gadget (one-by-one info)

Fortuna Patrol is an interface similar to CopyPatrol for reviewing detected copyright violations. Each flagged file will be given a score indicating how likely Fortuna believes it is to be a copyright violation.

  • Earwig's Copyvio Detector
  • CopyPatrol

Implementation notes

  • Use copyviobot-esque system, with an extension?

Curio

Extension

Fortuna Curio is an interface to review images that may be duplicated across Commons and other projects. The name comes from the word "curate".
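
How duplicates would be detected is not decided here. One common approach is perceptual hashing, sketched below as a plain average hash. It assumes each image has already been decoded and downscaled to a small grayscale grid (for example 8×8, using a library such as Pillow); the function names and the `max_distance` threshold are illustrative only.

```python
from typing import List

Grid = List[List[int]]  # small grayscale thumbnail, values 0-255


def average_hash(gray: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in gray for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two thumbnails as probable duplicates if their hashes nearly match.

    max_distance is an assumed threshold that would need tuning on real files.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Tiny synthetic example: a gradient, a slightly brighter copy, and a negative.
original = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]
brighter = [[value + 3 for value in row] for row in original]
inverted = [[252 - value for value in row] for row in original]

print(likely_duplicates(original, brighter))
print(likely_duplicates(original, inverted))
```

Because the hash only records whether each pixel is above the mean, a uniform brightness change leaves it unchanged, while a different image flips many bits. Crops and rotations would need a more robust method.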

Implementation notes

Colorways

Extension

Fortuna Colorways allows for analysis of the colors in an image and searching images by color.

  • k-means clustering to find the dominant colors in an image; 10 should probably be the maximum number of dominant colors, similar to TinEye's recommendation
  • Use Elasticsearch or other database (e.g. Redis) to store images and retrieve similar results; inspiration from TinEye
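
The two notes above can be sketched in pure Python as follows. This is a minimal illustration, not the planned implementation: it assumes pixels arrive as a list of RGB tuples, uses a fixed random seed for repeatability, and replaces the Elasticsearch or Redis lookup with a linear scan over an in-memory dictionary. The names `dominant_colors` and `rank_by_color` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Color = Tuple[float, float, float]
Palette = List[Tuple[Color, float]]  # (centroid, share of pixels)


def squared_distance(a: Color, b: Color) -> float:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dominant_colors(pixels: List[Color], k: int = 10,
                    iterations: int = 20, seed: int = 0) -> Palette:
    """Run k-means and return up to k centroids, largest cluster first."""
    unique = list(dict.fromkeys(pixels))  # distinct colors, stable order
    k = min(k, len(unique))               # no more clusters than colors
    centroids = random.Random(seed).sample(unique, k)
    clusters: List[List[Color]] = []
    for _ in range(iterations):
        # Assignment step: attach every pixel to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            nearest = min(range(k),
                          key=lambda i: squared_distance(pixel, centroids[i]))
            clusters[nearest].append(pixel)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(ch) / len(c) for ch in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    palette = [(centroids[i], len(c) / len(pixels))
               for i, c in enumerate(clusters) if c]
    return sorted(palette, key=lambda item: item[1], reverse=True)


def rank_by_color(query: Color, palettes: Dict[str, Palette]) -> List[str]:
    """Order file names by how close their nearest dominant color is to query."""
    return sorted(
        palettes,
        key=lambda name: min(squared_distance(query, color)
                             for color, _ in palettes[name]),
    )


# Hypothetical two-file index built from synthetic pixel data.
index = {
    "red.jpg": dominant_colors([(255, 0, 0)] * 6 + [(0, 0, 255)] * 4),
    "green.jpg": dominant_colors([(0, 255, 0)] * 10),
}
print(rank_by_color((10, 240, 10), index))
```

The default of `k=10` follows the cap suggested above. Plain RGB distance is used only for brevity; a perceptual color space would likely match human judgement better.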

Discover

Mobile app integration and extension (with web interface and API in MW)

Fortuna Discover acts as a search engine, allowing users to find ("discover") content on Wikimedia projects by taking or uploading photos.

  • Needs an open-source implementation; the only solutions right now are services such as TinEye or Cloud Vision

Charisto

Extension

Fortuna Charisto allows contributors to subscribe to notifications of when their content is used online. The name Charisto comes from the Greek word for "thank you", "ef̱charistó̱".

  • Would it be cheaper to self-monitor using TinEye's general API?

Merido

Extension and FastAPI server

Fortuna will have an API named Merido that allows other tools to search for copyright violations without having to call a variety of separate APIs or secure additional funding. The name comes from the Greek word for "share", "merídio".

  • Plug in to GraphQL extension for GraphQL API support
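
As a rough idea of what "one API instead of many" could look like, the sketch below fans a single request out to several detection backends and merges the results into one JSON-friendly response. Everything here is an assumption for illustration: the `Finding` fields, the `check_file` function, the stub detectors, and the choice to report the highest individual score as the overall score. A FastAPI route could wrap a function like this, but no web framework is used in the sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    source: str   # hypothetical backend name
    score: float  # 0.0 (no match) to 1.0 (near-certain match)
    url: str      # where the matching copy was found


Detector = Callable[[str], List[Finding]]


def check_file(title: str, detectors: Dict[str, Detector]) -> dict:
    """Query every backend for one file and merge the results."""
    findings: List[Finding] = []
    errors: Dict[str, str] = {}
    for name, detector in detectors.items():
        try:
            findings.extend(detector(title))
        except Exception as exc:
            # One failing backend should not sink the whole request.
            errors[name] = str(exc)
    findings.sort(key=lambda finding: finding.score, reverse=True)
    return {
        "title": title,
        "score": findings[0].score if findings else 0.0,
        "findings": [asdict(finding) for finding in findings],
        "errors": errors,
    }


# Stub backends standing in for real reverse-image-search services.
def stub_search(title: str) -> List[Finding]:
    return [Finding("stub_search", 0.9, "https://example.org/copy")]


def stub_offline(title: str) -> List[Finding]:
    raise RuntimeError("backend down")


response = check_file(
    "File:Example.jpg",
    {"search": stub_search, "offline": stub_offline},
)
print(json.dumps(response, indent=2))
```

Callers see one response shape regardless of which backends are configured, and a backend outage shows up under `errors` rather than failing the request.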

Grafeas

Extension integration with Wikisource OR gadget OR separate extension

Identify text in images through OCR, and link to Earwig's copyright tool to check if they are copyvios.

Aspis

Gadget, extension and mobile integration

Hide potentially explicit images on pages and file description pages.

  • Google API currently; any other solutions?
  • Safe in Greek

Syntomo

Extension integration

Generate file captions/descriptions/depicts statements/categories.

Implementation

Technology

Fortuna will utilize a variety of APIs for detection in order to maximize results.

Services used:

Fortuna will be written in Elm for safety and quality assurance. The frontend will use OOUI to match with the interfaces of Wikimedia projects and to allow for easier user onboarding. The tool will be hosted on Toolforge and use ToolsDB for database storage and caching.

Funding
