Lingua Libre/Supports

Lingua Libre/Supports gathers all projects in which Wikimédia France or other actors provided human support to the Lingua Libre project. It aims to give a quick view of each initiative's human resources, objectives, end results and any associated reports or documents. The most original and relevant volunteer initiatives may also be documented below. For smaller events, please visit the Lingualibre Events page as a first entry point, or contact the community.


Shtooka recorder
Powered by Nicolas Vion
Contact: Nicolas Vion (developer), Yug

Starting around 2004(?), Nicolas Vion, a full-stack developer and French student of the Ukrainian language, developed a rapid vocabulary recording system to provide audio for his personal learning resources.

At the time, audio recording of words was tedious and messy: speakers mostly turned to Audacity, clicking "Record" then "Stop" to mark the start and end of each recording, which resulted in irregular recordings that then had to be renamed after the word recorded.

Shtooka offered a roughly hundred times faster and much cleaner computer-assisted way to record thousands of words from a list provided by the user. The system included a form to define the language recorded, speaker name, gender, list name and other metadata common to all the words recorded. Sound level analysis used an audio level threshold to distinguish irrelevant silences from the moments when words were actually pronounced. Users could specify time margins to keep at the start and end of each recorded word. The software produced regularly framed audio files, one file per recorded item, with human-friendly filenames and embedded metadata.
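
The threshold-and-margin logic described above can be illustrated with a minimal Python sketch. This is not Shtooka's actual code (Shtooka is C/C++ software): the function names, the list of per-frame sound levels and the filename pattern are all assumptions made for this illustration.

```python
from typing import List, Tuple


def split_words(levels: List[float], threshold: float, margin: int = 0) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, one per detected word.

    A word is a run of consecutive frames whose level is at or above
    `threshold`. Each run is widened by `margin` frames on both sides
    and clamped to the bounds of the recording.
    """
    segments = []
    start = None  # index of the first loud frame of the current run, if any
    for i, level in enumerate(levels):
        if level >= threshold and start is None:
            start = i  # silence -> sound: a word begins
        elif level < threshold and start is not None:
            # sound -> silence: the word ends at frame i (exclusive)
            segments.append((max(0, start - margin), min(len(levels), i + margin)))
            start = None
    if start is not None:
        # the recording stopped while a word was still being pronounced
        segments.append((max(0, start - margin), len(levels)))
    return segments


def file_name(language: str, speaker: str, word: str) -> str:
    """Build a human-friendly filename (hypothetical pattern)."""
    return f"{language}-{speaker}-{word}.wav"


# Hypothetical per-frame levels: two words separated by silence.
levels = [0.0, 0.1, 0.8, 0.9, 0.1, 0.0, 0.7, 0.6, 0.0]
print(split_words(levels, threshold=0.5, margin=1))
```

With wide margins, neighbouring ranges can touch or overlap; this sketch does not merge them.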

While very productive, the program was desktop C/C++ software that was tedious to install and most likely required an advanced user to do so.

By the mid-2010s, Shtooka was niche software to download, install and use, mostly by language teachers and technophiles. An online service, SWAC Collection, allowed users to publish and share about 300,000 recordings back with the online community.

In 2013, Yug, a Wikimedian contributor and PhD candidate in Chinese teaching, vocabulary acquisition and e-learning, noticed Shtooka's openly licensed resources, then met and befriended Nicolas. Yug enthusiastically used the new versions of the tools, providing feedback and bug reports, and recording Chinese vocabulary in INALCO's recording studio for the adaptive learning web app CatIsSmart. Yug also promoted the tool to INALCO academics, Grenoble university, Wikimedia France and Wikimedians, seeking collaborations, funding for Shtooka recorder or a hire, and the development of a more practical online version of the rapid recording tool. In 2015, Yug introduced Nicolas Vion to Rémy Gerbet.

2016 DGLFLF / Strasbourg

Lingua Libre PHP
Powered by Wikimédia France and Shtooka recorder
Webpage: Github ; Lingualibre.fr
Mentor: Rémy Gerbet WMFr, Adélaïde Calais WMFr, Lyokoï
Contact: Nicolas Vion (developer)

With secured funding from Strasbourg University and DGLFLF (?), Nicolas Vion was hired to create an online, collaborative version of Shtooka recorder. The Wikimedia-backed project was eventually branded « Lingua Libre ».

Rémy 2015

DGLFLF had already supported Wikimedia France in previous years. In 2014, Rémy is hired on a Service civique, working among other things on:

  • Listing regional offices or actors promoting local languages
  • 24 August 2014: Wikimedia France and Jean-Louis Barreau from APLLOD (Association pour la Promotion des Langues via la Lexicographie et l'Open Data) open a partnership. A mockup for a recording tool is submitted.
  • 2015: « Une enquête sur les pratique linguistes et numériques » (a survey on linguistic and digital practices)
  • 2015-07-30: the first DGLFLF partnership provides Wikimedia France with 15,000€ for both a seminar on French languages and a mobile app promoting those languages. On September 23rd, WMFR adds 10,000€ to lead this effort.
  • 23 January 2016: « Congrès des Langues de France ». Discussions take place on which kind of application to build. Yug brings Nicolas Vion, with a plan for an online word recorder based on Shtooka.

Soon after, Rémy is hired by WMFr. DGLFLF funds are then directed toward Nicolas Vion and the porting of his desktop Shtooka Recorder to a web version, which became LinguaLibre.fr. On March 21st 2016, Nicolas Vion is hired by Wikimedia France and soon starts working. The first formal contributions are made on April 20th, via https://lingualibre.fr, during a dedicated workshop in Strasbourg with OLCA (Office pour la Langue et Culture Alsacienne). In May 2016, as the web app proves reliable, Rémy creates fr:Project:Lingua Libre. In September 2016, Rémy creates fr:Projet:Langues de France, focused on the related workshops and on content created or to be created.

Soon after, the following communities used LinguaLibre.fr to record audio:

On July 3rd, 2017, WMFR and partners held an end-of-project ceremony at the Maison de l'Alsace, Paris.

2016 service civique on Francophonie with part-time on Lingualibre.

Occitan languages projects

To be completed by User:Yug.

2018 Wikimedia Foundation Grant

Lingua Libre Service Civique 2022.
Powered by Wikimédia France
Lingua Libre
Webpage: Blog post by WMFr, 2021
Mentor: Adélaïde Calais WMFr

2019 service civique on Lingualibre (Interview)

  • Contact with French regional languages groups
  • Contact with Lo Congrès (Occitan)
  • Recommendations for future actions: expand contribution tools, promotion tools for language consumers (sound library), UI redesign
  • Coordination with 0x010C for 2019 RecordWizard's UI redesign
  • Relations with INALCO, initiating ContribuLing
  • Relations with Plateforme Atlas

2021 Wikivalley freelance

Wikimedia France
WikiValley service contract 2021.
Powered by Wikimédia France
Lingua Libre
Webpage: Documentation (link to be found)
Mentor: Adélaïde Calais WMFr

2021 Campus Digital

VueJS recordings checker
Powered by Wikimédia France and Toulouse Digital Campus
Start: December 2021
Ended: December 2021
Mentor: Adélaïde Calais WMFr

In December 2021, Digital Campus led a one-week hackathon around Lingua Libre. The brief was to build a practical interface for listening to existing recordings and tagging them. To simplify the exercise, the interface was not connected to Lingua Libre; it only created a file storing the reviewed recordings and their tags.
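
The file-based approach can be sketched in a few lines. The hackathon prototype itself was a VueJS interface and its code is not reproduced here; the Python function, the JSON layout and the file and tag names below are hypothetical, chosen only to show how reviewed recordings and their tags could be kept in a single file without touching Lingua Libre.

```python
import json
from pathlib import Path
from typing import Dict


def tag_recording(store: Path, recording: str, tag: str) -> Dict[str, str]:
    """Save `tag` for `recording` in a JSON file and return all reviews so far."""
    # Load earlier reviews if the file already exists, else start empty.
    reviews = json.loads(store.read_text(encoding="utf-8")) if store.exists() else {}
    reviews[recording] = tag  # tagging the same recording again overwrites its tag
    store.write_text(json.dumps(reviews, ensure_ascii=False, indent=2), encoding="utf-8")
    return reviews


if __name__ == "__main__":
    # Hypothetical file and tag names, for illustration only.
    store = Path("reviewed_recordings.json")
    tag_recording(store, "example-recording-1.wav", "good")
    print(tag_recording(store, "example-recording-2.wav", "noisy"))
```

A flat file keeps the exercise self-contained; feeding the tags back to Lingua Libre would be a separate step.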

The proof of concept UI, coded in 4 days, has not been pushed further.

Lingua Librist in Residence - 2IF

Lingua Librist in Residence for DDF - 2IF
Powered by Wikimédia France and Institut International pour la Francophonie (2IF)
Webpage: Blog post by WMFr
Start: May 2019
Ended: September 2021
Mentor: Sebleouf, Noé

A 4-month internship at Institut International pour la Francophonie (Lyon, France), as a Lingua Librist in Residence for the Dictionnaire des Francophones project.

  • The main objective was to integrate Lingua Libre audio recordings into the Dictionnaire des francophones, and to improve Lingua Libre for future recordings.
  • Participation in community and technical discussions, proposals for changes to the website, and organization of a two-day hackathon for Lingua Libre developers.
  • Classification of DDF entries associated with regions into approximately 250 different Lingua Libre lists.
  • Presentation of Lingua Libre at two international events during the summer: ContribuLing and Wikimania.

MSc. Computer Science class of Toulouse

Lingua Libre audio analysis via Asteroid

2020 Cantonese project

2020 Cantonese project
Powered by User:Yug.
千方百计/No stone left unturned in Cantonese

巧克力/Chocolate in Chinese.

Webpage: Lingua Libre/Supports
Start: 2020-05 – first recordings
Ended: 2020-07 – most recordings done
Mentor: Yug (project lead)
See production on -other (Q9186) and/or -yue.

The 2020 Cantonese project was a test project led by Yug and junior Hong Kong sound engineer Luilui6666, which aimed to test paid contributions to audio-document a target language via the Lingua Libre tool.

A budget of 300€ for 10 hours (30€/h) was agreed upon. A few word lists (HSK) and video-call training on the Lingua Libre recording studio were provided to the recordist. The recordist provided professional equipment and know-how. Autopatrol user rights had to be requested. User:Yug sponsored this test with private funding.

The recordist worked occasionally from home, in short recording sessions, progressing steadily. This setup provided Luilui6666 with a welcome supplementary income on a flexible schedule.

The project was slowed down by a bug that sped up recordings of longer sentences, which Luilui6666 reported and, with Yug's guidance, inspected and identified more precisely, though through time-consuming exchanges (~3h).

Nevertheless, about 6,000 recordings (see -other (Q9186) and/or -yue) were produced by this enthusiastic and proactive paid contributor.

With more than 6,000 recordings completed within 3 months for a modest 300€, covering known core Sino-Cantonese vocabulary, and with both parties satisfied, the project was considered highly successful, validating paid contributions on Lingua Libre as an extremely productive avenue.
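
The figures above can be turned into unit costs with a short calculation. This is only a sketch on the stated numbers: it assumes the 10 budgeted hours as the time base, whereas the time actually spent (including the ~3h of bug investigation) is not documented here.

```python
# Figures stated in the text above.
budget_eur = 300    # total budget agreed upon
paid_hours = 10     # hours budgeted (actual time spent is not documented)
recordings = 6000   # approximate number of recordings produced

hourly_rate = budget_eur / paid_hours               # euros per budgeted hour
cost_per_recording = budget_eur / recordings        # euros per recording
recordings_per_paid_hour = recordings / paid_hours  # recordings per budgeted hour

print(f"{hourly_rate:.0f} EUR/h")
print(f"{cost_per_recording:.2f} EUR per recording")
print(f"{recordings_per_paid_hour:.0f} recordings per budgeted hour")
```

On these figures, each recording cost about 0.05€, that is 5 euro cents.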

A smaller 2022 Chinese project was also carried out.


Lingua Libre Service Civique 2022.
Mélody Xu YANG WMFr
Powered by Wikimédia France.
Lingua Libre
Webpage: Lingua Libre/Supports
Start: September 2022
Ended: March 17, 2023 (6 months)
Hierarchy: Adélaïde Calais WMFr, Rémy Gerbet WMFr

The 2022-2023 Service Civique on Lingualibre is a 6-month mission within Wikimedia France, Paris, aimed at advancing Lingualibre's outreach and partnerships.

A first phase would create supporting resources, testing and demonstrating communication avenues within the French ecosystem along the following axes:

  • Identify high-potential institutional partnerships for Langues de France
  • Design an outreach campaign and materials with Wikimedia France's staff; iterate with Lingualibre's community
  • Launch this campaign
  • Final report

These experiences will be leveraged to expand outreach to other communities demonstrating some activity and potential for growth.

2023 SignIt freelance

This freelance contract was funded from the budget of the French Ministry of Higher Education and Research's Wikimedian in residence at Toulouse University. Hugo en résidence proposed a lean, agile, small-scale freelance contract to former developer 0x010C to restore the video recording chain, which was done successfully.

2023 Wikirésidence

Urfist Occitanie
Wikimedian in residence. Hugo en résidence
Powered by MESR, Wikimédia France, URFIST.
Webpage: Lingua Libre/Supports
Start: February 13, 2023
Ended: February 12, 2024
Hierarchy: Mathieu Denel WMFr, Rémy Gerbet WMFr
Contact: Hugo en résidence

The Wikiresidence at Université de Toulouse / URFIST Occitanie allowed User:Hugo en résidence to lead several pushes on Lingualibre, with a dual volunteer and official role. The focus was on SignIt, communication and formal collaborations (Occitan Whistle). In 2024, about 1.5 person-months were dedicated to leading the Lingua Libre GSoC24 described further below.

Date | Event | City | Link
2023.05.28, 10:00–18:00 | Forom des langues 2023 | Toulouse | – In-person demonstration (booth)
2023.05.28, 14:00–16:00 | Toulouse Hack | Toulouse | – Information for a target audience
2023.07.29, 09:30–10:00 | COSCUP 2023 | Taipei | https://commons.wikimedia.org/wiki/File:Lingua_Libre_SignIt_presentation-2023-COSCUP_Taibei.pdf
2023.08.18, 17:40–17:45 | Wikimania 2023 | Singapore | https://commons.wikimedia.org/wiki/File:Lingua_Libre_SignIt_presentation-2023-Wikimania.pdf
2023.11.19, 10:30–11:00 | Capitole du Libre | Toulouse | https://cfp.capitoledulibre.org/cdl-2023/talk/3ZQZTR/
2023.11.20, 14:00–15:00 | CRL UT2J | Toulouse | https://commons.wikimedia.org/wiki/File:Lingua_Libre_URFI


Whistled Occitan

Occitan Gascon is a minority language of France with decent documentation and modest institutional support. At the forefront of this effort is Lo Congrès (https://locongres.org), most notable for its multiple Occitan dictionaries and its 25,000 Occitan recordings, 4,908 of which are in Gascon. Since 2023, Hugo en résidence, DMontagne en résidence and Univòc64 have led a complementary follow-up audio documentation effort on the whistled language of Aas. Other efforts include recording local village names to re-establish and recall local names within their traditional territories, and public communication at events and online.

Other supports

Poslovitch (1)

Lingua Libre Internship 2023. Poslovitch
Powered by Wikimédia France.
Lingua Libre
Webpage: Lingua Libre/Supports
Start: April 17, 2023
Ended: August 2023 (4 months)
Hierarchy: Adélaïde Calais WMFr, Michael, Rémy Gerbet WMFr

The 2023 Computer Science Internship on Lingualibre was a 4-month internship within Wikimedia France, Paris. One sub-mission aimed to audit the feasibility of migrating Lingualibre's wikibase and its 800,000 items to the now available Wikimedia Commons wikibase.

Poslovitch (2)

Lingua Libre Internship 2023. Poslovitch
Powered by Wikimédia France.
Lingua Libre
Webpage: Lingua Libre/Supports
Start: September 2023
Ended: April 2024 (4 months)
Hierarchy: Adélaïde Calais WMFr, Michael Barbereau WMFr, Rémy Gerbet WMFr

The Lingua Libre v3 proof of concept aimed to demonstrate Lingualibre on a minimal Python Django and JS (VueJS, Vite, NodeJS) stack. A healthy VueJS structure, unit tests and best-practice UML documentation were laid down, as well as a leaner back end using MariaDB, successfully demonstrating the project's feasibility. An Ansible deployment system was also developed. Recommendations: more work is needed to migrate other features and behaviors from the legacy code base to this leaner stack.

Google Summer of Code 2024

The designs and applications for these 2 projects were initiated by Yug, with User:Poslovitch and Ishan Saini as technical co-mentors. The project application workload was about 4~6 workdays over a month, mostly to understand the scope, how and where to proceed, and to write the project descriptions. Applicant reviews (~8 people), communication, issue assignment and assessment took about 6 workdays over 3 weeks. The coding period required 1.5 workdays a week from Yug and 2~4h/week from the technical mentors. Kabir worked earlier, in May and June, while Pushkar worked as encouraged in June, July and August 2024. Interns were funded by Google. Yug mentored mostly within the scope of his Open Science job funded by MESR France and URFIST Occitanie, while Ishan and Poslovitch mentored as volunteers, which explains the different workloads.

Lingua Libre Django migration

Urfist Occitanie
Lingua Libre GSoC24
Powered by URFIST Occitanie and Google Summer of Code.
Webpage: Lingua Libre/Supports
Start: 2024-02 – application
2024-06 – coding
Ended: 2024-09 – closure
Mentor: Yug (project lead), Poslovitch (tech lead)

In the field of language diversity, the Wikimedia Foundation and Wikimedia France have supported LinguaLibre.org, a single-page VueJS application for rapidly recording the vocabularies of the world. Over 240 languages and 1.2 million words have been audio recorded into Wikimedia sites through this open project. The current back end (wikibase, PHP, blazegraph), while interesting, has shown limitations, mostly limited query speed, no API, stack opacity and duplication of data. A revamp has been started but requires further full-stack work to be migrated into a maintainable code base, upgraded into an elegant service and pushed into production for willing native speakers.

Lingua Libre SignIt

Urfist Occitanie
Lingua Libre SignIt GSoC24
Powered by URFIST Occitanie and Google Summer of Code.
Webpage: Lingua Libre/Supports
Start: 2024-02 – application
2024-06 – coding
Ended: 2024-09 – closure
Mentor: Yug (project lead), Ishan Saini (co-tech lead)
Contact: Yug, Kabir
Proposal ; Sum up ; Report ; Github logs.

Lingua Libre's mission was extended to sign languages in 2019. Both a click-and-translate Firefox extension and a video recording studio have been developed. Both systems' UIs exist in 35+ languages, allowing the global documentation and learning of various sign languages. As Manifest v2.0 extensions are being phased out, the project is under threat. A full revamp to Manifest v3.0 and a modern extension structure would make the project compatible with all web browsers. This project must navigate updated in-browser web extension security constraints and the new web extension API.

Indonesian languages recording project

See also WikiKata, Malaysia.
Lingua Libre
Indonesian languages recording projects (Ardzun)
Powered by Wikimedia Indonesia.
Webpage: WikiTutur on Wikikamus
Ended: Ongoing
Contact: Ardzun (lead), Yug (support)

WikiTutur is a language preservation program that records vocabulary pronunciation using Wiktionary and Lingua Libre. The project was previously run in Indonesia by volunteers from the Jakarta Wikimedia Community in 2023-2024, in collaboration with other local Wikimedia communities in Indonesia.

More than 30 Indonesian languages have been audio recorded, making it the most active Lingua Libre group as of early 2024. The project referent is Ardzun. Occasional second-level support, guidance and language creation is provided by Yug, mostly on Discord and Lingualibre.org.

Youth voices of Roussillon

Youth voices of Roussillon (Culex)
Powered by Médiathèque of Canet-en-Roussillon.
Ended: Ongoing
Contact: Culex (lead)

Youth voices of Roussillon is an educational project using Wiktionary and Lingua Libre as contribution tools for local youth. Led by the public médiathèque (library) of Canet-en-Roussillon, via its director, who is also a Wikipedia administrator, in a positive and encouraging atmosphere, it aims to value the students and allow them to discuss words, record them and document them on well-known digital commons.

Wikimedian administrator User:Culex initiated the project in 2023 thanks to his professional position and good relations with local schools. Thanks to his deep prior knowledge of Wikimedia projects, this local project was essentially led by Culex without the usual secondary support.

Pushkar Winter 24-25

Wikimedia France
Lingua Libre Django Freelance 1
Powered by Wikimedia France
Webpage: Lingua Libre/Supports
Start: 2024-10 – discussion
2025-01 – coding
Hierarchy: Xavier Cailleau WMFr

Google Summer of Code 24's Django developer Pushkar was hired as a freelancer by Wikimedia France for a week-long code-quality sprint. The immediate objectives were to

  1. demonstrate the feasibility of an extra-European Union freelance
  2. activate internationalization (i18n)
  3. split the code into a branding website (lingualibre.org) and the recording app (Lingua Libre App) repositories
  4. add other minor features.

The medium-term objective was to progress further toward deploying Lingua Libre Django in 2025. The long-term objective was to ease the distribution of coding workload between Wikimedian volunteers and freelancers, while also diversifying freelancing options. This fully fits the general objective of Lingua Libre Django, which primarily aims to simplify the stack and lower the barrier to technical maintenance.

Wikimedia Hackathon 2025

Yug mentored the 3 young developers who are now the lead developers on Lingua Libre and Lingua Libre/SignIt in applying as first-timers for the WM Hackathon 2025 scholarship. The applying team is as follows:

  • Pushkar: focus on Lingua Libre Django deployment, polishing, and announcement. Note: this is the heaviest project, so possible WMFR support around 2025 Q1 is also being explored.
  • GonFreeaks: focus on Lingua Libre SignIt deployment (?) and announcement. Also sign languages extension via scraping, with Yug.
  • SalthyKheera: support to Pushkar, following and learning, but also exploring the WM deployment chain (Gitlab -> WM clouds) and unit tests.
  • Yug: support to the group, helping where needed. Teams up with Kabir for signed video scraping.

Results expected in early 2025.

Material grants
