Wikimedia Foundation Annual Plan/2024-2025/Product & Technology OKRs
This document represents the first part of the annual planning process for the Wikimedia Foundation's Product & Technology department. It describes the department's "objectives and key results" (OKRs). This is a continuation of the work portfolio structure (nicknamed "buckets") that was started last year.

I addressed you all back in November about what I believe is the most pressing question for the Wikimedia movement: how do we ensure that Wikipedia and all the other Wikimedia projects are multigenerational? I would like to thank everyone who took the time to consider that question and respond to me directly, and now that I have had a chance to spend time reflecting on your responses, I will share what I have learned.
First, there is no single reason why volunteers contribute. To nurture multiple generations of volunteers, we need to better understand the reasons why people give their time to our projects. Next, we need to focus on what sets us apart: our ability to provide trustworthy content as disinformation and misinformation proliferate around the internet and on platforms competing for the attention of new generations. This includes ensuring that we achieve the mission of assembling and delivering the sum of all human knowledge to the world by expanding our coverage of missing information, gaps that can be caused by inequity, discrimination, or bias. Our content also needs to serve and remain vital in a changing internet driven by artificial intelligence and rich experiences. Lastly, we need to find ways to sustainably fund our movement by building a shared strategy for our products and revenue so that we can support this work for the long term.
These ideas will be reflected in the Wikimedia Foundation's 2024-2025 annual plan, the first part of which I am sharing with you today in the form of our draft product & technology objectives. These objectives build on ideas we have been hearing from community members over the last several months through Talking:2024, on mailing lists and talk pages, and at community events about our product and technology strategy for the year ahead. You can view the full list of draft objectives below.
"Objectives" are high-level direction that will shape the product and technology projects we take on for the next fiscal year. They are intentionally broad. They represent the direction of our strategy and, importantly, which challenges we are proposing to prioritize among the many possible focus areas for the upcoming year. We are sharing this now so that community members can help shape our early-stage thinking before budgets and measurable targets are committed for the year.
Feedback
One area where we would especially like feedback is our work grouped under the name "Wiki Experiences." "Wiki Experiences" is about how we deliver, improve, and innovate on the ways people directly use the wikis, whether as contributors, consumers, or donors. This involves work to support our core technology and capabilities, and making sure we can improve the experience of volunteer editors - in particular, editors with extended rights - through better features and tools, translation services, and platform upgrades.
Here are some reflections from our recent planning discussions, and some questions for you all to help us refine our ideas:
- Volunteering on the Wikimedia projects should feel rewarding. We also think that the experience of online collaboration should be a major part of what keeps volunteers coming back. What does it take for volunteers to find editing rewarding, and to work better together to build trustworthy content?
- The trustworthiness of our content is part of Wikimedia's unique contribution to the world, and it is what keeps people coming to our platform and using our content. What can we build that would help grow trustworthy content more quickly, but still within the quality guardrails set by communities on each project?
- To stay relevant and compete with other large online platforms, Wikimedia needs a new generation of consumers to feel connected to our content. How can we make our content easier to discover and interact with for readers and donors?
- In an age where online abuse thrives, we need to make sure our communities, platform, and serving system are protected. We also face evolving compliance obligations, where global policymakers look to shape privacy, identity, and information sharing online. What improvements to our abuse-fighting capabilities will help us address these challenges?
- MediaWiki, the software platform and interfaces that allow Wikipedia to function, needs ongoing support for the next decade in order to provide creation, moderation, storage, discovery, and consumption of open, multilingual content at scale. What decisions and platform improvements can we make this year to ensure that MediaWiki is sustainable?
Objectives
The level of planning currently published is the "Objectives".
The next level, the "Key Results" (KRs) for each finalized objective, is provided below.
The underlying "Hypotheses" for each KR will be updated below, and on the relevant project or team wiki pages, throughout the year as lessons are learned.
Wiki Experiences (WE) objectives | ||||
---|---|---|---|---|
Objective ID | Objective area | Objective | Objective context | Owner |
WE1 | Contributor experience | Both experienced and new contributors rally together online to build a trustworthy encyclopedia, with more ease and less frustration. | For Wikipedia to be vital in the years to come, we must do work that nurtures multiple generations of volunteers and makes contributing something people want to do. Different generations of volunteers need different investments - experienced contributors need their powerful workflows to be streamlined and repaired, while newer contributors need new ways to edit that make sense to them. And across these generations, all contributors need to be able to connect and collaborate with each other to do their most impactful work. With this objective, we will improve critical workflows for experienced contributors, we will lower barriers to constructive contributions for newcomers, and we will invest in ways that volunteers can find and communicate with each other around common interests. | Marshall Miller |
WE2 | Encyclopedic content | Communities are supported to effectively close knowledge gaps through tools and support systems that are easier to access, adapt, and improve, ensuring increased growth in trustworthy encyclopedic content. | Encyclopedic content, primarily on Wikipedia, can be increased and improved through continuous engagement and innovation. Tools and resources (both technical and non-technical) that are available for contributors to use for their needs can be made more discoverable and reliable. These tools should be better supported by WMF, through feature improvements achievable in short cycles. In view of recent developments around AI-assisted content generation and changing user behaviour, we will also explore groundwork for substantial changes (e.g. Wikifunctions) that can assist scaled growth in content creation and reuse. Mechanisms to identify content gaps should be easier to discover and to plan with. Resources that support the growth of encyclopedic content, including content on sister projects, projects such as the Wikipedia Library, and campaigns, can be better integrated with contribution workflows. At the same time, methods used for growth should have guardrails against growing threats, which can ensure continued trust in the process while staying true to the basic tenets of encyclopedic content as recognised across Wikimedia projects.
Audience: Editors, Translators |
Runa Bhattacharjee |
WE3 | Consumer experience (Reading & Media) | A new generation of consumers arrives at Wikipedia to discover a preferred destination for discovering, engaging, and building a lasting connection with encyclopedic content. | Goal:
Retain existing consumers and donors. Increase relevance to existing and new generations of consumers by making our content easier to discover and interact with. Work across platforms to adapt our experiences and existing content, so that encyclopedic content can be explored and curated by and for a new generation of consumers and donors. |
Olga Vasileva |
WE4 | Trust & Safety | Improve our infrastructure, tools, and processes so that we are well-equipped to protect the communities, the platform, and our serving systems from different kinds of scaled and directed abuse while maintaining compliance with an evolving regulatory environment. | Some facets of our abuse-fighting capabilities need an upgrade. IP-based abuse mitigation is becoming less effective, several admin tools need efficiency improvements, and we need to put together a unified strategy that helps us combat scaled abuse by using the various signals and mitigation mechanisms (captchas, blocks, etc.) in concert. During this year, we will begin making progress on the largest problems in this space. Additionally, this investment in abuse protection has to be balanced by an investment in understanding and improving community health, several aspects of which are included in various regulatory requirements. | Suman Cherukuwada |
WE5 | Knowledge Platform I (Platform Evolution) | Evolve the MediaWiki platform and its interfaces to better meet Wikipedia's core needs. | MediaWiki has been built to enable the creation, moderation, storage, discovery, and consumption of open, multilingual content at scale. In this second year of Knowledge Platform, we will take a look at the system and begin working to improve the platform so that it effectively supports the Wikimedia projects over the next decade, starting with Wikipedia. This includes continuing work to define our knowledge production platform, strengthening the sustainability of the platform, a focus on the extensions/hooks system to clarify and streamline feature development, and continuing to invest in knowledge sharing and in enabling people to contribute to MediaWiki. | Birgit Müller |
WE6 | Knowledge Platform II (Developer Services) | Technical staff and volunteer developers have the tools they need to effectively support the Wikimedia projects. | We will continue the work started to improve (and scale) the development, testing, and deployment workflows in Wikimedia Production, and expand the definition to include services for tool developers. We also aim to improve our ability to answer frequently asked questions in the area of developer/engineering workflows and audiences, and to provide the data needed to enable decision-making. Part of this work is to look at practices (or the lack of them) that currently pose a challenge to our ecosystem. | Birgit Müller |
Signals and Data Services (SDS) objectives | ||||
---|---|---|---|---|
Objective ID | Objective area | Objective | Objective context | Owner |
SDS1 | Shared insights | Our decisions about how to support the Wikimedia mission and movement are informed by high-level metrics and insights. | In order for us to effectively and efficiently build technology, support volunteers, and advocate for policies that protect and advance access to knowledge, we need to understand the Wikimedia ecosystem and align on what success looks like. This means tracking a common set of metrics that are reliable, understandable, and available in a timely manner. It also means surfacing research and insights that help us understand the whys and hows behind our measurements. | Kate Zimmerman |
SDS2 | Experimentation platform | Product managers can quickly, easily, and confidently evaluate the impact of product features. | To enable and accelerate data-informed decision making about product feature development, product managers need an experimentation platform in which they can define features, select treatment audiences of users, and see measurements of impact. Speed from launch to analysis is critical, as shortening the timeline for learning will accelerate experimentation and, ultimately, innovation. Manual tasks and bespoke approaches to measurement have been identified as barriers to speed. The ideal scenario is that product managers can get from experiment launch to discovery with little or no manual intervention from engineers and analysts. | Tajh Taylor |
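As a rough illustration of the "select treatment audiences" step named in SDS2, the sketch below shows one common way an experimentation platform could assign users to variants deterministically by hashing. It is not the Foundation's implementation: the function name `assign_variant`, the experiment name, and the user identifiers are all hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Map a user to a variant deterministically (hypothetical helper).

    Hashing the experiment name together with the user ID means the same
    user always sees the same variant within one experiment, while
    different experiments split users independently of each other.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid, "example-experiment"))
```

Because the assignment is a pure function of its inputs, no per-user state has to be stored in order to keep a user in the same group across visits.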
Future Audiences (FA) objectives | ||||
---|---|---|---|---|
Objective ID | Objective area | Objective | Objective context | Owner |
FA1 | Test hypotheses | Provide recommendations on strategic investments for the Wikimedia Foundation to pursue - based on insights from experiments that sharpen our understanding of how knowledge is shared and consumed online - that help our movement serve new audiences in a changing internet. | Due to ongoing changes in technology and online user behavior (e.g., increasing preference for getting information via social apps, the popularity of short-form video edu-tainment, the rise of generative AI), the Wikimedia movement faces challenges in attracting and retaining readers and contributors. These changes also bring opportunities to serve new audiences by creating and delivering information in new ways. However, we as a movement do not have a clear picture of the benefits and tradeoffs of the different strategies we could pursue to overcome the challenges or seize the new opportunities. For example, should we...
To ensure that Wikimedia becomes a multigenerational project, we will test hypotheses to better understand and recommend promising strategies - for the Wikimedia Foundation and the Wikimedia movement - to pursue in order to attract and retain future audiences. |
Maryana Pinchuk |
Product and Engineering Support (PES) objectives | ||||
---|---|---|---|---|
Objective ID | Objective area | Objective | Objective context | Owner |
PES1 | Efficiency of operations | Make the Foundation's work faster, cheaper, and more impactful. | Staff do a lot of things in their regular day-to-day work to make our operations faster, cheaper, and more impactful. This objective highlights specific initiatives that will both a) make substantial gains toward faster, cheaper, or more impactful work; and b) take coordinated effort and changes to formal and informal practices at the Foundation. Essentially, the KRs included in this objective are the hardest and best improvements we can make this year to the operational efficiency of work that touches our products and technology. | Amanda Bittaker |
Key Results
The "Key Results" (KRs) for each finalized objective are listed here. They correspond to each of the objectives above.
The underlying "Hypotheses" for each KR are published below on this page and will be updated on the relevant project or team wiki pages throughout the year as lessons are learned.
Wiki Experiences (WE) Key Results
[ Objectives ] | |||
---|---|---|---|
Key Result shortname | Key Result text | Key Result context | Owner |
WE1.1 | Develop or improve one workflow that helps contributors with common interests to connect with each other and contribute together. | We think community spaces and interactions on the wikis make people happier and more productive as contributors. Additionally, community spaces help onboard and mentor newcomers, model best practices of contributing, and help address knowledge gaps. However, the existing resources, tools, and spaces that support human connection on the wikis are subpar and do not meet the challenges and needs of most editors today. Meanwhile, the work of the Campaigns team has demonstrated that many organizers are eager to adopt and experiment with new tools with structured workflows that help them in their community work. For these reasons, we want to focus on encouraging and fostering a sense of belonging among contributors on the wikis. | Ilana Fried |
WE1.2 | Constructive Activation: Widespread deployment of interventions shown to collectively cause a 10% relative increase (y-o-y) on mobile web and a 25% relative increase (y-o-y) on iOS of newcomers who publish ≥1 constructive edit in the main namespace on a mobile device, as measured by controlled experiments. Note: this KR will be measured on a per-platform basis. |
The current full-page editing experience requires too much context, patience, and trial and error for many newcomers to contribute constructively. To support a new generation of volunteers, we will increase the number and availability of smaller, structured, and more task-specific editing workflows (e.g. Edit Check and Structured Tasks).
Note: Baselines will only be established toward the end of Q4 of the current FY, after which the target percentages for this KR will also be established. |
Peter Pelberg |
WE1.3 | Increase user satisfaction of 4 moderation products by 5pp each. | Editors with extended rights use a wide range of existing features, extensions, tools, and scripts to perform moderation tasks on the Wikimedia projects. This year we want to focus on improving this tooling, rather than undertaking projects to build new functionality in this space. We hope to touch a number of products over the course of the year, and we want to make impactful improvements to each. In doing so, we hope to improve the overall experience of content moderation.
We will define baselines for the common moderation tools that we may target with this work, so that we can determine the increase in satisfaction for each tool. The Community Wishlist will be a substantial contributor to deciding the priorities for this KR. |
Sam Walton |
WE2.1 | By the end of Q2, organizers, contributors, and institutions have 3 relevant and suitable starting points for increasing topic coverage in key topic areas, namely Gender (women's health, women's biographies) and Geography (biodiversity). | This KR is about improving topic coverage to reduce existing knowledge gaps. We have established that communities benefit from effective tools paired with targeted campaigns to improve content quality across our projects. This year we want to focus on improving existing tools and experimenting with new ways of prioritizing key topics that address knowledge gaps. | Purity Waigi & Fiona Romeo |
WE2.2 | By the end of Q2, implement and test two recommendations, both social and technical, to support language onboarding for small language communities, with an evaluation to analyze community feedback. | There are Wikipedia editions in around 300 languages. Yet there are many more languages, spoken by millions of people, that have no Wikipedia or any wiki at all. This is a blocker to fulfilling our vision: that every human being can freely share in the sum of all knowledge. The Wikimedia Incubator is where potential Wikimedia project wikis in new language versions can be arranged, written, tested, and proven worthy of being hosted by the Wikimedia Foundation. The Incubator was launched in 2006 with the expectation that its users would have prior wiki editing knowledge. This problem is exacerbated by the fact that the process is meant to be carried out mostly by people who are the newest and least experienced in our movement. While editing on Wikimedia wikis has improved significantly since then, the Incubator has not received these updates because of technical limitations. Currently, it takes several weeks for a wiki to graduate from the Incubator, and only about 12 wikis are created each year, which indicates a significant bottleneck.
A range of research and resources reveals technical challenges at every stage of language onboarding: adding new languages to the Incubator, the complexity of developing and reviewing content, and the slow process of creating a wiki site when a language graduates. Each phase is slow, manual, and complex, indicating the need for improvement. Addressing this problem will enable wikis in new languages to be created more quickly and easily, allowing more people to share knowledge. Various stakeholders, existing research, and resources have highlighted proposed social and technical recommendations. This key result proposes to test two of those recommendations, social and technical, and to evaluate community feedback. |
Satdeep Gill & Mary Munyoki |
WE2.3 | By the end of Q2, 2 new features guide contributors to add source materials that comply with project guidelines, and 3-5 partners have contributed source material that addresses language and geographic gaps. | To improve access to the quality source materials needed to close content gaps, we will:
|
Fiona Romeo & Alexandra Ugolnikova |
WE2.4 | By the end of Q2, enable Wikifunctions calls on test2wiki to provide a more scalable way to seed new content. | To narrow our knowledge gaps effectively, we need to improve the workflows that support scaled growth in quality content, especially in smaller language communities. | Amy Tsay |
WE2.5 | By the end of Q4, organizers, contributors, and institutions are supported to increase the coverage of quality content in key topic areas like Gender (women's health, women's biographies) and Geography (biodiversity) by 138 articles through experiments. | A direct follow-up to WE2.1, this KR is about improving topic coverage to reduce existing knowledge gaps. We have established that communities benefit from effective tools paired with targeted campaigns to improve content quality across our projects. This year we want to focus on improving existing tools and experimenting with new ways of prioritizing key topics that address knowledge gaps. | Purity Waigi & Satdeep Gill |
WE3.1 | Release two curated, accessible, and community-driven browsing and learning experiences to representative wikis, with the goal of increasing the reader retention of users of these experiences by 5%. | This KR focuses on retaining a new generation of readers on our website, allowing that generation to build a lasting connection with Wikipedia, by exploring opportunities for readers to more easily discover and learn from content they are interested in. This will include exploring and developing new curated, personalized, and community-driven browsing and learning experiences (for example, feeds of relevant content, topical content recommendations and suggestions, community-curated content exploration opportunities, etc.).
We plan to begin the fiscal year by running a series of experiments with browsing experiences to determine which ones we would like to scale for production use, and on which platform (web, apps, or both). We will then focus on scaling these experiments and testing their efficacy in increasing retention in production environments. Our goal by the end of the year is to launch at least two experiences on representative wikis and to accurately measure a 5% increase in reader retention for readers who engage with these experiences. To be most effective in achieving this KR, we will need the ability to A/B test with logged-out users, as well as instrumentation capable of measuring reader retention. We may also need new APIs or services to present recommendations and other curation mechanisms. |
Olga Vasileva |
WE3.2 | 50% increase in the number of donations via touchpoints outside of the annual banner and email appeals, per platform. | Our goal is to provide a diversity of revenue sources while recognizing our donors. Based on feedback and data, our focus is on increasing the number of donations beyond the methods the Foundation has relied on in the past, specifically the annual banner appeals. We want to show that by investing in more integrated donor experiences, we can sustain our work and expand our impact by providing an alternative for donors and potential donors who are unresponsive to banner appeals. The 50% is an initial estimate based on the decreased visibility of the donate button on Web as a result of Vector 2022, and on the increase in the number of donations from the FY 2023-2024 pilot project on the Wikipedia apps to enhance the donor experience (50.1% increase in donations). Evaluating this metric by platform will help us understand trends across platforms and whether different tactics should be deployed in the future depending on differences in behavior between platform audiences. | Jazmin Tanner |
WE3.3 | By the end of Q2 2024-25, volunteers will start converting legacy graphs to the new graph extension on production Wikipedia articles. | The Graph extension has been disabled for security reasons since April 2023, leaving readers unable to view many graphs that community members have invested time and energy into over the last 10 years.
Data visualization plays a role in creating encyclopedic content, so in FY 2024-25 we will build a new secure service to replace the Graph extension that will handle the majority of simple data visualization use cases on Wikipedia article pages. This new service will be built in an extensible way so that it can support more sophisticated use cases if WMF or community developers choose to add them in the future. We will know we have succeeded when community members are successfully converting legacy graphs and publishing new graphs using the new service. We will determine the exact data visualization library to use and the graph types to support during the initial phase of the project. |
Christopher Ciufo |
WE3.4 | Develop the capability model to improve website performance through smaller-scale cache site deployments that take one month to implement while maintaining technical capabilities, security, and privacy. |
The Traffic team is responsible for maintaining the Content Delivery Network (CDN). This layer caches frequently accessed content, pages, etc., in memory and on disk. This reduces the time it takes to process requests for users. The second benefit is storing content physically closer to the user. This reduces the time it takes for data to reach the user (latency). Last year, we enabled a site in Brazil intended to reduce latency in the South America region. Building out new data centers would be great but is expensive, time-consuming, and requires a lot of work to get running - for example, last year's work spanned a full year. We would like to have centers in Africa and Southeast Asia, and ideally all around the world. Our hypothesis is to spin up smaller sites in other places around the world where traffic is lower. These would require fewer servers, no more than four or five. This reduces our cost. It will still help us reduce latency for users in these regions, while remaining lightweight in terms of the time and effort needed to maintain them. |
Kwaku Ofori |
WE3.5 | By the end of Q3 2024-25, interested volunteers on any Wikipedia can create charts, and the team has successfully handed over stewardship to the Reader Experience team. |
The Chart extension is live in production and enabled for a select list of pilot wikis (itwiki, svwiki, hewiki). The goal of the pilot is to identify early bugs and usability issues before we scale the work to more wikis. The project's mandate includes providing a successor to the Graph extension on all wikis, and there is more work to do to enable that. The project is also temporary, meaning that maintenance and any future feature development need to be handed over when the project ends. |
Chris Ciufo |
WE4.1 | Provide proposals for 3 countermeasures to harmful behaviours that are informed by data and in line with the evolving regulatory environment, by the end of Q4. | Ensuring user safety and well-being is a fundamental responsibility of online platforms. Many jurisdictions have laws and regulations that require online platforms to take action against harassment, cyberbullying, and other harmful content. Failing to address these can expose platforms to legal liability and regulatory sanctions.
Currently we do not have a good idea of the magnitude of these problems or the reasons behind them. We rely heavily on anecdotal evidence and manual processes, which leaves us exposed both to legal risk and to other far-reaching consequences: underestimating the problem, escalation of harm, reputational damage, and erosion of user trust. We need to build a strong culture of measuring the incidence of harassment and harmful content, and implement proactive measures to mitigate it. |
Madalina Ana |
WE4.2 | Develop at least two signals for use in anti-abuse workflows to improve the precision of actions on bad actors by the end of Q3. | The wikis rely heavily on IP blocking as a mechanism for blocking vandalism, spam and abuse. But IP addresses are increasingly less useful as stable identifiers of an individual actor, and blocking IP addresses has unintended negative effects on good faith users who happen to share the same IP address as bad actors. The combination of the decreasing stability of IP addresses and our heavy reliance on IP blocking results in less precision and effectiveness in targeting bad actors, in combination with increasing levels of collateral damage for good faith users. We want to see the opposite situation: decreased levels of collateral damage and increased precision in mitigations targeting bad actors. To better support the anti-abuse work of functionaries and to provide building blocks for reuse in existing (e.g. CheckUser, Special:Block) and new tools, in this KR we propose to explore ways to reliably associate an individual with their actions (sockpuppetting mitigation), and combine existing signals (e.g. IP addresses, account history, request attributes) to allow for more precise targeting of actions on bad actors. |
Kosta Harlan |
WE4.3 | Reduce the effectiveness of a large-scale distributed attack by 50% as measured by the time it takes us to adapt our measures and the traffic volume we can sustain in a simulation. | The evolution of the landscape of the internet, including the rise of large-scale botnets and more frequent attacks have made our traditional methods of limiting large-scale abuse obsolete. Such attacks can make our sites unavailable by flooding our infrastructure with requests, or overwhelm the ability of our community to combat large-scale vandalism. This also puts an unreasonable strain on our high privilege editors and our technical community. We urgently need to improve our ability to automatically detect, withstand, and mitigate or stop such attacks. In order to measure our improvements, we can't rely solely on frequency/intensity of actual attacks, as we would be dependent on external actions and it would be hard to get a clear quantitative picture of our progress. By setting up multiple simulated attacks of varying nature/complexity/duration to be run safely against our infrastructure, and running them every quarter, we will be able to both test our new countermeasures while not under attack, and to report objectively on our improvements. |
Giuseppe Lavagetto |
WE4.4 | Launch temp accounts to 100% of all wikis. | Temporary accounts are a solution for complying with various regulatory requirements around the exposure of IPs on our platform on various surfaces. This work involves updating many products, data pipelines, functionary tools, and various volunteer workflows to cope with the existence of an additional type of account. | Madalina Ana |
WE5.1 | By the end of Q3, complete at least 5 interventions that are intended to increase the sustainability of the platform. | The sustainability of the MediaWiki platform is an evergreen effort important for our ability to scale, increase or avoid degradation of developer satisfaction, and grow our technical community. This is hard to measure and depends on technical and social factors. However, we carry tacit knowledge about specific areas of improvement that are strategic for sustainability. The planned interventions may help increase the sustainability and maintainability of the platform or avoid its degradation. We plan to evaluate the impact of this work in Q4 with recommendations for sustainability goals moving forward. Examples of sustainability interventions are: simplifying complex code domains that are core to MediaWiki but that only a handful of people understand; increasing the usage of code analysis tooling to inform us about the quality of our codebase; streamlining processes like packaging and releases. | Mateus Santos |
WE5.2 | Identify by the end of Q2 and complete by the end of Q4 one or more interventions to evolve the MediaWiki ecosystem’s programming interfaces to empower decoupled, simpler and more sustainable feature development. | The main goal of KR 5.2 is to improve and clarify the interaction between MediaWiki's core platform and its extensions, skins, and other parts. Our intent is to provide functional improvements to MediaWiki’s architecture that enable practical modularity and maintainability, make it easier to develop extensions, and support the requirements of the wider MediaWiki product vision. This work also aims to inform what should exist (or not) within core, extensions, or the interfaces between them. The year will be divided into two phases: a 5-month research and experimentation phase that will inform the second phase where specific interventions are implemented. | Jonathan Tweed |
WE5.3 | By the end of Q2, complete one data gathering initiative and one performance improvement experiment to inform followup product and platform interventions to leverage capabilities unlocked by MediaWiki’s modeling of a page as a composition of structured fragments. | The primary goal here is to empower developers and product managers to leverage new MediaWiki platform capabilities to meet current and future needs of encyclopedic content by making possible new product offerings that are currently difficult to implement as well as improve performance and resiliency of the platform. Specifically, at a MediaWiki platform level, we want to shift the processing model of MediaWiki from treating a page as a monolithic unit to treating a page as a composition of structured content units. Parsoid-based read views, Wikidata integration, and Wikifunctions integration into wikis are all implicit moves towards that. As part of this KR, we want to more intentionally experiment with and gather data to inform future interventions based on these new capabilities to ensure we can achieve the intended infrastructure and product impacts. |
Subramanya Sastry |
WE5.4 | Execute the MediaWiki release with a new process that synchronizes with PHP upgrades by Q4. | The MediaWiki software platform relies on regular updates to the next PHP version to remain secure and sustainable. These updates are a pain point in our process and are important for the modernization of our infrastructure. At the same time, we regularly release new versions of the MediaWiki software. Projects such as translatewiki.net, the platform used to translate software messages for the Wikimedia projects, depend on these releases. Synchronizing the PHP upgrades with the release process ensures we do not fall behind on PHP versions. This will improve the maintenance and security of the MediaWiki platform and developers’ experience. |
Mateus Santos |
WE6.1 | Resolve 5 questions to enable efficiency and informed decisions on developer and engineering workflows and services and make relevant data accessible by the end of Q4. | “It’s complicated” is a frequent response to questions like “which repositories are deployed to Wikimedia production”. In this KR we will explore some of our “evergreens” in the field of engineering productivity and experience - recurring questions that seem easy but are hard to answer, questions that we can answer, but the data is not accessible and require custom queries by subject matter experts, or questions that are cumbersome to get a response on for process gap or other reasons. We will define what “resolve” means for each of the questions: For some this may just mean to make existing, accurate data accessible. Other questions will require more research and engineering time to address them. The overarching goal of this work is to reduce the time, workarounds and effort it takes to gain insights in key aspects of the developer experience and enable us to make improvements to engineering and developer workflows and services. | [TBD] |
WE6.2 | By the end of Q4, enhance an existing project and perform at least two experiments aimed at providing maintainable, targeted environments moving us towards safe, semi-continuous delivery. | Developers and users depend on the Wikimedia Beta Cluster (beta) to catch bugs before they affect users in production. Over time, the uses of beta have grown and come into conflict—the uses are too diverse to fit in a single environment. We will enhance one existing alternative environment and perform experiments aimed at replacing a single high-priority testing need currently fulfilled by beta with a maintainable alternative environment that better serves each use case's needs. | Tyler Cipriani |
WE6.3 | Develop a Toolforge sustainability scoring framework by Q3. Apply it to improve at least one critical platform aspect by Q4 and inform longer-term strategy. | Toolforge, the key platform for Wikimedia’s volunteer-built tools, plays a crucial role from editing to anti-vandalism. Our goal is to enhance Toolforge usability, lower the barriers to contribution, improve community practices, and promote adherence to established policies. To this effect, we will introduce a scoring system by the end of Q2 to evaluate the Toolforge platform sustainability, focusing on technical and social aspects. Using this system as a guide, we aim to improve one of the key technical factors by 50%. | Slavina Stefanova |
Signals & Data Services (SDS) Key Results
[ Objectives ] | |||
---|---|---|---|
Key Result shortname | Key Result text | Key Result context | Owner |
SDS1.1 | By the end of Q3, 2 programs or KR driven initiatives have evaluated the direct impact of their work on one or more core metrics. | Our core organizational metrics serve as key tools to assess the Foundation's progress toward its goals. As we allocate resources to programs and design key result (KR) oriented workstreams, these high-level metrics should guide how we link these investments to the Foundation's overarching goals as defined in the annual plan. The work in this key result acknowledges that the Foundation as a whole is at an early stage in its ability to quantitatively link the impacts of all planned interventions to high-level, or core metrics. In pursuit of that eventual goal, this KR aims to develop the process by which we share the logical and theoretical links between our initiatives and our high-level metrics. In practice, this means partnering with initiative owners throughout the Foundation to understand how the output of their work at a project level is linked to and impacts our core metrics at a Foundation level. Currently, the Foundation is in an early stage in its goal of being able to execute program or product-driven initiatives and attribute the impact of those activities on Core Foundation level metrics. In pursuit of this goal, this KR aims to do the following: identify at least two candidate program or product driven initiatives, design an evaluation strategy to assess core metric impacts, and execute this evaluation strategy. Starting with two initiatives will help us quickly understand the challenges of performing analyses that allow us to attribute the impact of our work to observable changes in our core metrics. Learnings from this KR will inform a broader strategy to apply this measurement strategy to a wider range and quantity of Foundation initiatives. |
Omari Sefu |
SDS1.2 | Answer 3 strategic open research questions by December 2024 in order to provide recommendations or inform FY26 annual planning. | There are many open research questions in the Wikimedia ecosystem, and answering some of those questions is strategic for WMF or the affiliates. The answers to these questions can inform future product or technology development or can support decision-making/advocacy in the policy space. While some of these questions can be answered by utilizing purely research or research engineering expertise, given the socio-technical nature of the WM projects arriving at trustworthy insights often requires cross-team collaboration for data collection, context building, user interaction, careful design of experiments, and more. Through this KR we aim to prioritize some of our resources towards answering one or more of such questions. The work in this KR includes prioritizing a list of strategic open questions, as well as doing experimental work to find an answer for X number (currently estimated 2) of them. The ideal type of questions we tackle in this KR are questions that once answered can have an unlocking effect by enabling multiple other teams or groups to do (better? informed) product, technology, or policy work. We intend the work in this KR to be complementary to the following KRs:
|
Leila Zia |
SDS1.3 | Achieve at least a 50% reduction in the average time required for data stakeholders to trace data flows for 3 core and essential metrics. | This work is required for Data Governance standards. Tracing back the transformation and source of datasets is difficult and requires knowledge of different repos and systems. We should make it easy to understand how data flows around our systems so that data stakeholders can work in a more self-service way. This work will support workflows where data is transformed and used for analytics, features, APIs and data quality jobs. There will be a follow-up KR around documenting metrics. |
Luke Bowmaker |
SDS2.1 | By the end of Q2, we can support 1 product team to evaluate a feature or product via basic A/B testing that reduces their time to user interaction data by 50%. | We think using shared tools will increase product teams' data-driven decision making, improve efficiency and productivity, and enhance product strategy and innovation. Establishing the UX and technical systems for logged-in users allows us to advance towards the long-term goal of supporting A/B tests on logged-out users while the feasibility work of SDS 2.3 is underway. We will take the adopting team's baseline time to user interaction data and improve it by 50%. We will also investigate how we can contextualize these gains in the fuller context of all product teams. We expect to learn how we can improve the experience and identify and prioritize capability enhancements based on feedback from the adopting team and results of SDS 2.2. |
Virginia Poundstone |
SDS2.2 | By the end of Q2, we will have 3 essential metrics for analyzing experiments (A/B tests) to support testing product/feature hypotheses related to FY24-25 KRs. | When a product manager (or designer) has a hypothesis that a product/feature will address a problem/need for the users or the organization, an experiment is how they test that hypothesis and learn about the potential impact of their idea on a metric. The results of the experiment inform the product manager and help them make a decision about what action to take next (abandon this idea and try a different hypothesis, continue development if the experiment was performed early in the development lifecycle, or release the product/feature to more users). Product managers must be able to make such a decision with confidence, supported by evidence they trust and understand. A major hurdle to this is that product teams currently formulate their hypotheses with custom project-specific metrics which require dedicated analyst support to define, measure, analyze, and report on them. Switching to a set of essential metrics for formulating all testable product/feature hypothesis statements would make it:
We think that a set of essential metrics which are widely understood and consistently used – and informed/influenced by industry standard metrics – would also improve organizational data literacy and promote a culture of review, experimentation, and learning. We are focusing on essential metrics that (1) are needed for best measurement and evaluation of success/impact of products/features related to 2 Wiki Experiences KRs – WE3.1 and WE1.2 – and (2) reflect or map to industry-standard metrics used in web analytics. |
Mikhail Popov |
SDS2.3 | Deploy a unique agent tracking mechanism to our CDN which enables the A/B testing of product features with anonymous readers. | Without such a tracking mechanism, it is not reasonable to implement A/B testing of product features with anonymous readers.
This is basically a milestone-based result to create a new technical capability that others can build measurable things on top of. The key priority use-case will be A/B testing of features with anonymous readers, but this work also enables other important future things, which may create follow-on hypotheses later in WE4.x (for request risk ratings and mitigating large-scale attacks) and for metrics/research about unique device counts as their resourcing and priorities allow. |
Brandon Black |
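The SDS2.x key results above do not say how an agent would be assigned to an experiment group. As a minimal sketch only, assuming the tracking mechanism exposes some stable, opaque agent identifier, deterministic hash-based bucketing is one common way to do it. The function name `assign_bucket`, the identifier format, and the bucket labels are hypothetical and are not part of the plan.

```python
import hashlib


def assign_bucket(agent_id: str, experiment: str,
                  buckets=("control", "treatment")) -> str:
    """Deterministically map an agent to a bucket for one experiment.

    Hypothetical helper: agent_id stands in for whatever opaque
    identifier a tracking mechanism might provide.
    """
    # Hash the experiment name together with the agent id so that the
    # same agent can land in different buckets in different experiments.
    digest = hashlib.sha256(f"{experiment}:{agent_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a bucket index.
    index = int.from_bytes(digest[:8], "big") % len(buckets)
    return buckets[index]


if __name__ == "__main__":
    # The same agent always gets the same bucket within an experiment.
    print(assign_bucket("agent-123", "example-experiment"))
```

Because the assignment depends only on the identifier and the experiment name, no per-agent state has to be stored to keep an agent in the same group across requests.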
Future Audiences (FA) Key Result
[ Objectives ] | |||
---|---|---|---|
Key Result shortname | Key Result text | Key Result context | Owner |
FA1.1 | As a result of Future Audiences experimental insights and recommendations, by the end of Q3 at least one objective or key result owned by a non-Future Audiences team is present in the draft for the following year's annual plan. | Since 2020, the Wikimedia Foundation has been tracking external trends that may impact our ability to serve future generations of knowledge-consumers and knowledge-contributors and remain a thriving free knowledge movement for generations to come. Future Audiences, a small R&D team, will:
|
Maryana Pinchuk |
Product and Engineering Support (PES) Key Results
[ Objectives ] | |||
---|---|---|---|
Key Result shortname | Key Result text | Key Result context | Owner |
PES1.1 | Culture of Review: Incrementally improve scores for P+T staff sentiment related to our delivery, alignment, direction, and team health in a quarterly survey. | A culture of review is a product development culture based on shorter cycles of iteration, learning, and adaptation. This means that our organization may set yearly goals, but what we do to achieve these goals will change and adapt over the course of the year as we learn. There are two components to building a culture of review: processes and behaviors. This KR focuses on the latter. Behavior changes can grow and strengthen our culture of review. This involves changes in individual habits and routines as we move towards more iterative product development. This KR will be based on self-reported changes in individual behaviors, and measuring resulting changes, if any, in staff sentiment. | Amy Tsay |
PES1.2 | By the end of Q2, the new Wishlist better connects movement ideas and requests to Foundation P+T activities: items from the Wishlist backlog are addressed via a 2024-5 KR, the Foundation has completed 10 smaller Wishes, and the Foundation has partnered with volunteers to identify 3+ areas of opportunity for the 2025-26 FY. | The Community Wishlist represents a narrow slice of the movement; approximately 1k people participate, most of whom are contributors or admins. People often bypass the Wishlist by writing feature requests and bug reports via Phabricator, where it’s hard to discern requests from WMF or the community. For participants, the Wishlist is a costly time investment with minimal payoff. They still engage with the Wishlist because they feel it is the only vehicle to call attention to impactful bugs and feature improvements, or signal a need for broader, strategic opportunities. Wishes are often written as solutions, vs problems. The solutions may seem sensible on paper, but don’t necessarily consider the technical complexity or movement strategy implications. The scope and breadth of wishes sometimes exceeds the scope and capacity of Community Tech or a single team, perpetuating the frustration, leading to RFCs and calls to dismantle the Wishlist. Whereas community members prefer to use the Wishlist for project ideas, teams at the Foundation look at the Wishlist and other intake processes for prioritization, in part because wishes are ill-timed for Annual Planning and are hard to incorporate into roadmaps / OKRs. The Future Wishlist should be a bridge between the community and the Foundation, where communities provide input in a structured way, so that we are able to take action and in turn make volunteers happy. We’re creating a new intake process for any logged in volunteer to submit a wish, 365 days a year. Wishes can report or highlight a bug, request an improvement, or ideate on a new feature. 
Anyone can comment on, workshop, or support a Wish to influence prioritization. The Foundation won’t categorize wishes as “too big” or “too small.” Wishes that thematically map to a larger problem area can influence annual planning and team roadmaps, offering strategic directions and opportunities. Wishes will be visible to the Movement in a dashboard that categorizes wishes by project, product/problem area, and wish type. The Foundation will respond to wishes in a timely manner, and partner with the Community to categorize and prioritize wishes. We will partner with Wikimedians to identify and prioritize three areas of improvement, incorporated in the Foundation’s 2025-26 Annual Plan, which should improve the adoption rate and fulfillment of impactful wishes. We will flag well-scoped wishes for the volunteer developer community and Foundation teams, leading to more team and developer engagement and more wishes fulfilled, leading to community satisfaction. Addressing more wishes improves contributor happiness, efficacy, and retention, which should generate more quality edits, higher quality content, and more readers. |
Jack Wheeler |
PES1.3 | Run and conclude two experiments from existing exploratory products/features that provide us with data/insights into how we grow Wikipedia as a knowledge destination for our current consumer and volunteer audiences in Q1 and Q2. Complete and share learnings and recommendations for potential adoption for future OKR work in the Wiki Experiences bucket by the end of Q3. | This work is a counterpart to the Future Audiences objective, but focuses instead on uncovering opportunities to increase and deepen engagement of our existing audiences (of Wikipedia consumers and contributors) by testing more on-platform product ideas more nimbly. It lives in PES1 as it is an energiser and multiplier, channelling the time individuals and teams have already devoted to hacking/experimenting on side projects to bring more promising features into focus. Instead of these side projects languishing (not a good use of our limited resources), this KR provides a path for some of these ideas to potentially make it into larger APP setting through proven experiments, thus using staff time more efficiently and motivating their creativity and productivity. By shepherding more of these smaller, shorter projects into play, we also diversify our spread of ‘bets’ for more learnings and trials of ideas that may transform Wikipedia in line with the changing needs and expectations of our current audiences. This will make our work more impactful and faster as it helps the Foundation to align on the correct goal in less time. |
Rita Ho |
PES1.4 | Learn how to: set, monitor, and make decisions on SLOs. Pick at least one new thing to define SLOs for as we release it. Collaborate with the respective team(s) (typically: product, development teams, SRE) to define that SLO. Reflect and document guidelines for what releases should have SLOs in the future and how to set them. | FUTURE KR: Set up process and rudimentary tools for setting and monitoring SLOs for new releases. Report on a quarterly basis, and use it to make decisions on when to (and not to) prioritize work to fix something. Share report with the community. WHY: We don’t know when we need to prioritize work to fix something. And we have a lot of code. As this footprint continues to grow, there are more situations where we may need to decide between addressing issues and focusing on innovation, and more uncertainty around when we should. It is also not clear to staff and community what our level of support/commitment on reliability and performance is for all the different features and functionality they interact with. If we define an expected level of service, we can know when we should allocate resources to it or not. |
Mark Bergsma |
PES1.5 | Define ownership and commitments (including SLOs) on services and learn how to track, report and make decisions as a standard and scalable practice by trialing it in 3 teams across senior leaders in the department. | After collaboratively defining an SLO for the EditCheck feature as part of PES1.4, we will now trial and learn from using the SLO in practice to help prioritisation of reliability work. We will also document roles and responsibilities for ownership of code/services, allowing us to make clear shared commitments on the level of ongoing support. We will try to use these as practices in 3 teams across the department. | Mark Bergsma |
Hypotheses
The hypotheses below are the specific things we are doing each quarter to address the associated key results above.
Each hypothesis is an experiment or stage in an experiment we believe will help achieve the key result. Teams make a hypothesis, test it, then iterate on their findings or develop an entirely different new hypothesis. You can think of the hypotheses as bets of the teams’ time–teams make a small bet of a few weeks or a big bet of several months, but the risk-adjusted reward should be commensurate with the time the team puts in. Our hypotheses are meant to be agile and adapt quickly. We may retire, adjust, or start a hypothesis at any point in the quarter.
To see the most up-to-date status of a hypothesis and/or to discuss a hypothesis with the team please click the link to its project page below.
Q1
The first quarter (Q1) of the WMF annual plan covers July-September.
Wiki Experiences (WE) Hypotheses
[ WE Key Results ] | ||
---|---|---|
Hypothesis shortname | Q1 text | Details & Discussion |
WE1.1.1 | If we expand the Event List to become a Community List that includes WikiProjects, then we will be able to gather some early learnings in how to engage with WikiProjects for product development. | |
WE1.1.2 | If we identify at least 15 WikiProjects in 3 separate Wikipedias to be featured in the Community List, then we will be able to advise Campaigns Product in the key characteristics needed to build an MVP of the Community List that includes WikiProjects. | |
WE1.1.3 | If we consult 20 event organizers and 20 WikiProject organizers on the best use of topics available via LiftWing, then we can prioritize revisions to the topic model that will improve topical connections between events and WikiProjects. | |
WE1.2.1 | If we build a first version of the Edit Check API, and use it to introduce a new Check, we can evaluate the speed and ease with which other teams and volunteers could use the API to create new Checks and Suggested Edits. | |
WE1.2.2 | If we build a library of UI components and visual artefacts, Edit Check’s user experience can extend to accommodate Structured Tasks patterns. | |
WE1.2.3 | If we conduct user tests on two or more design prototypes introducing structured tasks to newcomers within/proximate to the Visual Editor, then we can quickly learn which designs will work best for new editors, while also enabling engineers to assess technical feasibility and estimate effort for each approach. | mw:Growth/Constructive activation experimentation |
WE1.2.4 | If we train an LLM on detecting "peacock" behavior, then we can learn if it can detect this policy violation with >70% precision and >50% recall and ultimately, decide if said LLM is effective enough to power a new Edit Check and/or Suggested Edit. | |
WE1.2.5 | If we conduct an A/B/C test with the alt-text suggested edits prototype in the production version of the iOS app we can learn if adding alt-text to images is a task newcomers are successful with and ultimately, decide if it's impactful enough to implement as a suggested edit on the Web and/or in the Apps. | mw:Wikimedia Apps/iOS Suggested edits project/Alt Text Experiment |
WE1.3.1 | If we enable additional customisation of Automoderator's behaviour and make changes based on pilot project feedback in Q1, more moderators will be satisfied with its feature set and reliability, and will opt to use it on their Wikimedia project, thereby increasing adoption of the product. | mw:Automoderator |
WE1.3.2 | If we are able to interpret subsets of wishes as moderator-related focus areas and share these focus areas for community input in Q1-Q2, then we will have a high degree of confidence that our selected focus area will improve moderator satisfaction when it is released in Q3. | |
WE2.1.1 | If we build a country-level inference model for Wikipedia articles, we will be able to filter lists of articles to those about a specific region with >70% precision and >50% recall. | m:Research:Language-Agnostic Topic Classification/Countries |
WE2.1.2 | If we build a proof-of-concept providing translation suggestions that are based on user-selected topic areas, we will be set up to successfully test whether translators will find more opportunities to translate in their areas of interest and contribute more compared to the generic suggestions currently available. | mw:Translation suggestions: Topic-based & Community-defined lists |
WE2.1.3 | If we offer list-making as a service, we’ll enable at least 5 communities to make more targeted contributions in their topic areas as measured by (1) change in standard quality coverage of relevant topics on the relevant wiki and (2) a brief survey of organizer satisfaction with topic area coverage on-wiki. | |
WE2.1.4 | If we develop a proof of concept that adds translation tasks sourced from WikiProjects and other list-building initiatives, and present them as suggestions within the CX mobile workflow, then more editors will discover and translate articles focused on topical gaps. By introducing an option that allows editors to select translation suggestions based on topical lists, we will test whether this approach increases the content coverage in our projects. | mw:Translation suggestions: Topic-based & Community-defined lists |
WE2.2.1 | If we expand Wikimedia's State of Languages data by securing data sharing agreements with UNESCO and Ethnologue, at least one partner will decide to represent Wikimedia’s language inclusion progress in their own data products and communications. On top of being useful to our partner institutions, our expanded dataset will provide important contextual information for decision-making and provide communities with information needed to identify areas for intervention. | |
WE2.2.2 | If we map the language documentation activities that Wikimedians have conducted in the last 2 years, we will develop a data-informed baseline for community experiences in onboarding new languages. | |
WE2.2.3 | If we provide production wiki access to 5 new languages, with or without Incubator, we will learn whether access to a full-fledged wiki with modern features such as those available on English Wikipedia (including ContentTranslation and Wikidata support, advanced editing and search results) aids in faster editing. Ultimately, this will inform us if this approach can be a viable direction for language onboarding for new or existing languages, justifying further investigation. | mw:Future of Language Incubation |
WE2.3.1 | If we make two further improvements to the media upload flow on Commons and share them with the community, the feedback will be positive and it will help uploaders make fewer bad uploads (with the focus on copyright) as measured by the ratio of deletion requests within 30 days of upload. This will include defining designs for further UX improvements to the release rights step in the Upload Wizard on Commons and rolling out an MVP of logo detection in the upload flow. | |
WE2.4.1 | If we build a prototype of Wikifunctions calls embedded within MediaWiki content, we will be ready to use MediaWiki’s async content processing pipeline and test its performance feasibility in Q2. | phab:T261472 |
WE2.4.2 | If we create a design prototype of an initial Wikifunctions use case in a Wikipedia wiki, we will be ready to build and test our integration when performance feasibility is validated in Q2 (see hypothesis 1). | phab:T363391 |
WE2.4.3 | If we make it possible for Wikifunctions users to access Wikidata lexicographical data, they will begin to create natural language functions that generate sentence phrases, including those that can handle irregular forms. If we see an average monthly creation rate of 31 for these functions, after the feature becomes available, we will know that our experiment is successful. | phab:T282926 |
WE3.1.1 | Designing and qualitatively evaluating three proofs of concept focused on building curated, personalized, and community-driven browsing and learning experiences will allow us to estimate the potential for increased reader retention (experiment 1: providing recommended content in search and article contexts, experiment 2: summarizing and simplifying article content, experiment 3: making multitasking easier on wikis). | |
WE3.1.3 | If we develop models for remixing content such as a content simplification or summarization that can be hosted and served via our infrastructure (e.g. LiftWing), we will establish the technical direction for work focused on increasing reader retention through new content discovery features. | |
WE3.1.4 | If we analyze the projected performance impact of hypotheses WE3.1.1 and WE3.1.2 on the Search API, we can scope and address performance and scalability issues before they negatively affect our users. | |
WE3.1.5 | If we enhance the search field in the Android app to recommend personalized content based on a user's interest and display better results, we will learn if this improves user engagement by observing whether it increases the impression and click-through rate (CTR) of search results by 5% in the experimental group compared to the control group over a 30-day A/B test. This improvement could potentially lead to a 1% increase in the retention of logged out users. | phab:T370117 |
WE3.2.1 | If we create a clickable design prototype that demonstrates the concept of a badge representing donors championing article(s) of interest, we can learn if there would be community acceptance for a production version of this method for fundraising in the Apps. | Fundraising Experiment in the iOS App |
WE3.2.2 | Increasing the prominence of entry points to donations on the logged-out experiences of the web mobile and desktop experience will increase the clickthrough rate of the donate link by 30% Year over Year | phab:T368765 |
WE3.2.3 | If we make the “Donate” button in the iOS App more prominent by making it one click or less away from the main navigation screen, we will learn if discoverability was a barrier to non-banner donations. | |
WE3.3.1 | If we select a data visualization library and get an initial version of a new server-rendered graph service available by the end of July, we can learn from volunteers at Wikimania whether we’re working towards a solution that they would use to replace legacy graphs. | |
WE4.1.1 | If we implement a way in which users can report potential instances of harassment and harmful content present in discussions through an incident reporting system, we will be able to gather data around the number and type of incidents being reported and therefore have a better understanding of the landscape and the actions we need to take. | |
WE4.2.1 | If we explore and define Wikimedia-specific methods for a unique device identification model, we will be able to define the collection and storage mechanisms that we can later implement in our anti-abuse workflows to enable more targeted blocking of bad actors. | phab:T368388 |
WE4.2.9 | If we provide contextual information about reputation associated with an IP that is about to be blocked, we will see fewer collateral damage IP and IP range blocks, because administrators will have more insight into potential collateral damage effects of a block. We can measure this by instrumenting Special:Block and observing how behavior changes when additional information is present, vs when it is not. | WE4.2.9 Talk page |
WE4.2.2 | If we define an algorithm for calculating a user account reputation score for use in anti-abuse workflows, we will prepare the groundwork for engineering efforts that use this score as an additional signal for administrators targeting bad actors on our platform. We will know the hypothesis is successful if the algorithm for calculating a score maps with X% precision to categories of existing accounts, e.g. a "low" score should apply to X% of permanently blocked accounts | WE4.2.2 Talk page |
WE4.2.3 | If we build an evaluation framework using publicly available technologies similar to the ones used in previous attacks we will learn more about the efficacy of our current CAPTCHA at blocking attacks and could recommend a CAPTCHA replacement that brings a measurable improvement in terms of the attack rate achievable for a given time and financial cost. | |
WE4.3.1 | If we apply some machine learning and data analysis tools to webrequest logs during known attacks, we'll be able to identify, with more than 80% precision, abusive IP addresses sending largely malicious traffic that we can then ratelimit at the edge, improving reliability for our users. | phab:T368389 |
WE4.3.2 | If we limit the load that known IP addresses of persistent attackers can place on our infrastructure, we'll reduce the number of impactful cachebusting attacks by 20%, improving reliability for our users. | |
WE4.3.3 | If we deploy a proof of concept of the 'Liberica' load balancer, we will measure a 33% improvement in our capacity to handle TCP SYN floods. | |
WE4.3.4 | If we make usability improvements and also perform some training exercises on our 'requestctl' tool, then SREs will report higher confidence in using the tool. | phab:T369480 |
WE4.4.1 | If we run at least 2 deployment cycles of Temp Accounts we will be able to verify this works successfully. | |
WE5.1.1 | If we successfully roll out Parsoid Read Views to all Wikivoyages by Q1, this will boost our confidence in extending Parsoid Read Views to all Wikipedias. We will measure the success of this rollout through detailed evaluations using the Confidence Framework reports, with a particular focus on Visual Diff reports and the metrics related to performance and usability. Additionally, we will assess the reduction in the list of potential blockers, ensuring that critical issues are addressed prior to wider deployment. | |
WE5.1.2 | If we disable unused Graphite metrics, target migrating metrics using the db-prefixed data factory and increase our outreach efforts to other teams and the community in Q1, then we would be on track to achieve our goal of making Graphite read-only by Q3 FY24/25, by observing an increase of 30% in migration progress. | |
WE5.1.3 | If we implement a canonical url structure with versioning for our REST API then we can enable service migration and testing for Parsoid endpoints and similar services by Q1. | phab:T344944 |
WE5.1.4 | If we complete the remaining work to mitigate the impact of browsers' anti-tracking measures on CentralAuth autologin and move to a more resilient authentication infrastructure (SUL3), we will be ready to roll out to production wikis in Q2. | |
WE5.1.5 | If we increase the coverage of Sonar Cloud to include key MediaWiki Core repos, we will be able to improve the maintainability of the MediaWiki codebase. This hypothesis will be measured by splitting the selected repos into test and control groups. These groups will then be compared over the course of a quarter to measure the impact of commit-level feedback to developers. | |
WE5.2.1 | If we make a classification of the types of hooks and extension registry properties used to influence the behavior of MediaWiki core, we will be able to focus further research and interventions on the most impactful. | Simplify feature development |
WE5.2.2 | If we explore a new architecture for notifications in MW core and Echo, we will discover new ways to provide modularity and new ways for extensions to interact with core. | Simplify feature development |
WE5.3.1 | If we instrument parser and cache code to collect template structure and fine-grained timing data, we can quantify the expected performance improvement which could be realized by future evolution of the wikitext parsing platform. | T371713 |
WE5.3.2 | On template edits, if we can implement an algorithm in Parsoid to reuse HTML of a page that depends on the edited template without processing the page from scratch and demonstrate 1.5x or higher processing speedup, we will have a potential incremental parsing solution for efficient page updates on template edits. | T363421 |
WE5.4.1 | If the MediaWiki engineering group is successful with release process accountability and enhances its communication process by the end of Q2 in alignment with the product strategy, we will eliminate the current process that relies on unplanned or volunteer work and improve community satisfaction with the release process. This will be measured by community feedback on the 1.43 LTS release, coupled with a significant reduction in unplanned staff and volunteer hours needed for release processes. | |
WE5.4.2 | If we research and build a process to more regularly upgrade PHP in conjunction with our MediaWiki release process we will increase speed and security while reducing the complexity and runtime of our CI systems, by observing the success of PHP 8.1 upgrade before 1.43 release. | |
WE6.1.1 | If we design and complete the initial implementation of an authorization framework, we’ll establish a system to effectively manage the approval of all LDAP access requests. | |
WE6.1.2 | If we research available documentation metrics, we can establish metrics that measure the health of Wikimedia technical documentation, using MediaWiki Core documentation as a test case. | mw:Wikimedia Technical Documentation Team/Doc metrics |
WE6.1.3 | If we collect insights on how different teams are making technical decisions we are able to gather good practices and insights that can enable and scale similar practices across the organization. | |
WE6.2.1 | If we publish a versioned build of MediaWiki, extensions, skins, and Wikimedia configuration at least once per day we will uncover new constraints and establish a baseline of wallclock time needed to perform a build. | mw:Wikimedia Release Engineering Team/Group -1 |
WE6.2.2 | If we replace the backend infrastructure of our existing shared MediaWiki development and testing environments (from apache virtual servers to kubernetes), it will enable us to extend its uses by enabling MediaWiki services in addition to the existing ability to develop MediaWiki core, extensions, and skins in an isolated environment. We will develop one environment that includes MediaWiki, one or more Extensions, and one or more Services. | wikitech:Catalyst |
WE6.2.3 | If we create a new deployment UI that provides more information to the deployer and reduce the amount of privilege needed to do deployment, it will make deployment easier and open deployments to more users as measured by the number of unique deployers and number of patches backported as a percentage of our overall deployments. | Wikimedia Release Engineering Team/SpiderPig |
WE6.2.4 | If we migrate votewiki, wikitech and commons to MediaWiki on Kubernetes, we will reap the benefits of consistency and no longer need to maintain 2 different infrastructure platforms in parallel, allowing us to reduce the amount of custom-written tooling and making deployments easier and less toilsome for deployers. This will be measured by a decrease in total deployment times and a reduction in deployment blockers. | T292707 |
WE6.2.5 | If we move MultiVersion routing out of MediaWiki, we'll be able to ship single version MediaWiki containers, largely cutting down the size of containers and allowing for faster deployments, as measured by the deployment tool. | SingleVersion MW: Routing options |
WE6.3.1 | By consulting toolforge maintainers about the least sustainable aspects of the platform, we will be able to gather a list of potential categories to measure. | |
WE6.3.2 | By creating a "standard" tool to measure the number of steps for a deployment we will be able to assess the maximal improvement in the deployment process. | |
WE6.3.3 | If we conduct usability tests, user interviews, and competitive analysis to explore the existing workflows and use cases of Toolforge, we can identify key areas for improvement. This research will enable us to prioritize enhancements that have the most significant impact on user satisfaction and efficiency, laying the groundwork for a future design of the user interface. |
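Several hypotheses above (for example WE4.3.1 and WE4.2.2) state their success criterion as a precision figure. As a minimal sketch of what that measurement involves, the snippet below computes precision from labeled outcomes. The data, the function name, and the `(flagged, actually_abusive)` pair layout are illustrative assumptions, not part of any Wikimedia tooling.

```python
from typing import Iterable, Tuple


def precision(outcomes: Iterable[Tuple[bool, bool]]) -> float:
    """Return TP / (TP + FP) for (flagged, actually_abusive) pairs.

    Hypothetical helper: 'flagged' is what a model predicted,
    'actually_abusive' is the ground-truth label from a known attack.
    """
    true_pos = 0
    false_pos = 0
    for flagged, abusive in outcomes:
        if flagged and abusive:
            true_pos += 1
        elif flagged and not abusive:
            false_pos += 1
    if true_pos + false_pos == 0:
        raise ValueError("nothing was flagged; precision is undefined")
    return true_pos / (true_pos + false_pos)


# Made-up sample: 5 IPs flagged, 4 of them truly abusive,
# plus unflagged traffic that does not affect precision.
sample = (
    [(True, True)] * 4
    + [(True, False)]
    + [(False, False)] * 10
    + [(False, True)]
)

result = precision(sample)
print(f"precision = {result:.2f}")
print("meets 80% target" if result >= 0.80 else "below 80% target")
```

Precision only counts flagged items, so missed attackers (the last entry in the sample) do not lower it; a recall figure would be needed to capture those.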
Signals & Data Services (SDS) Hypotheses
[ SDS Key Results ] | ||
---|---|---|
Hypothesis shortname | Q1 text | Details & Discussion |
SDS1.1.1 | If we partner with an initiative owner and evaluate the impact of their work on Core Foundation metrics, we can identify and socialize a repeatable mechanism by which teams at the Foundation can reliably impact Core Foundation metrics. | |
SDS1.2.2 | If we study the recruitment, retention, and attrition patterns among long-tenure community members in official moderation and administration roles, and understand the factors affecting these phenomena (the ‘why’ behind the trends), we will better understand the extent, nature, and variability of the phenomenon across projects. This will in turn enable us to identify opportunities for better interventions and support aimed at producing a robust multi-generational framework for editors. | phab:T368791 |
SDS1.2.1 | If we gather use cases from product and feature engineering managers around the use of AI in Wikimedia services for readers and contributors, we can determine if we should test and evaluate existing AI models for integration into product features, and if yes, generate a list of candidate models to test. | phab:T369281 |
SDS1.3.1 | If we define the process to transfer all data sets and pipeline configurations from the Data Platform to DataHub we can build tooling to get lineage documentation automatically. | |
SDS1.3.2 | If we implement a well-documented and well-understood process to produce an intermediary table representing MediaWiki Wikitext History, populated using the event platform, and monitor the reliability and quality of the data, we will learn what additional parts of the process are needed to make this table production ready and widely supported by the Data Platform Engineering team. | |
SDS2.1.2 | If we investigate the current Data Products SDLC, we will be able to determine inflection points where QTE knowledge can be applied in order to have a positive impact on Product Delivery. | |
SDS2.1.3 | If the Growth team learns about the Metrics Platform by instrumenting a Homepage Module on the Metrics Platform, then we will be prepared to outline a measurement plan in Q1 and complete an A/B test on the new Metrics platform by the end of Q2. | |
SDS2.1.4 | If we conduct usability testing on our prototype among pilot users of our experimentation process, we can identify and prioritize the primary pain points faced by product managers and other stakeholders in setting up and analyzing experiments independently. This understanding will lead to the refinement of our tools, enhancing their efficiency and impact. | |
SDS2.1.5 | If we design a documentation system that guides the experience of users building instrumentation using the Metrics Platform, we will enable those users to independently create instrumentation without direct support from Data Products teams, except in edge cases. | phab:T329506 |
SDS2.2.1 | If we define a metric for logged-out mobile app reader retention, which is applicable for analyzing experiments (A/B test), we can provide guidance for planning instrumentation to measure retention rate of logged out readers in the mobile apps and enable the engineering team to develop an experiment strategy targeting logged out readers. | |
SDS2.2.2 | If we define a standard approach for measuring and analyzing conversion rates, it will help us establish a collection of well-defined metrics to be used for experimentation and baselines, and start enabling comparisons between experiments/projects to increase learning from these. | |
SDS2.2.3 | If we define a standard way of measuring and analyzing clickthrough rate (CTR) in our products/features, it will help us design experiments that target CTR for improvement, standardize click-tracking instrumentation, and enable us to make CTR available as a target metric to users of the experimentation platform. | |
SDS2.3.1 | If we conduct a legal review of proposed unique cookies for logged out users, we can determine whether there are any privacy policy or other legal issues which inform the community conversation and/or affect the technical implementation itself. |
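Hypotheses SDS2.2.2 and SDS2.2.3 call for a standard way to measure rates such as clickthrough rate (CTR) and compare them between experiment groups. The sketch below shows one common approach, under stated assumptions: CTR as clicks divided by impressions, relative lift of treatment over control, and a pooled two-proportion z-statistic. The counts and function names are hypothetical, and this is not the Foundation's experimentation platform code.

```python
import math


def ctr(clicks: int, impressions: int) -> float:
    """Clickthrough rate as clicks / impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(control_ctr: float, treatment_ctr: float) -> float:
    """Relative change of treatment over control (0.05 means +5%)."""
    if control_ctr <= 0:
        raise ValueError("control CTR must be positive")
    return (treatment_ctr - control_ctr) / control_ctr


def two_proportion_z(c_clicks: int, c_imps: int,
                     t_clicks: int, t_imps: int) -> float:
    """Pooled two-proportion z-statistic for treatment vs control CTR."""
    p_c = ctr(c_clicks, c_imps)
    p_t = ctr(t_clicks, t_imps)
    pooled = (c_clicks + t_clicks) / (c_imps + t_imps)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / c_imps + 1 / t_imps))
    return (p_t - p_c) / std_err


# Made-up counts for a 30-day A/B test.
control = (1000, 20000)    # (clicks, impressions)
treatment = (1100, 20000)

lift = relative_lift(ctr(*control), ctr(*treatment))
z = two_proportion_z(*control, *treatment)
print(f"relative lift: {lift:.1%}, z-statistic: {z:.2f}")
```

A z-statistic above roughly 1.96 corresponds to significance at the 5% level in a two-sided test; whether that threshold, or a different method entirely, suits a given experiment is a design choice for the analysts.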
Future Audiences (FA) Hypotheses
[ FA Key Results ] | ||
---|---|---|
Hypothesis shortname | Q1 text | Details & Discussion |
FA1.1.1 | If we make off-site contribution very low effort with an AI-powered “Add a Fact” experiment, we can learn whether off-platform users could help grow/sustain the knowledge store in a possible future where Wikipedia content is mainly consumed off-platform. | m:Future Audiences/Experiment:Add a Fact |
Product and Engineering Support (PES) Hypotheses
[ PES Key Results ] | ||
---|---|---|
Hypothesis shortname | Q1 text | Details & Discussion |
PES1.1.1 | If the P&T leadership team syncs regularly on how they’re guiding their teams towards a more iterative software development culture, and we collect baseline measurements of current development practices and staff sentiment on how we work together to ship products, we will discover opportunity areas for change management. The themes that emerge will enable us to build targeted guidance or programs for our teams in coming quarters. | |
PES1.2.2 | If the Moderator Tools team researches the Community Wishlist and develops 2+ focus areas in Q1, then we can solicit feedback from the Community and identify a problem that the Community and WMF are excited about tackling. | |
PES1.2.3 | If we bundle 3-5 wishes that relate to selecting and inserting templates, and ship an improved feature in Q1, then CommTech can take the learnings to develop a Case Study for the foundation to incorporate more "focus areas" in the 2025-26 annual plan. | |
PES1.3.1 | If we provide insights to audiences about their community and their use of Wikipedia over a year, it will stimulate greater connection with Wikipedia – encouraging greater engagement in the form of social sharing, time spent interacting on Wikipedia, or donation. Success will be measured by completing an experimental project that provides at least one recommendation about “Wikipedia insights” as an opportunity to increase onwiki engagement. | Wikipedia user insights |
PES1.3.2 | If we create a Wikipedia-based game for daily use that highlights the connections across vast areas of knowledge, it will encourage consumers to visit Wikipedia regularly and facilitate active learning, leading to longer increased interaction with content on Wikipedia. Success will be measured by completing an experimental project that provides at least one recommendation about gamification of learning as an opportunity to increase onwiki engagement. | Wikipedia games |
PES1.3.3 | If we develop a new process/track at a Wikimedia hack event to incubate future experiments, it will increase the impact and value of such events in becoming a pipeline for future annual plan projects, whilst fostering greater connection between volunteers and engineering/design staff to become more involved with strategic initiatives. Success will be measured by at least one PES1.3 project being initiated and/or advanced to an OKR from a foundation-supported event. | Incubator space |
PES1.4.1 | If we draft an SLO with the Editing team releasing Edit Check functionality, we will begin to learn and understand how to define and track user-facing SLOs together, and iterate on the process in the future. | |
PES1.4.2 | If we define and publish SLAs for putting OOUI into “maintenance mode”, growth of new code using OOUI across Wikimedia projects will stay within X% in Q1. | |
PES1.4.3 | If we map ownership using the proposed service catalog for known owned services in Q1, we will be able to identify significant gaps in the service catalog, which will help us establish an SLO culture by the end of the year. |
Q2
The second quarter (Q2) of the WMF annual plan covers October-December.
Wiki Experiences (WE) Hypotheses
[ WE Key Results ] | ||
---|---|---|
Hypothesis shortname | Q2 text | Details & Discussion |
WE1.1.1 | If we expand the Event list to become a Community List that includes WikiProjects, then we will be able to gather some early learnings in how to engage with WikiProjects for product development. | Campaigns/Foundation Product Team/Event list |
WE1.1.2 | If we launch at least 1 consultation focused on on-wiki collaborations, and if we collect feedback from at least 20 people involved in such collaborations, then we will be able to advise Campaigns Product on the key characteristics needed to develop a new or improved way of connecting. | Campaigns/WikiProjects |
WE1.1.3 | If we consult 20 event organizers and 20 WikiProject organizers on the best use of topics available via LiftWing, then we can prioritize revisions to the topic model that will improve topical connections between events and WikiProjects. | |
WE1.1.4 | If we integrate CampaignEvents into Community Configuration in Q2, then we will set the stage for at least 5 more wikis opting to enable extension features in Q3, thereby increasing tool usage. | |
WE1.2.2 | If we build a library of UI components and visual artifacts, Edit Check’s user experience can extend to accommodate Structured Tasks patterns. | |
WE1.2.5 | If we conduct an A/B/C test with the alt-text suggested edits prototype in the production version of the iOS app we can learn if adding alt-text to images is a task newcomers are successful with and ultimately, decide if it's impactful enough to implement as a suggested edit on the Web and/or in the Apps. | |
WE1.2.6 | If we introduce new account holders to the “Add a Link” Structured Task in Wikipedia articles, we expect to increase the percentage of new account holders who constructively activate on mobile by 10% compared to the baseline. | |
WE1.3.1 | If we enable additional customisation of Automoderator's behaviour and make changes based on pilot project feedback in Q1, more moderators will be satisfied with its feature set and reliability, and will opt to use it on their Wikimedia project, thereby increasing adoption of the product. | mw:Moderator Tools/Automoderator |
WE1.3.3 | If we improve the user experience and features of the Nuke extension during Q2, we will increase administrator satisfaction of the product by 5pp by the end of the quarter. | mw:Extension:Nuke/2024 Moderator Tools project |
WE2.1.3 | If we offer list-making as a service, we’ll enable at least 5 communities to make more targeted contributions in their topic areas as measured by (1) change in standard quality coverage of relevant topics on the relevant wiki and (2) a brief survey of organizer satisfaction with topic area coverage on-wiki. | |
WE2.1.4 | If we developed a proof of concept that adds translation tasks sourced from WikiProjects and other list-building initiatives, and present them as suggestions within the CX mobile workflow, then more editors would discover and translate articles focused on topical gaps. By introducing an option that allows editors to select translation suggestions based on topical lists, we would test whether this approach increases the content coverage in our projects. | |
WE2.1.5 | If we expose topic-based translation suggestions more broadly and analyze its initial impact, we will learn which aspects of the translation funnel to act on in order to obtain more quality translations. | |
WE2.2.4 | If we provide production wiki access to 5 new languages, with or without Incubator, we will learn whether access to a full-fledged wiki with modern features such as those available on English Wikipedia (including ContentTranslation and Wikidata support, advanced editing and search results) aids in faster editing. Ultimately, this will inform us if this approach can be a viable direction for language onboarding for new or existing languages, justifying further investigation. | |
WE2.2.5 | If we move addwiki.php to core and customize it to Wikimedia, we will improve code quality in our wiki creation system, making it testable and robust, and we will make things easier for creators of new wikis, thereby taking significant steps towards simplifying the wiki creation process. | phab:T352113 |
WE2.3.2 | If we make two further improvements to the media upload flow on Commons and share them with the community, the feedback will be positive and it will help uploaders make fewer bad uploads (with the focus on copyright) as measured by the ratio of deletion requests within 30 days of upload. This will include the release of further UX improvements to the release rights step in the Upload Wizard on Commons and automated detection of external sources. | |
WE2.3.3 | If the BHL-Wikimedia Working Group creates Commons categories and descriptive guidelines for the South American and/or African species depicted in publications, they will make 3,000 images more accessible to biodiversity communities. (BHL = Biodiversity Heritage Library) | |
WE2.4.1 | If we build a prototype of Wikifunctions calls embedded within MediaWiki content and test it locally for stability, we will be ready to use MediaWiki’s async content processing pipeline and test its performance feasibility in Q2. | phab:T261472 |
WE2.4.2 | If we create a design prototype of an initial Wikifunctions use case in a Wikipedia wiki, we will be ready to build and test our integration when performance feasibility is validated in Q2, as stated in Hypothesis 1. | phab:T363391 |
WE2.4.3 | If we make it possible for Wikifunctions users to access Wikidata lexicographical data, they will begin to create natural language functions that generate sentence phrases, including those that can handle irregular forms. If we see an average monthly creation rate of 31 for these functions, after the feature becomes available, we will know that our experiment is successful. | phab:T282926 |
WE3.1.3 | If we develop models for remixing content such as a content simplification or summarization that can be hosted and served via our infrastructure (e.g. LiftWing), we will establish the technical direction for work focused on increasing reader retention through new content discovery features. | Research |
WE3.1.6 | If we introduce a personalized rabbit hole feature in the Android app and recommend condensed versions of articles based on the types of topics and sections a user is interested in, we will learn if the feature is sticky enough to result in multi-day usage by 10% of users exposed to the experiment over a 30-day period, and a higher pageview rate than users not exposed to the feature. | Rabbit Holes |
WE3.1.7 | If we run a qualitative experiment focused on presenting article summaries to web readers, we will determine whether or not article summaries have the potential to increase reader retention, as proxied by clickthrough rate and usage patterns | |
WE3.1.8 | If we build one feature which provides additional article-level recommendations, we will see an increase in clickthrough rate of 10% over existing recommendation options and a significant increase in external referrals for users who actively interact with the new feature. | |
WE3.2.2 | Increasing the prominence of entry points to donations on the logged-out experiences of the Vector web mobile and desktop experience will increase the clickthrough rate of the donate link by 30% YoY. | mw:Readers/2024 Reader and Donor Experiences |
WE3.2.3 | If we make the “Donate” button in the iOS App more prominent by making it one click or less away from the main navigation screen, we will learn if discoverability was a barrier to non-banner donations. | Navigation Refresh |
WE3.2.4 | If we update the contributions page for logged-in users in the app to include an active badge for someone that is an app donor and display an inactive state with a prompt to donate for someone that decided not to donate in app, we will learn if this recognition is of value to current donors and encourages behavior of donating for prospective donors, informing if it is worth expanding on the concept of donor badges or abandoning it. | Private Donor Recognition Experiment |
WE3.2.5 | If we create a Wikipedia in Review experiment in the Wikipedia app, to allow users to see and share personalized data about their reading, editing, and donation habits, we will see 2% of viewers donate on iOS as a result of this feature, 5% click share, and 65% of users rating the feature neutral or satisfactory. | Personalized Wikipedia Year in Review |
WE3.2.7 | Increasing the prominence of entry points to donations on the logged-out experiences of the Minerva web mobile and desktop experience will increase the clickthrough rate of the donate link by 30% YoY. | |
WE3.3.2 | If we develop the Charts MVP and get it working end-to-end in production test wikis, at least two Wikipedias + Commons agree to pilot it before the code freeze in December. | |
WE3.4.1 | If we run an experiment to explore the feasibility of setting up smaller PoPs in cloud providers like Amazon, we can expand our data center map and reach more users around the world, at reduced cost and with faster turnaround time. | |
WE4.1.2 | If we deploy at least one iteration of the Incident Reporting System MVP on pilot wikis, we will be able to gather valuable data around the frequency and type of incidents being reported. | Incident Reporting System |
WE4.2.1 | If we explore and define Wikimedia-specific methods for a unique device identification model, we will be able to define the collection and storage mechanisms that we can later implement in our anti-abuse workflows to enable more targeted blocking of bad actors. | |
WE4.2.9 | If we provide contextual information about reputation associated with an IP that is about to be blocked, we will see fewer collateral damage IP and IP range blocks, because administrators will have more insight into potential collateral damage effects of a block. We can measure this by instrumenting Special:Block and observing how behavior changes when additional information is present, vs when it is not. | |
WE4.2.2 | If we define an algorithm for calculating a user account reputation score for use in anti-abuse workflows, we will prepare the groundwork for engineering efforts that use this score as an additional signal for administrators targeting bad actors on our platform. We will know the hypothesis is successful if the algorithm for calculating a score maps with X% precision to categories of existing accounts, e.g. a "low" score should apply to X% of permanently blocked accounts. | |
WE4.2.3 | If we build an evaluation framework using publicly available technologies similar to the ones used in previous attacks we will learn more about the efficacy of our current CAPTCHA at blocking attacks and could recommend a CAPTCHA replacement that brings a measurable improvement in terms of the attack rate achievable for a given time and financial cost. | |
WE4.3.1 | If we apply some machine learning and data analysis tools to webrequest logs during known attacks, we'll be able to identify, with more than 80% precision, abusive IP addresses sending largely malicious traffic that we can then ratelimit at the edge, improving reliability for our users. | |
WE4.3.3 | If we deploy a proof of concept of the 'Liberica' load balancer, we will measure a 33% improvement in our capacity to handle TCP SYN floods. | |
WE4.3.5 | By creating a system that spawns and controls thousands of virtual workers in a cloud environment, we will be able to simulate Distributed Denial of Service (DDoS) attacks and effectively measure the system's ability to withstand, mitigate, and respond to such attacks. | |
WE4.3.6 | If we integrate the output of the models we built in WE 4.3.1 with the dynamic thresholds of per-IP concurrency limits we've built for our TLS terminators in WE 4.3.2, we should be able to increase our ability to automatically neutralize attacks with 20% more volume, as measured with the simulation framework we're building. | |
WE4.3.7 | If we roll out a user-friendly web application that enables assisted editing and creation of requestctl rules, SREs will be able to mitigate cachebusting attacks in 50% less time than our established baseline. | |
WE4.4.2 | If we deploy Temporary Accounts to a set of small-to-medium sized projects, we will be able to verify that the functionality works as intended and will be able to gather data to inform necessary future work. | Trust and Safety Product/Temporary Accounts |
WE5.1.1 | If we successfully roll out Parsoid Read Views to all Wikivoyages by Q1, this will boost our confidence in extending Parsoid Read Views to all Wikipedias. We will measure the success of this rollout through detailed evaluations using the Confidence Framework reports, with a particular focus on Visual Diff reports and the metrics related to performance and usability. Additionally, we will assess the reduction in the list of potential blockers, ensuring that critical issues are addressed prior to wider deployment. | |
WE5.1.3 | If we reroute the endpoints currently exposed under rest_v1/page/html and rest_v1/page/title paths to comparable MW content endpoints, then we can unblock RESTbase sunsetting without disrupting clients in Q1. | |
WE5.1.4 | If we complete the remaining work to mitigate the impact of browsers' anti-tracking measures on CentralAuth autologin and move to a more resilient authentication infrastructure (SUL3), we will be ready to roll out to production wikis in Q2. | |
WE5.1.5 | If we increase the number of relevant SonarCloud rules enabled for key MediaWiki Core repositories and refine the quality of feedback provided to developers, we will optimize the developer experience and enable them to improve the maintainability of the MediaWiki codebase in the future. This will be measured by tracking developer satisfaction levels and whether test group developers feel the tool is becoming more useful and effective in their workflow. Feedback will be gathered through surveys and direct input from developers to evaluate the perceived impact on their confidence in the tool and the overall development experience. | |
WE5.1.7 | If we represent all content module endpoint responses (10 in total) in our MediaWiki REST API OpenAPI spec definitions, we will be able to implement programmatic validation to guarantee that our generated documentation matches the actual responses returned in code. | |
WE5.1.8 | If we introduce support for endpoint description translation (i.e., not including actual object definitions or payloads) into our generated MediaWiki REST API OpenAPI specs, we can lay the foundation to support Wikimedia’s expected internationalization standards. | |
WE5.2.3 | If we conduct an experiment to reimplement at least [1-3] existing Core and Extension features using a new Domain Event and Listener platform component pattern as an alternative to traditional hooks, we will be able to confirm our assumption of this intervention enabling simpler implementation with more consistent feature behavior. | |
WE5.3.3 | If we instrument both parsers to collect availability of prior parses and timing of template expansions, and to classify updates and dependencies, we can prioritize work on selective updates (Hypothesis 5.3.2) informed by the quantification of the expected performance benefits. | |
WE5.3.4 | If we can increase the capability of our prototype selective update implementation in Parsoid using the learnings from the 5.3.1 hypothesis, we can leverage more opportunities to increase the performance benefit from selective update. | |
WE5.4.1 | If the MediaWiki engineering group is successful with release process accountability and enhances its communication process by the end of Q2 in alignment with the product strategy, we will eliminate the current process that relies on unplanned or volunteer work and improve community satisfaction with the release process. This will be measured by community feedback on the 1.43 LTS release, coupled with a significant reduction in unplanned staff and volunteer hours needed for release processes. | |
WE5.4.2 | If we research and build a process to more regularly upgrade PHP in conjunction with our MediaWiki release process, we will increase speed and security while reducing the complexity and runtime of our CI systems. We will assess this by observing the success of the PHP 8.1 upgrade before the 1.43 release. | |
WE6.1.3 | If we collect insights on how different teams are making technical decisions, we will be able to gather good practices and insights that can enable and scale similar practices across the organization. | |
WE6.1.4 | If we research solutions for indexing the code of all projects hosted in WMF’s code repositories, we will be able to pick a solution that allows our users to quickly discover where the code is located whenever dealing with incident response or troubleshooting. | |
WE6.1.5 | If we test a subset of draft metrics on an experimental group of technical documentation collections, we will be able to make an informed decision about which metrics to implement for MediaWiki documentation. | Wikimedia Technical Documentation Team/Doc metrics |
WE6.2.1 | If we publish a versioned build of MediaWiki, extensions, skins, and Wikimedia configuration at least once per day we will uncover new constraints and establish a baseline of wallclock time needed to perform a build. | mw:Wikimedia Release Engineering Team/Group -1 |
WE6.2.2 | If we replace the backend infrastructure of our existing shared MediaWiki development and testing environments (from apache virtual servers to kubernetes), it will enable us to extend its uses by enabling MediaWiki services in addition to the existing ability to develop MediaWiki core, extensions, and skins in an isolated environment. We will develop one environment that includes MediaWiki, one or more Extensions, and one or more Services. | wikitech:Catalyst |
WE6.2.3 | If we create a new deployment UI that provides more information to the deployer and reduces the amount of privilege needed to do deployment, it will make deployment easier and open deployments to more users, as measured by the number of unique deployers and the number of patches backported as a percentage of our overall deployments. | mw:SpiderPig |
WE6.2.5 | If we move MultiVersion routing out of MediaWiki, we'll be able to ship single version MediaWiki containers, largely cutting down the size of containers and allowing for faster deployments, as measured by the deployment tool. | https://docs.google.com/document/d/1_AChNfiRFL3VdNzf6QFSCL9pM2gZbgLoMyAys9KKmKc/edit |
WE6.2.6 | If we gather feedback from QTE, SRE, and individuals with domain specific knowledge and use their feedback to write a design document for deploying and using the wmf/next OCI container, then we will reduce friction when we start deploying that container. | T379683 |
WE6.3.4 | If we enable the automatic deployment of a minimal tool, we will be able to evaluate the end to end flow and set the groundwork to adding support for more complex tools and deployment flows. | phab:T375199 |
WE6.3.5 | By assessing the relative importance of each sustainability category and its associated metrics, we can create a normalized scoring system. This system, when implemented and recorded, will provide a baseline for measuring and comparing Toolforge’s sustainability progress over time. | phab:T376896 |
WE6.3.6 | If we conduct discovery, such as target user interviews and competitive analysis, to identify existing Toolforge pain points and improvement opportunities, we will be able to recommend a prioritized list of features for the future Toolforge UI. | Phab:T375914 |
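Hypothesis WE6.3.5 above describes a normalized scoring system built from weighted sustainability categories. The following is a minimal sketch of one way such a score could be computed, assuming min-max normalization and a weighted average. The `Metric` class, the category names, the weights, and the value ranges are all hypothetical and are not taken from the Toolforge work itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    weight: float  # relative importance of the category (hypothetical value)
    value: float   # current raw measurement
    worst: float   # raw value that maps to a normalized score of 0
    best: float    # raw value that maps to a normalized score of 1


def normalize(value: float, worst: float, best: float) -> float:
    """Min-max scale a raw value to [0, 1], clamping out-of-range inputs.

    Works for "lower is better" metrics too: pass worst > best.
    """
    if best == worst:
        raise ValueError("best and worst must differ")
    scaled = (value - worst) / (best - worst)
    return max(0.0, min(1.0, scaled))


def weighted_score(metrics: list[Metric]) -> float:
    """Weighted average of the normalized metric scores, in [0, 1]."""
    total_weight = sum(m.weight for m in metrics)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        m.weight * normalize(m.value, m.worst, m.best) for m in metrics
    )
    return weighted_sum / total_weight


# Illustrative inputs only; real categories and weights would come from the assessment.
example = [
    Metric("test coverage (%)", weight=2.0, value=60.0, worst=0.0, best=100.0),
    Metric("open incidents", weight=1.0, value=5.0, worst=20.0, best=0.0),
]
baseline = weighted_score(example)
```

Recording `baseline` at regular intervals would give the kind of comparable series over time that the hypothesis asks for, provided the weights and ranges stay fixed between measurements.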
Signals & Data Services (SDS) Hypotheses
[ SDS Key Results ] | ||
---|---|---|
Hypothesis shortname | Q2 text | Details & Discussion |
SDS 1.1.1 | If we partner with an initiative owner and evaluate the impact of their work on Core Foundation metrics, we can identify and socialize a repeatable mechanism by which teams at the Foundation can reliably impact Core Foundation metrics. | |
SDS1.2.1.B | If we test the accuracy and infrastructure constraints of 4 existing AI language models for 2 or more high-priority product use-cases, we will be able to write a report recommending at least one AI model that we can use for further tuning towards strategic product investments. | Phab:T377159 |
SDS1.2.2 | If we study the recruitment, retention, and attrition patterns among long-tenure community members in official moderation and administration roles, and understand the factors affecting these phenomena (the ‘why’ behind the trends), we will better understand the extent, nature, and variability of the phenomenon across projects. This will in turn enable us to identify opportunities for better interventions and support aimed at producing a robust multi-generational framework for editors. | Learn more. |
SDS1.2.3 | If we combine existing knowledge about moderators with quantitative methods for detecting moderation activity, we can systematically define and identify Wikipedia moderators. | T376684 |
SDS1.3.1.B | If we integrate the Spark / DataHub connector for all production Spark jobs, we will get column-level lineage for all Spark-based data platform jobs in DataHub. | |
SDS1.3.2.B | If we implement a frequently run Spark-based MariaDB MW history data querying job, reconcile missing events and enrich them, we will provide a daily updated MW history wikitext content data lake table. | |
SDS2.1.1 | If we create an integration test environment for the proposed 3rd party experimentation solution, we can collaborate practically with Data SRE, SRE, QTE, and Product Analytics to evaluate the solution’s viability within WMF infrastructure in order to make a confident build/install/buy recommendation. | mw:Data Platform Engineering/Data Products/work focus |
SDS2.1.3 | If the Growth team learns about the Metrics Platform by instrumenting a Homepage Module on the Metrics Platform, then we will be prepared to outline a measurement plan in Q1 and complete an A/B test on the new Metrics platform by the end of Q2. | |
SDS2.1.4 | If we conduct usability testing on our prototype among pilot users of our experimentation process, we can identify and prioritize the primary pain points faced by product managers and other stakeholders in setting up and analyzing experiments independently. This understanding will lead to the refinement of our tools, enhancing their efficiency and impact. | |
SDS2.1.5 | If we design a documentation system that guides the experience of users building instrumentation using the Metrics Platform, we will enable those users to independently create instrumentation without direct support from Data Products teams, except in edge cases. | Task T329506 |
SDS2.1.7 | If we provide a function for user enrollment and a mechanism to capture and store CTR events to a monotable in a pre-declared event stream, we can ship MPIC Alpha in order to launch a basic split A/B test on logged-in users. | |
SDS2.2.2 | If we define a standard approach for measuring and analyzing conversion rates, it will help us establish a collection of well-defined metrics to be used for experimentation and baselines, and start enabling comparisons between experiments/projects to increase learning from these. | |
SDS2.3.1 | If we conduct a legal review of proposed unique cookies for logged out users, we can determine whether there are any privacy policy or other legal issues which inform the community conversation and/or affect the technical implementation itself. | |
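Hypothesis SDS2.1.7 above mentions a function for user enrollment and the capture of CTR events for a basic split A/B test. As a rough illustration only, the sketch below shows one common way to enroll users deterministically by hashing an identifier, plus a click-through-rate helper. The function names, the experiment name, and the group labels are assumptions made for this example and do not describe the Metrics Platform or MPIC implementation.

```python
import hashlib


def assign_group(user_id: str, experiment: str,
                 groups: tuple[str, ...] = ("control", "treatment")) -> str:
    """Deterministically map a user to a group for a given experiment.

    Hashing the experiment name together with the user id means the same
    user always lands in the same group, while different experiments get
    independent splits.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(groups)
    return groups[bucket]


def click_through_rate(impressions: int, clicks: int) -> float:
    """Clicks divided by impressions; defined as 0.0 when nothing was shown."""
    if impressions < 0 or clicks < 0:
        raise ValueError("counts must be non-negative")
    if impressions == 0:
        return 0.0
    return clicks / impressions


# Hypothetical usage: the identifiers and counts below are made up.
group = assign_group("user-12345", "homepage-module-test")
ctr = click_through_rate(impressions=200, clicks=30)
```

Because assignment depends only on the inputs, no enrollment table is needed to keep a user in the same group across sessions; whether that trade-off suits a given experiment is a design decision outside this sketch.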
Future Audiences (FA) Hypotheses
[ FA Key Results ] | ||
---|---|---|
Hypothesis shortname | Q2 text | Details & Discussion |
FA1.1.1 | If we make off-site contribution very low effort with an AI-powered “Add a Fact” experiment, we can learn whether off-platform users could help grow/sustain the knowledge store in a possible future where Wikipedia content is mainly consumed off-platform. | Experiment:Add a Fact |
Product and Engineering Support (PES) Hypotheses
[ PES Key Results ] | ||
---|---|---|
Hypothesis shortname | Q2 text | Details & Discussion |
PES1.2.4 | If we research the Task Prioritization focus area in the Community Wishlist in early Q2, we will be able to identify and prioritize work that will improve moderator satisfaction, which we can begin implementing in Q3. | |
PES1.2.5 | If we are able to publish and receive community feedback on 6+ focus areas in Q2, then we will have confidence in presenting at least 3 focus areas for incorporation in the 2025-26 annual plan. | |
PES1.2.6 | By introducing favouriting templates, we will improve the number of templates added via the template dialog by 10%. | |
PES1.3.4 | If we create an experience that provides insights to Wikipedia Audiences about their community over the year, it will stimulate greater connection with Wikipedia – encouraging engagement in the form of social sharing, time spent interacting on Wikipedia, or donation. | |
PES1.4.1 | If we draft an SLO with the Editing team releasing Edit Check functionality, we will begin to learn and understand how to define and track user-facing SLOs together, and iterate on the process in the future. | |
PES1.4.2 | If we define and publish SLAs for putting OOUI into “maintenance mode”, growth of new code using OOUI across Wikimedia projects will stay within X% in Q1. | |
PES1.4.3 | If we map ownership using the proposed service catalog for known owned services in Q1, we will be able to identify significant gaps in the service catalog, which will help in building the SLO culture by the end of the year. | |
PES1.5.1 | If we finalize and publish the Edit Check SLO draft, practice incorporating it in regular workflows and decisions, and draft a Citoid SLO, we’ll continue learning how to define and track user-facing and cross-team SLOs together. | |
PES1.5.2 | If we clarify and define in writing a set of roles and responsibilities for stakeholders throughout the service lifecycle, this will enable teams to make informed commitments in the Service Catalog, including SLOs. | |
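Several PES hypotheses above concern defining and tracking user-facing SLOs. As a minimal sketch of the arithmetic usually involved, the example below computes an error budget from an SLO target and observed event counts. The function name, the 99% target, and the counts are hypothetical and are not drawn from the Edit Check or Citoid SLO drafts.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarize how much of an SLO's error budget has been used.

    slo_target is the fraction of events that must be good, e.g. 0.99.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events < 0 or bad_events < 0 or bad_events > total_events:
        raise ValueError("event counts are inconsistent")
    # Number of bad events the target tolerates over this window.
    allowed_bad = (1.0 - slo_target) * total_events
    return {
        "allowed_bad": allowed_bad,
        "remaining": allowed_bad - bad_events,
        "slo_met": bad_events <= allowed_bad,
    }


# Hypothetical reporting window: 10,000 events, 40 of them bad, 99% target.
report = error_budget(slo_target=0.99, total_events=10_000, bad_events=40)
```

With these illustrative numbers the target tolerates about 100 bad events, so roughly 60 remain in the budget for the window.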
Q3
The third quarter (Q3) of the WMF annual plan covers January-March.
Wiki Experiences (WE) Hypotheses
[ WE Key Results ] | ||
---|---|---|
Hypothesis shortname | Q3 text | Details & Discussion |
WE1.1.3 | If we consult 20 event organizers and 20 WikiProject organizers on the best use of topics available via LiftWing, then we can prioritize revisions to the topic model that will improve topical connections between events and WikiProjects. | |
WE1.1.5 | If we implement at least 2 methods to discover the Collaboration List, then we will increase pageviews of the Collaboration List, thereby allowing more people to discover events and WikiProjects that interest them | |
WE1.1.6 | If we identify and then contact 20 affiliates and/or groups connected to wikis that have high organizer activity in Q2, we can build advocacy networks that will set the stage for the extension being enabled on 3 more wikis by the end of Q3. | |
WE1.1.7 | If we add at least 2 improvements to the Collaboration List for events, then at least 50% of surveyed respondents will find the Collaboration List to be more useful in finding events than before the changes were made. | |
WE1.2.5 | If we conduct an A/B/C test with the alt-text suggested edits prototype in the production version of the iOS app we can learn if adding alt-text to images is a task newcomers are successful with and ultimately, decide if it's impactful enough to implement as a suggested edit on the Web and/or in the Apps. | |
WE1.2.7 | If we deploy the Multi-Check sidebar (desktop) at all wikis where the Reference Check is available, we will unlock our ability to present multiple Edit Checks within a new "mid-edit" moment without negatively impacting the quality of new content edits newcomers publish. | |
WE1.2.9 | If we surface the ‘Add a Link’ Structured Task to new account holders who are reading Wikipedia articles through an A/B test on pilot wikis, then we expect to increase the percentage of these people who constructively activate on mobile by 10% compared to the control group. | |
WE1.2.10 | If the Structured Content team improves the code health of the Article-level Image Suggestions data pipeline to meet 90% of code deduplication and article and section level image suggestion separation on the index level, and adapts the image suggestion evaluation tool to be able to get baselines for quality of suggestions for target wikis, then the “Add an Image” task can be released to newcomers on additional Wikipedias. This will enable the Growth team to pursue a follow-up hypothesis focused on increasing constructive activation across at least 10 additional Wikipedias. | |
WE1.2.11 | If we release the “Add a Link” Structured Task to at least 5% of newcomers on English Wikipedia, then newcomers with access to this structured task will demonstrate a constructive activation rate on mobile that is 10% higher than the baseline, as measured through an A/B test. | |
WE1.3.3 | If we improve the user experience and features of the Nuke extension during Q2, we will increase administrator satisfaction of the product by 5pp by the end of the quarter. | |
WE1.3.4 | If we improve the user experience and features of Recent Changes, we will increase administrator satisfaction of the product by 5pp. | |
WE1.5.1 | If we create a strategy brief by February 2025, including a prioritized strategy and trade-offs, we can use it as one of the main inputs for APP25/26. | |
WE1.5.2 | If we develop a unified measurement strategy, we will enable evaluation of the multi-year product strategy for contributors and set the landscape for prioritization of next steps in metric development and reporting | |
WE2.1.5 | If we expose topic-based translation suggestions more broadly and analyze its initial impact, we will learn which aspects of the translation funnel to act on in order to obtain more quality translations. | |
WE2.1.6 | If we offer list-making as a service, we’ll enable at least 5 communities to make more targeted contributions in their topic areas as measured by (1) change in standard quality coverage of relevant topics on the relevant wiki and (2) a brief survey of organizer satisfaction with topic area coverage on-wiki. | |
WE2.1.7 | If we develop a proof of concept that adds translation tasks sourced from WikiProjects and other list-building initiatives, and presents them as suggestions within the CX mobile workflow, then more editors will discover and translate articles focused on topical gaps. By introducing an option that allows editors to select translation suggestions based on topical lists, we will test whether this approach increases the content coverage in our projects. | |
WE2.2.4 | If we document the pre-incubator, incubator, and post-incubator journeys for the five pilot wikis with quantitative and qualitative data, we will be able to better support new languages in the future. | |
WE2.4.4 | If we develop a live proof-of-concept, using MediaWiki’s async content processing pipeline, for the first use case of Wikifunctions in Wikipedia, we will be ready to switch it on in the new year for the Dagbani community. | |
WE2.6.1 | If we propagate the integration of Wikifunctions from Test2Wiki to a small production Wikipedia with the MVP user experience, we will see the feature used organically without being reverted. | |
WE2.6.2 | If we make it possible to translate sentences in Wikifunctions from something “abstract” like a function, we will see an organic increase of at least 5 multilingual functions that generate natural language sentences. This is a milestone towards building an Abstract Wikipedia. | |
WE3.1.6 | If we introduce a personalized rabbit hole feature in the Android app and recommend condensed versions of articles based on the types of topics and sections a user is interested in, we will learn if the feature is sticky enough to result in multi-day usage by 10% of users exposed to the experiment over a 30-day period, and a higher pageview rate than users not exposed to the feature. | |
WE3.1.8 | (Q2-Q3, web) If we build one feature which provides additional article-level recommendations, we will see an increase in clickthrough rate of 10% over existing recommendation options and a significant increase in external referrals for users who actively interact with the new feature. | |
WE3.1.9 | If we create a daily-use Wikipedia-based trivia game in the Android app, logged-out readers who engage with this feature will open the app on multiple days within a 20-day period at a rate at least 5% higher than those who do not engage with the feature. | |
WE3.1.10 | If we develop and test design prototypes for tabbed browsing in the Wikipedia iOS app, we will gain and incorporate actionable insights on usability, while also enabling engineers to assess technical feasibility of different approaches, building a solid foundation for adding Tabs to the app in Q4. | |
WE3.1.11 | If we make the article search bar more prominent, we will increase the number of users who initiate searches by 8%, possibly leading to a 1% increase in search retention rate for logged out users. | |
WE3.2.3 | If we make the “Donate” button in the iOS App more prominent by making it one click or less away from the main navigation screen, we will learn if discoverability was a barrier to non-banner donations. | |
WE3.2.4 | If we update the contributions page for logged-in users in the app to include an active badge for someone that is an app donor and display an inactive state with a prompt to donate for someone that decided not to donate in app, we will learn if this recognition is of value to current donors and encourages behavior of donating for prospective donors, informing if it is worth expanding on the concept of donor badges or abandoning it. | |
WE3.2.7 | Increasing the prominence of entry points to donations on the logged-out experiences of the Minerva web mobile and desktop experience will increase the clickthrough rate of the donate link by 30% YoY. | |
WE3.2.8 | If we make improvements to the personalised and collective content of the iOS apps’ Year in Review, and scale its availability, we will learn if this is an effective fundraising method. | |
WE3.4.1 | If we run an experiment to explore the feasibility of setting up smaller PoPs in cloud providers like Amazon, we can expand our data center map and reach more users around the world, at reduced cost and with improved turnaround time. | |
WE3.5.1 | If we make it possible for Commons Data namespace pages to be categorized and surface their usage across wikis, Commons admins will have the minimum tools they need to manage the increased usage of the Data namespace, ensuring we can sustainably scale up deployment to all wikis | |
WE3.5.2 | If we improve test coverage and documentation for Charts, we will be comfortable handing off maintenance and future feature development [to reading engineering, contractors, and volunteers], allowing us to wind down the project and task force. | |
WE3.5.3 | If we seed the Community Wishlist with Charts features we know volunteers have asked for that are out of scope for the MVP, there will be a central place for volunteers and staff to discuss future Charts-related work, allowing the future maintainers to manage expectations and source input for annual planning | |
WE4.1.3 | If we deploy the Incident Reporting System MVP to x more wikis (representative sample) we will be able to gather valuable data that will help us identify patterns of harmful conduct across wikis | |
WE4.1.4 | If we engage stakeholders across key departments in structured discussions, we can collaboratively define a shared vision and realistic scope for the Incident Reporting System, aligned with organizational priorities and compliance requirements, providing valuable insights to inform annual planning. | |
WE4.2.11a | If we define a terminology and thresholds for revert risk scores across wikis, we will make it possible to use revert risk scores in a wider range of user facing anti-abuse tools. This hypothesis impacts the WE4.2 KR by doing the background work necessary to build upon revert risk scores. | |
WE4.2.20 | Implement a trial enablement which will gather data on the efficacy of the new CAPTCHA on enabled wikis at preventing sockpuppet account creation and bot-based spam edits to measure the efficacy and value of a production rollout of the new technology | |
WE4.2.15 | If we analyze attributes of blocked user accounts on multiple wikis, we will identify patterns across these accounts and assign weights based on the relative importance of each attribute on block rates to use in calculating a user account reputation score. The success of this hypothesis would be measured by whether we are successful in defining a formula for multiplying attributes of an account to provide an account reputation score that maps to blocked users. | |
WE4.2.10 | If we add two more data points to the client hints collection pipeline, we will have more entropy to better identify sockpuppets and potential ban evasion. We will know we are successful when we are able to use the client hints data to identify X% of confirmed sockpuppets on en:Wikipedia:Sockpuppet investigations, or when we are able to use the collected data to identify Y% of suspected ban evasion pairs. This hypothesis directly contributes to the KR by providing new signals (browser canvas fingerprint, list of fonts) that will allow CheckUsers to more precisely target sockpuppets and accounts attempting to evade bans. | |
WE4.2.14b | If we introduce IP reputation data variables in AbuseFilter variables, we will enable mitigations that can reduce the amount of submissions of vandalism, spam and abuse. Context: This directly contributes to the KR goal by introducing a new signal (IP reputation) to allow for more precision in mitigations (only actions matching the variable are impacted). We could measure the impact of this hypothesis by examining the volume of reverted edits on wikis before/after the variables are introduced. (Other ideas?) We would initially introduce variables like “is likely a VPN” or “is likely a proxy”. We could also consider exposing other variables, depending on discussions in T354599: Make IP reputation available as a variable in AbuseFilter. | |
WE4.2.14a | If we analyze IP reputation data associated with problematic editing activity and user accounts, we will be able to prioritize a set of IP reputation facets that can be provided as variables in AbuseFilter. This analysis would then be used by WE4.2.14b later in Q3 to build out the variables in AbuseFilter, along with specific guidance about what mitigations would be reasonable to use alongside a given set of IP reputation variables. For example, the recommended mitigation for one IP reputation variable might be to block edits outright, while the recommended mitigation for a different IP reputation variable might be to tag the edit for further review, or to show a CAPTCHA. | |
WE4.2.18a | If we design and build a clickable component to display public data related to user account reputation to functionaries, we will be able to learn if this is useful to them by observing the number of repeat usages of the tool | |
WE4.3.3b | If we deploy a proof of concept of the 'Liberica' load balancer, we will measure a 33% improvement in our capacity to handle TCP SYN floods | |
WE 4.3.6b | If we integrate the output of the models we built in WE 4.3.1 with the dynamic thresholds of per-IP concurrency limits we've built for our TLS terminators in WE 4.3.2, we should be able to increase our ability to automatically neutralize attacks with 20% more volume, as measured with the simulation framework we're building. | |
WE 4.3.8 | If we deploy the Liberica load balancers to all datacenters, we will increase the capacity to handle TCP SYN floods by 33% everywhere | |
WE 4.3.9 | If we establish and follow a verified procedure for the regular testing of large-scale abuse scenarios, then we will consistently measure and improve our ability to respond effectively to such incidents. | |
WE 4.3.10 | If we define a policy for review and maintenance of requestctl rules, we will keep the system understandable and manageable over time | |
WE 4.3.11 | If we can identify patterns and separate web scraping from general traffic, we will be able to systematically create reporting to reduce that traffic and maintain the sustainability of our serving infrastructure. | |
WE 4.4.3 | If we improve the interface of the iOS app, we will be able to clearly communicate how temporary accounts work to users as they edit without logging in, and the iOS app will be prepared for the imminent release of temporary accounts to all projects. | |
WE 4.4.4 | If we update the data models in the data lake, and the corresponding data pipelines and dashboards, to accurately represent the new user account types, we'll be able to provide accurate analytics reporting related to activities of corresponding user types. | |
WE 4.4.5 | If we resolve all remaining product, design and legal blockers for the engineering work that needs to be done before the major pilots deployment, we will be able to complete the engineering work on time for the next round of pilot deployment. | |
WE5.1.9 | If we enable Parsoid on Incubator and all newly created Wikis by Q2, we’ll further ensure sustainability by not allowing the number of wikis that run on the legacy parser to grow. We will measure the success of this rollout through detailed evaluations using the Confidence Framework reports, with a particular focus on Visual Diff reports and the metrics related to performance and usability. Additionally, we will assess the reduction in the list of potential blockers, ensuring that critical issues are addressed prior to wider deployment. | |
WE5.1.11 | The Observability team aims to sunset Graphite by enabling read-only mode and disabling new metric ingest by the end of Q3 FY2024/2025. To achieve this goal, the team has set a 90% coverage target for converting the remaining dashboards and retiring legacy metrics and panels that point to Graphite metrics. | |
WE5.1.12 | If we release an interactive documentation sandbox for MediaWiki REST APIs, it will introduce a repeatable pattern for low maintenance, high quality API documentation while making the APIs easier to adopt for developers around the world. This will ensure that our API documentation is fully up to date, testable, and localized for generations of developers, while reducing the maintenance cost and increasing sustainability for API publishers. | |
WE5.1.13 | If we roll out SUL3 for all existing accounts and new account creation across all wikis, we will ensure compatibility with browser anti-tracking measures and improve security, by moving authentication to a dedicated domain that requires user interaction and further prevents XSS vulnerabilities. | |
WE5.2.5 | If we model at least one more page state change (e.g. PageDelete) as a PHP event and drive further adoption of in-process domain events across MediaWiki components and extensions currently utilizing event-like hooks, then we will build confidence in events as a platform sustainability pattern by improving component boundaries, improving interface flexibility, and reducing high risk boilerplate code. | |
WE5.2.6 | If we explore designing an architecture for serializing and broadcasting events generated within MediaWiki core, we will create a foundation for offering first class event support that will enable us to consume events outside of the originating MediaWiki PHP process (e.g. JobQueue, EventBus). This will make MediaWiki data more reusable beyond the MediaWiki platform. | |
WE5.2.7 | If we identify and align on a set of domains that can be used for MediaWiki platform events by the end of Q3, we will have an initial map of core component boundaries and can improve consistency across MediaWiki interfaces by utilizing the same domains for the MediaWiki REST API modules. | |
WE5.2.8 | If we clearly define the concept of extension interfaces in the MediaWiki documentation, we can make it easier to develop new functionality on top of MediaWiki and provide a clearer path for defining new extension interfaces, such as Domain Events. We will measure this by identifying places in the documentation where extension interfaces are presented as “extension types” and replacing 100% of those instances. | |
WE5.4.3 | If we enable developers with PHP 8.1 MediaWiki images and infrastructure for testing them on Kubernetes, they will be able to validate and certify them to be deployed to production. If we also develop infrastructure for progressive traffic migration and use it to safely migrate production to 8.1, this will help MediaWiki drop unsupported PHP versions in the upcoming May release. Success will be observed by the ability to ramp up production traffic to PHP 8.1 instances. | |
WE5.4.4 | If we decouple the legacy dumps processes from their current bare-metal hosts and instead run them as workloads on the DSE Kubernetes cluster, this will bring about demonstrable benefit to the maintainability of these data pipelines and facilitate the upgrade of PHP to version 8.1 by using shared mediawiki containers. | |
WE5.4.6 | If the beta cluster is configured to run MediaWiki with PHP 8.1 then the Data Platform Engineering group and their SRE team will be able to validate whether the existing dumps code functions correctly, or whether any significant functional changes would be required. | |
WE5.5.1 | If, by the end of January, we are able to measure and monitor Wikimedia hosted dumps traffic using log data, we will have clarity on how users are consuming the different dumps formatting options and access points. This will unblock additional metrics for overall consumption across streams, and improve our understanding of what users care about in terms of recency, data completion, and structure, so that we can tailor the overall API strategy accordingly. | |
WE5.5.2 | If, by the end of Q3, we create a consolidated view of developer personas and use cases collected through a listening and discovery tour, then we will uncover lesser understood gaps and opportunities in this space. This will leverage existing work completed by stakeholder teams in their respective areas (eg: Dumps, WME), in addition to creating new insights by conducting interviews with WMF staff, technical volunteers, and high impact content reuse partners (eg: WME customers and prospects). | |
WE6.1.7 | If we review the user feedback, decide on a code search and code browsing solution, deploy it to the production infrastructure as an officially supported service and enable indexing of both existing and new repositories from both code tracking systems, we will increase the scope of code that is indexed and searchable and simplify the process of locating code in day to day operations as well as during incident response. | |
WE6.1.8 | If we analyze the documentation metrics scores from our test dataset, we can evaluate the usefulness and effectiveness of the draft metrics, collect feedback, and provide actionable insights for implementing automated metrics computation. | |
WE6.1.9 | If we transition 5 additional access groups to management within the Identity Management system, it will enhance access governance by improving efficiency, significantly reducing TOIL and improving the onboarding experience for incoming Wikimedia staff and new members of the technical communities. | |
WE6.2.2 | If we replace the backend infrastructure of our existing shared MediaWiki development and testing environments (from Apache virtual servers to Kubernetes), it will enable us to extend its uses by enabling MediaWiki services in addition to the existing ability to develop MediaWiki core, extensions, and skins in an isolated environment. We will develop one environment that includes MediaWiki, one or more extensions, and one or more services. | |
WE6.2.3 | If we create a new deployment UI that provides a web interface for deployments and is open to existing deployers, it will give backporters a shared view of deployments in progress and make those deployments more visible. | |
WE6.2.5 | If we publish a planning doc to move single-version routing out of MediaWiki and gather comments from stakeholders on the implementation, then we will reduce friction during implementation. | |
WE6.2.6 | If we gather feedback from QTE, SRE, and individuals with domain-specific knowledge and use their feedback to write a design document for deploying and using the wmf/next OCI container, then we will reduce friction when we start deploying that container. | |
WE6.2.7 | If we make a deployment web UI available behind our single sign-on system and open it to the Wikimedia development community, it will increase the number of backport deployers. | |
WE6.2.8 | Building on the capabilities of Catalyst to deliver pre-merge test environments of MediaWiki and its extensions and skins on Kubernetes, if we facilitate deployments of pre-merge patches for MediaWiki services by running pre-merge tests for Wikifunctions, then contributors will be able to test more MediaWiki projects with stable, well-defined, isolated test environments. | |
WE6.2.9 | If we test the proposed MediaWiki routing implementation with a single wiki, we will have proven the plan works and can proceed with an accelerated rollout to other wikis and we will be able to route a single version container to Wikimedia’s wiki hosting infrastructure. | |
WE6.3.7 | By establishing detailed measurement criteria and evolution guidelines for our sustainability framework, we will create an actionable scoring system for platform improvements. | |
WE6.3.8 | Engaging with prospective users to explore Toolforge UI’s early design prototype will help us uncover improvement opportunities and risks to be addressed in a follow-up iteration. |
Signals & Data Services (SDS) Hypotheses
[ SDS Key Results ] | ||
---|---|---|
Hypothesis shortname | Q3 text | Details & Discussion |
SDS1.1.1 | If we partner with an initiative owner and evaluate the impact of their work on Core Foundation metrics, we can identify and socialize a repeatable mechanism by which teams at the Foundation can reliably impact Core Foundation metrics. | |
SDS1.1.2 | If we assess the impact of the new South American data center (MAGRU) on our relevance metric (unique devices), we will be able to produce a report that provides insights into the return on investment of current and future data center investments. | |
SDS1.3.1.B | If we integrate the Spark / DataHub connector for all production Spark jobs, we will get column-level lineage for all Spark-based data platform jobs in DataHub. | |
SDS1.3.2.B | If we implement a frequently run Spark-based MariaDB MW history data querying job, reconcile missing events and enrich them, we will provide a daily updated MW history wikitext content data lake table. |
Future Audiences (FA) Hypotheses
[ FA Key Results ] | ||
---|---|---|
Hypothesis shortname | Q3 text | Details & Discussion |
FA1.1.1 | If we make off-site contribution very low effort with an AI-powered “Add a Fact” experiment, we can learn whether off-platform users could help grow/sustain the knowledge store in a possible future where Wikipedia content is mainly consumed off-platform. |
Product and Engineering Support (PES) Hypotheses
[ PES Key Results ] | ||
---|---|---|
Hypothesis shortname | Q3 text | Details & Discussion |
PES1.1.2 | If we choose three main areas in which to highlight efforts being made to improve our culture of review, and communicate about them in the right channels, we will see improvements in the responses for iterative development, decision-making, and collaboration in the next culture survey (Jan 2025). | |
PES1.1.3 | If we send a revised culture survey, we will identify areas where we can provide support to managers to continue strengthening our culture of review. | |
PES1.3.5 | If we create a Wikipedia-based game for daily use that highlights the connections across vast areas of knowledge, it will encourage consumers to visit Wikipedia regularly and facilitate active learning, leading to increased interaction with content on Wikipedia and longer session lengths. | |
PES1.3.6 | If we apply lessons from the first Sprinthackular to a second event focused on improving prototyping tools and processes, at least one Sprinthackular project will show enough value and promise that it can be integrated into the APP. We'll also be able to develop a repeatable Sprinthackular framework that other teams will recognize that they can adopt to explore any focus area! | |
PES1.5.1 | (Starting Oct 1) If we finalize and publish the Edit Check SLO draft, practice incorporating it in regular workflows and decisions, and draft a Citoid SLO, we’ll continue learning how to define and track user-facing and cross-team SLOs together. | |
PES1.5.2 | (Starting Oct 1) If we clarify and define in writing a set of roles and responsibilities for stakeholders throughout the service lifecycle, this will enable teams to make informed commitments in the Service Catalog, including SLOs. |
Explanation of buckets
Wiki Experiences

The purpose of this bucket is to efficiently deliver, improve and innovate on wiki experiences that enable the distribution of free knowledge world-wide. This bucket aligns with movement strategy recommendations #2 (Improve User Experience) and #3 (Provide for Safety and Inclusion). Our audiences include all collaborators on our websites, as well as the readers and other consumers of free knowledge. We support a top-10 global website, and many other important free culture resources. These systems have performance and uptime requirements on par with the biggest tech companies in the world. We provide user interfaces to wikis, translation, developer APIs (and more!) and supporting applications and infrastructure that all form a robust platform for volunteers to collaborate to produce free knowledge world-wide. Our objectives for this bucket should enable us to improve our core technology and capabilities, ensure we continuously improve the experience of volunteer editors and moderators of our projects, improve the experience of all technical contributors working to improve or enhance the wiki experiences, and ensure a great experience for readers and consumers of free knowledge worldwide. We will do this through product and technology work, as well as through research and marketing. We expect to have at most five objectives for this bucket.
Knowledge is constructed by people! And as a result our annual plan will focus on the content as well as the people who contribute to the content and those who access and read it.
Our aim is to produce an operating plan based on existing strategy, mainly our hypotheses about the contributor, consumer and content "flywheel". The primary shift I’m asking for is an emphasis on the content portion of the flywheel, and exploration of what our moderators and functionaries might need from us now, with the aim of identifying community health metrics in the future.
Signals and Data Services

In order to meet the Movement Strategy Recommendations for Ensuring Equity in Decision Making (Recommendation #4), Improving User Experience (Recommendation #2), and Evaluating, Iterating and Adapting (Recommendation #10), decision makers from across the Wikimedia Movement must have access to reliable, relevant, and timely data, models, insights, and tools that can help them assess the impact (both realized and potential) of their work and the work of their communities, enabling them to make better strategic decisions.
In the Signals & Data Services bucket, we have identified four primary audiences: Wikimedia Foundation staff, Wikimedia affiliates and user groups, developers who reuse our content, and Wikimedia researchers, and we prioritize and address the data and insights needs of these audiences. Our work will span a range of activities: defining gaps, developing metrics, building pipelines for computing metrics, and developing data and signals exploration experiences and pathways that help decision makers interact more effectively and joyfully with the data and insights.
Future Audiences

The purpose of this bucket is to explore strategies for expanding beyond our existing audiences of consumers and contributors, in an effort to truly reach everyone in the world as the essential infrastructure of the ecosystem of free knowledge. This bucket aligns with Movement Strategy Recommendation #9 (Innovate in Free Knowledge). More and more, people are consuming information in experiences and forms that diverge from our traditional offering of a website with articles – people are using voice assistants, spending time with video, engaging with AI, and more. In this bucket, we will propose and test hypotheses around potential long-term futures for the free knowledge ecosystem and how we will be its essential infrastructure. We will do this through product and technology work, as well as through research, partnerships, and marketing. As we identify promising future states, learnings from this bucket will influence and be expanded through Buckets #1 and #2 in successive annual plans, nudging our product and technology offerings toward where they need to be to serve knowledge-seekers of the future. Our objectives for this bucket should drive us to experiment and explore as we bring a vision for the future of free knowledge into focus.
Sub-buckets

We also have two other “sub-buckets” which consist of areas of critical functions, which must exist at the Foundation to support our basic operations, and some of which we have in common with any software organization. These “sub-buckets” won’t have top level objectives of their own, but will have input on and will support the top level objectives of the other groups. They are:
- Infrastructure Foundations. This bucket covers the teams which sustain and evolve our datacenters, our compute and storage platforms, the services to operate them, and the tools and processes that enable the operation of our public-facing sites and services.
- Product and Engineering Support. This bucket includes teams which operate “at scale”, providing services that improve the productivity and operations of other teams.