Wikimedia Foundation Annual Plan/2023-2024/Product & Technology

This page explains how the Wikimedia Foundation's Product and Technology departments are preparing for the 2023-24 Annual Plan. It contains the work portfolios (nominally called "buckets") and some potential objectives for them. These were shared as early as possible; because they were works in progress, please be aware that they did change.

Part 2 of this documentation covers the departments' finalised Objectives and Key Results (OKRs).

This page is also available in summary form as a blogpost on Diff
This page is only available in English as it is about an early and conceptual stage in a larger process. Later documents are all translated.
Translations of the associated blogpost are welcome.

My name is Selena Deckelmann, and I’m the new Chief Product and Technology Officer at the Wikimedia Foundation. I’m glad to have the opportunity to share with you all some thoughts on how we can improve the way we work together. I know you have ideas you want to share with us, too, and I’m looking forward to hearing them.

The reason I took this role at the Foundation is the depth of passion I see across the movement for keeping knowledge free and accessible. I started my own career as a volunteer, tinkering with computers for open source projects, and have spent a lot of time since then working to make the technology space more welcoming, accessible, and safe for everyone. I know how vital volunteers are to a global movement like ours.

I’m proud of the work our team has done, and I’m consistently impressed by the knowledge, expertise, time, and care volunteers bring to Wikipedia. I also recognize there isn’t currently a clear, dependable way to keep volunteers up to date about the work Wikimedia Foundation staff are doing at any given time. We’d like to make it easier to communicate our plans and projects, and to listen to your ideas and questions. I also want to acknowledge that the process of deciding how to get there is going to be collaborative and, so, almost certainly a little bit messy as we try some things out. The goal, though, is to work together toward a solution that feels helpful and meaningful.

With that in mind, we want to share some potential objectives we’ve drafted for the financial year 2023-2024 (FY 23-24). This isn’t meant to be a list of things we’re definitely going to do; there’s plenty of room for your questions and suggestions. The purpose of this list is to highlight the most necessary and important categories of technical work across the movement. By working together on aligning our priorities, we’ll be able to make sure our time and energy are being used most effectively on the projects within each objective.

Summary

The work we do at the Wikimedia Foundation has many purposes, and has been described as a socio-technical ecosystem. Within that ecosystem, the Product and Technology departments provide critical services, design and launch new products, and innovate in areas of machine learning and internet-based collaboration. As we think about both multi-year strategic planning and this year’s annual plan, we are trying to identify core areas of work that extend from Wikimedia’s historic success, relate directly to advancing Movement Strategy recommendations, and provide clear, visible roadmaps to stakeholders internally and externally.

Ways of Working:

In order to focus the work of the Product and Tech departments going forward, we’re assigning wranglers (Product and Technology department leaders who determine priorities) to “buckets” for planning purposes. These groups of wranglers are going to identify objectives and key results at our highest levels through our planning process. We want to bring folks together who depend on one another to accomplish our most important goals.

An important aspect of the “three key” buckets is that we are declaring that work within a bucket will not block work in another bucket. This means that we still consult and inform one another, and collaborate to enable work where appropriate, but teams can make a decision to move forward with plans without someone from another bucket blocking their work.

Budget Planning:

We don’t yet have a committed budget allocation between the different buckets – this will come later. Across both Product and Tech departments, we estimate that Wiki Experiences will be about 50%, Signals and Data Services will be about 30%, Future Audiences will be about 5%, and Infrastructure and Product and Engineering Services will be the remaining 15% or so. These estimates are based on the work we're doing today. We won't know the FY23-24 budget allocations until after we prioritize objectives sometime in March.
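
As a quick check on the arithmetic, the short Python sketch below works out the share left over for the two sub-buckets. The percentages are the rough estimates quoted above; the dictionary and variable names are only illustrative, and none of this represents a committed budget.

```python
# Rough planning estimates quoted above, as a percentage of combined
# Product and Technology work. These are estimates, not budget commitments.
estimates = {
    "Wiki Experiences": 50,
    "Signals and Data Services": 30,
    "Future Audiences": 5,
}

# The two sub-buckets share whatever the three key buckets leave over.
remaining = 100 - sum(estimates.values())
estimates["Infrastructure / Product and Engineering Services"] = remaining

for bucket, share in estimates.items():
    print(f"{bucket}: about {share}%")
```

The remainder comes out at about 15%, which matches the figure given for the sub-buckets later on this page.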

A simple way to distinguish between the purposes of the two largest buckets is: Wiki Experiences is focused on our audiences interacting with our content through core wiki experiences (wikis, mobile apps, tools etc.). Signals and Data Services serves audiences seeking insights into our content/metadata, making decisions on our content and services, and/or interacting with our content in structured or programmatic ways.

Wiki Experiences

ℹ️ About 50%

The purpose of this bucket is to efficiently deliver, improve and innovate on wiki experiences that enable the distribution of free knowledge world-wide. This bucket aligns with movement strategy recommendations #2 (Improve User Experience) and #3 (Provide for Safety and Inclusion). Our audiences include all collaborators on our websites, as well as the readers and other consumers of free knowledge. We support a top-10 global website, and many other important free culture resources. These systems have performance and uptime requirements on par with the biggest tech companies in the world. We provide user interfaces to wikis, translation, developer APIs (and more!) and supporting applications and infrastructure that all form a robust platform for volunteers to collaborate to produce free knowledge world-wide. Our objectives for this bucket should enable us to improve our core technology and capabilities, ensure we continuously improve the experience of volunteer editors and moderators of our projects, improve the experience of all technical contributors working to improve or enhance the wiki experiences, and ensure a great experience for readers and consumers of free knowledge worldwide. We will do this through product and technology work, as well as through research and marketing. We expect to have at most five objectives for this bucket.

Knowledge is constructed by people! And as a result our annual plan will focus on the content as well as the people who contribute to the content and those who access and read it.

Our aim is to produce an operating plan based on existing strategy, mainly our hypotheses about the contributor, consumer and content "flywheel". The primary shift I’m asking for is an emphasis on the content portion of the flywheel, and exploration of what our moderators and functionaries might need from us now, with the aim of identifying community health metrics in the future.

Objectives

Support the growth of high quality and relevant content within the world’s most linguistically diverse, trusted and comprehensive free knowledge ecosystem by enabling and supporting high quality and accessible experiences

Encourage the satisfaction of and support given to moderators, patrollers and functionaries through potential initiatives like:

Developing and testing product hypotheses about moderator and other functionary community health by:
- Acting on promising hypotheses as we engage in deeper research, including ML/AI enabled approaches to improving workflows;
- Exploring and developing metrics and monitoring for moderator and other functionary backlogs and the quality of workflows volunteer contributors currently use;
- Defining audience types and ways of measuring the number of these contributors over time;
- Developing qualitative and quantitative measurements of community health;
- Examining the “dance partner” model between editor outputs, volunteer technical contributors and moderator workloads, including but not limited to exploring tooling, feedback mechanisms, and recruitment to enable effective and collaborative community development and health.
Continuing to explore, improve and prioritize mobile experiences for this audience in an increasingly mobile-first world (e.g. Moderator Tools)

Support editors of our wikis at-scale through potential initiatives like:

Improving communication tools that support healthy and sustainable communities;
Supporting content growth on medium-sized wikis by providing tools and workflows that enable ML-assisted content creation, machine-assisted translation and other tooling;
Continuing to explore, improve and prioritize mobile experiences in an increasingly mobile-first world.

Continue to improve our development practices through potential initiatives like these:

Moving to a self-service model for platform support where we can.
Adopting the Codex design system.

Produce a modern, relevant and accessible reading and media experience for our projects

Continue to improve reader experience through potential initiatives like:

Exploring, improving and prioritizing mobile experiences, including audio;
Continuing work on SEO and enabling incremental reader improvements;
Exploring an ML-enabled natural language search experience;
Exploring the future of media-related workflows and media-rich content, starting with Commons.

Support the funding of Wikimedia projects by making sustainable the technology behind our largest and emerging revenue sources

Provide the necessary tools for all aspects of advancement/fundraising to be the best at their jobs.
Explore native mobile app donation workflows to reduce donor friction, along with other mobile use cases for future fundraising.

Produce an effective & efficient knowledge production platform

Improve MediaWiki platform development that reduces toil, incorporates best practices, and maintains our commitment to open source development practices, through potential initiatives like:

This includes considering the developer experience of volunteer technical contributors.
Define what “core MediaWiki” is to establish good boundaries between the work of MediaWiki platform and feature development teams.
Define and support moving to a self-service model for MediaWiki feature development that interfaces with a well-defined “back end” website architecture.
Continue to improve our infrastructure platform, developing more self-service components and ensuring stable and reliable infrastructure services.
Establish a comprehensive open source strategy for the Foundation.

Signals and Data Services

ℹ️ About 30%

In order to meet the Movement Strategy Recommendations for Ensuring Equity in Decision Making (Recommendation #4), Improving User Experience (Recommendation #2), and Evaluating, Iterating and Adapting (Recommendation #10), decision makers from across the Wikimedia Movement must have access to reliable, relevant, and timely data, models, insights, and tools that can help them assess the impact (both realized and potential) of their work and the work of their communities, enabling them to make better strategic decisions.

In the Signals & Data Services bucket, we have identified four primary audiences: Wikimedia Foundation staff, Wikimedia affiliates and user groups, developers who reuse our content, and Wikimedia researchers. We prioritize and address the data and insights needs of these audiences. Our work will span a range of activities: defining gaps, developing metrics, building pipelines for computing metrics, and developing data and signals exploration experiences and pathways that help decision makers interact more effectively and joyfully with the data and insights.

Objectives

Each metric and dimension in our essential metric data set is scientifically or empirically supported, standardized, productionized, and shared across the Foundation.

Effective use of metrics to make strategic decisions at the Foundation requires us to measure and assess the impact of work using a common, reliable, and well-understood set of metrics. Ensuring that different teams working on different projects are using the same metrics with the same definitions to understand the impact of their work will allow us to ensure we are “speaking the same language”, and better coordinate efforts across the Foundation, with affiliates, and with the community. These metrics will allow Foundation staff to monitor the value of their work and the community to see the value of the Foundation. Establishing a standard set of metrics also allows the engineers supporting the tools used in data preparation and analysis to deliver a higher standard of service by more precisely defining the scope of their work, making the effort more tractable.

Potential Key Results:

Identify and define the key essential metrics needed to evaluate program progress and assess impact.
Establish standardized definitions, calculations, and data sources for each metric and dimension in the essential metric data set.
Provide training and support to relevant stakeholders on how to use and interpret the essential metrics.

Wikimedia staff and leadership make data-driven decisions by using essential metrics to evaluate program progress and assess impact

By using essential metrics to evaluate program progress and assess impact, we can ensure that we are making informed decisions that are backed by evidence. This allows us to stay focused on our most important goals, make adjustments as needed, and track our progress over time.

Potential Key Results:

Five Wikimedia initiatives or projects use essential metrics to evaluate progress.
Develop and implement a dashboard that displays essential metrics.

Data users can easily find our essential metric data and access information about it: how to interpret it, where it is sourced, how it is produced, appropriate and inappropriate uses, privacy and ethical guidance, and what limitations it has

Data is not useful if it is not accessible. Our metrics must have maximum accessibility for us to maximize their utility to all audiences. To guide appropriate use and prevent misuse, we gather, organize, and make available the necessary information.

Potential Key Results:

100% of essential datasets have designated business owners/data stewards
Six Metrics That Matter data sets are fully and publicly documented, with clear guidance on how to use and consume them
100% of essential metric data that are reliably tracked in the Foundation-Level Metrics Framework are cataloged with respect to sourcing and production

We establish standards for quality and examine the quality of essential metric data via regular review

Metrics must have known and verified quality to be trusted to guide major strategic decisions. Standards for the quality of essential metric data must be determined and published to provide the evidence that our metrics can be used to support strategic decisions.

Potential Key Results:

Six Metrics That Matter data sets contain logging, monitoring, and alerting for data quality incidents based on business owner/data steward-informed SLOs/SLIs.
100% of potential essential metrics data issues are triaged by essential metrics business owners/data stewards within 10 business days of the issue being reported.
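
To make the triage key result concrete, here is a minimal Python sketch of how compliance with a 10-business-day window could be measured. Everything in it is an assumption for illustration: the function names, the sample dates, the choice to ignore public holidays, and the choice to count untriaged issues as misses. It is not a description of any existing Foundation tooling.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count weekdays after `start` up to and including `end` (holidays ignored)."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def triage_compliance(issues, limit_days=10):
    """Return the share (0.0 to 1.0) of issues triaged within `limit_days` business days.

    `issues` is a list of (reported, triaged) date pairs. `triaged` is None for
    issues still waiting; this sketch assumes those count as misses.
    """
    if not issues:
        return 1.0
    on_time = sum(
        1
        for reported, triaged in issues
        if triaged is not None
        and business_days_between(reported, triaged) <= limit_days
    )
    return on_time / len(issues)


# Hypothetical sample data, not real incident records.
sample = [
    (date(2023, 7, 3), date(2023, 7, 14)),  # 9 business days: on time
    (date(2023, 7, 3), date(2023, 7, 18)),  # 11 business days: late
    (date(2023, 7, 10), None),              # not yet triaged: counted as a miss
]

print(f"Triaged on time: {triage_compliance(sample):.0%}")
```

A real measurement would also need an agreed definition of when an issue counts as "reported" and a holiday calendar, both of which are left out here.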

Users can reliably query Wikimedia data at scale

Search and discovery experiences are critical to how users experience our content. We must be able to deliver those experiences in a reliable, sustainable, scalable fashion to meet the needs of free knowledge distribution and discovery.

Potential Key Results:

Increase the number of successfully processed backend Wikidata requests by 25%.
- ALTERNATIVE: Decrease the number of failing backend Wikidata requests by X%?
- Experiment with the top three recommendations in the WDQS stabilization plan and document outcomes
Experiment with # improvements for Wikipedia reader 1st party search basic functionality
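
The two framings of the Wikidata key result above (more successful requests versus fewer failing requests) are related but not interchangeable. The Python sketch below illustrates this under one loudly stated assumption: total request volume stays flat, so every additional success must come from a request that previously failed. The function names and the request counts are hypothetical, not measured WDQS figures.

```python
def success_target(successes: int, increase: float = 0.25) -> int:
    """Successful-request count needed to meet an `increase` (e.g. 25%) key result."""
    return round(successes * (1 + increase))


def failure_reduction_needed(successes: int, failures: int, increase: float = 0.25):
    """Fraction of current failures that must become successes to hit the target.

    Assumes total request volume stays flat. Returns None when the target
    cannot be reached by converting failures alone.
    """
    extra = success_target(successes, increase) - successes
    if failures == 0 or extra > failures:
        return None
    return extra / failures


# Hypothetical request counts for illustration only.
print(failure_reduction_needed(4000, 6000))  # about one sixth of failures
print(failure_reduction_needed(8000, 2000))  # every failure must be fixed
print(failure_reduction_needed(9000, 1000))  # unreachable at flat volume
```

When most requests already succeed, a 25% increase in successes may only be reachable through growth in traffic, which is one reason the alternative wording in terms of failing requests may be easier to evaluate.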

Future Audiences

ℹ️ About 5%

The purpose of this bucket is to explore strategies for expanding beyond our existing audiences of consumers and contributors, in an effort to truly reach everyone in the world as the essential infrastructure of the ecosystem of free knowledge. This bucket aligns with Movement Strategy Recommendation #9 (Innovate in Free Knowledge). More and more, people are consuming information in experiences and forms that diverge from our traditional offering of a website with articles – people are using voice assistants, spending time with video, engaging with AI, and more. In this bucket, we will propose and test hypotheses around potential long-term futures for the free knowledge ecosystem and how we will be its essential infrastructure. We will do this through product and technology work, as well as through research, partnerships, and marketing. As we identify promising future states, learnings from this bucket will influence and be expanded through Buckets #1 and #2 in successive annual plans, nudging our product and technology offerings toward where they need to be to serve knowledge-seekers of the future. Our objectives for this bucket should drive us to experiment and explore as we bring a vision for the future of free knowledge into focus.

Objectives

Describe multiple potential strategies through which Wikimedia could satisfy our goal of being the essential infrastructure of the ecosystem of free knowledge

We know we want Wikimedia to become the essential infrastructure of the ecosystem of free knowledge, but there are multiple different ways we could do this, and different audiences on which we could concentrate. Our movement has not yet settled on a path to achieving this goal. We will come up with different discrete ways that we could be the essential infrastructure. These candidate strategies are what we’re investigating in this bucket – which one or ones are the future we should inhabit to complete our mission? For each of them, we will identify the audiences they address, and we’ll develop hypotheses to test that will help us collectively decide whether that strategy seems right for our future.

Beginnings of key results / deliverables:

Develop five candidate strategies for reaching future audiences as the essential infrastructure, and describe the audiences these strategies would reach.
For each candidate strategy, identify hypotheses and approaches for testing them.
Publish explanations of the candidate strategies.

Test hypotheses to validate or invalidate potential strategies for the future, starting with a focus on third party content platforms

Committing to a strategy for the long-term is a major choice that our movement needs to make wisely. The core work of this bucket will be the research and experiments that can help us foresee which paths hold the most potential. For the candidate strategies that come from the previous objective, we will identify hypotheses that probe the core of those strategies, and then do the research, proofs-of-concept, and investigations that help us learn whether those strategies do or don’t seem like good paths into the future.

Though we’ll be open to many different strategies, we do know we want to prioritize investigating “third party content platforms”, because if we want to reach everyone in the world with free knowledge, we’ll need to reach them where they are on the internet – these days, they can be found on the many social and content platforms that go beyond our historical channels of search engines and our own websites.

Beginnings of key results / deliverables:

Publish guiding principles for how we test hypotheses and how we take action with successes and failures
Pursue global youth audiences, which have a concerningly low awareness of Wikimedia projects
Explore the leading candidate strategy of consumption and contribution through third party content platforms
Build a self-storage solution that enables low-cost prototyping of software ideas

Build alignment around a continuously sharpening vision of the future

Though our movement has agreed on becoming the essential infrastructure of the ecosystem of free knowledge, different parts of our movement see our path to getting there differently. But the more we’re able to pull in the same direction, the more likely we’ll succeed in our goals. As we test hypotheses and learn more about what our future strategies could be, we need to spread our learnings and their implications for the future. We also need to seek input from our colleagues and communities. By doing these things, we will help align our movement to achieve bigger things together.

Beginnings of key results / deliverables:

Bring a set of learnings to annual planning 2024–25 that influences the work of Buckets #1 and #2.
Develop community groups or conversations around “future audiences” that include both longtime trusted community members and new voices that represent the audiences we’re pursuing.

Sub-buckets

ℹ️ About 15%

We also have two other “sub-buckets” which consist of areas of critical functions, which must exist at the Foundation to support our basic operations, and some of which we have in common with any software organization. These “sub-buckets” won’t have top level objectives of their own, but will have input on and will support the top level objectives of the other groups. They are:

Infrastructure Foundations. This bucket covers the teams which sustain and evolve our data centers, our compute and storage platforms, the services to operate them, and the tools and processes that enable the operation of our public facing sites and services.
Product and Engineering Services. This bucket includes teams which operate “at scale”, providing services that improve the productivity and operations of other teams.

Keywords

Accessible – The experience of locating and retrieving relevant Wikimedia data sets or viewing relevant reporting is self-directed and easy.
Buckets: These are portfolios of work, each with a budget allocation attached. These will be the highest order grouping of all our work. They are intended as containers that help us make decisions about priorities/objectives and allocation of our budget. Importantly, though, buckets do not rearrange people and teams.
Committed Work: Work that is tied to a contract, regulatory or legal requirement, or other significant commitment. Not completing this kind of work would subject us to significant legal, financial or reputational risk. Risk assessments which result in not completing something considered committed work should be reviewed by Director+ level for approval before deprioritizing the work. Approval level should match where the commitment was made.
Core Metric – One of a small set of well-defined data elements that measure something we use to guide strategic decisions.
Contributing Metric – A measure that contributes directly to the explanatory picture of a core metric; e.g. session length.
Data Product – A product that facilitates an end goal through the use of data (DJ Patil, Data Jujitsu: The Art of Turning Data into Product, 2012). Product types can be raw data, derived data, algorithms, decision support, and automated decision-making.
Dimension – A data element that is used to narrow or focus the analysis of a metric; e.g. timestamps, language codes, country codes, DNS names, test treatment set, etc.
Discoverable – New and experienced users of data are able to find and use / reuse the data elements relevant to their use cases.
Essential Metric Data – The set of data elements that includes all core metrics, and all contributing metrics and standard dimensions used to analyze and understand core metrics.
Integrable – It is easy to use data sets from different sources in combination with one another.
Metadata – Data about data. E.g. location, modification time, ownership, format, access permissions, privacy sensitivity, documentation, etc.
Navigable – The relationships between data elements are documented and defined.
OKR: We will be creating 3 levels of cascading OKRs. For convenience's sake, we are labeling each level. Here are the levels:
- Top-level Metric: These are the highest level, the metrics by which we expect to judge our progress as departments. We expect to have 4ish of these.
- Objective: This is the mid-level objective, owned by directors. We expect there to be 1-4 of these in each bucket, and each one will comprise the work of 30-40 people.
- Hypothesis or Bet: This is the team level, and is a specific project or task which a team is proceeding with, and has evaluation criteria for. This should be associated with some key result for evaluation of success.
Prepared – The data is in or close to the format in which the user wants to consume it.
Reliable – Data products are ‘reliable’ within defined constraints (e.g. SLOs). Reliability of data products includes:
- Available – data is online and available with predictable query and response time characteristics.
- Fresh – data is ‘fresh’ within a timeframe.
- Eventually consistent – Primary and derived data items are eventually consistent across data products. E.g. a Wikidata item has the same values in Wikidata as it does in WDQS with predictable latency.
Sourceable – It is easy to load data into systems for serving production features and analytics.
Standardized – The same definition is used for a data element across all Wikimedia work.
System – The totality of all actors, components, and processes involved in creation, transformation, and distribution of data.
Technical Contributors: developers, wiki template editors, technical documentation writers, bug reporters.
Wranglers: Product and Technology leaders responsible for creating, estimating and prioritizing potential objectives and supporting the teams in proposing work to achieve the objectives and defining metrics.
Wikimedia data – Inclusive of all gathered and generated data in our systems – text content (including wikitext constructs), multimedia files, user-generated data files, observed facts, inferences, metrics and metadata.
Well-defined – Data elements and records each have definitions that explain their provenance, appropriate and inappropriate uses, formats, constraints, expected ranges/distributions, and restrictions.