Future Audiences/Discord bot

From Meta, a Wikimedia project coordination wiki

Future Audiences is exploring strategies to expand beyond our existing audiences of readers and contributors so we can truly reach everyone in the world as the “essential infrastructure of the ecosystem of free knowledge”. This includes investigating whether/how audiences might like to interact with Wikipedia information in a more conversational way.

Building on last year’s experiments with conversational AI, we want to understand whether and how people might like to get information from Wikipedia via a conversational AI bot. We are proposing to test this via an experimental conversational application on Discord. This will also provide us with an opportunity to test new ways of delivering free knowledge – i.e., social and/or gamified approaches – to the younger audiences on Discord.

Experiment FAQs

Why Discord and why a Discord application?

Discord is a chat platform popular among Millennials and Gen Z. It has close to 200 million monthly active users and 21 million “servers” (unique spaces for groups of users to chat). While Discord began as a chat platform for gamers, it is now used by a wide variety of groups to gather and connect (including Wikipedians). Discord is also a popular platform for third-party developers, with over 9,000 third-party bots developed for the site, including by platforms like Netflix and SoundCloud.

Many popular Discord bots provide fun/social experiences for a group chat, such as in-chat gameplay, images produced on command, or music and anime recommendations. We believe providing some social and/or gamified functionality will be critical to getting organic adoption and usage of our bot.

What will our application do?

MVP functionality

We are aiming for a chat experience that:

  • Answers questions based on information contained in Wikipedia
  • Provides a fun gamified way to learn about a Wikipedia knowledge topic

As a user, I can:

  1. Ask the bot questions and receive:
    1. A natural-language summary of the relevant Wikipedia knowledge
    2. Link(s) to relevant pages used
    3. Metadata on quality of content (number of contributors, date last edited)
  2. Ask the bot to generate a quiz from a Wikipedia article and get a fun interactive Q&A experience
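
As an illustration of the answer format described above (summary, source links, and quality metadata), a reply could be modeled roughly as below. This is a minimal sketch, not the bot's actual design: the class and field names (`SourceInfo`, `BotReply`, `contributor_count`, `last_edited`) and the rendered layout are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class SourceInfo:
    """One Wikipedia page used for the answer (hypothetical structure)."""
    title: str
    url: str
    contributor_count: int  # number of contributors to the page
    last_edited: date       # date of the page's most recent edit


@dataclass
class BotReply:
    """A natural-language summary plus the pages it was drawn from."""
    summary: str
    sources: List[SourceInfo]

    def render(self) -> str:
        """Format the reply as plain text suitable for a chat message."""
        lines = [self.summary, "", "Sources:"]
        for s in self.sources:
            lines.append(
                f"- {s.title}: {s.url} "
                f"({s.contributor_count} contributors, "
                f"last edited {s.last_edited.isoformat()})"
            )
        return "\n".join(lines)
```

Keeping the metadata next to each link, rather than in a separate block, lets a reader judge each source on its own.
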
Other ideas:
  • Other utility or fun use-cases (games, trivia, “Citation Needed”-like fact checking, subscriptions, other?)
  • Prompts to learn more or learn about “related concepts” after a query is completed

Technical functionality

The Discord bot will reuse the retrieval-augmented generation (RAG) methodology we developed for the Wikipedia plugin for ChatGPT:

  • Given a natural-language query from the user (either a question or a request for a quiz), the bot will use the ChatGPT API to reduce it to keywords that can be sent to the MediaWiki Search API.
  • The bot will then pass the content of the relevant article(s) returned by the MediaWiki Search API back through ChatGPT to produce either a summarized answer or a quiz on the topic, depending on what the user asked for.

We will provide links to the source article(s) for each response.
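
The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the bot's actual code: the function names (`build_search_url`, `answer_query`) and the three injected callables are hypothetical stand-ins for the ChatGPT API calls and the HTTP request, and the English Wikipedia endpoint is assumed. The search parameters (`action=query`, `list=search`, `srsearch`, `srlimit`) are those of the MediaWiki Action API.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

# Assumption: English Wikipedia. The source does not say which wikis are searched.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(keywords: str, limit: int = 3) -> str:
    """Build a MediaWiki Action API full-text search URL for the keywords."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keywords,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def answer_query(
    query: str,
    extract_keywords: Callable[[str], str],
    search_articles: Callable[[str], List[Dict[str, str]]],
    generate: Callable[[str, str], str],
) -> Dict[str, object]:
    """Run the two-step flow; each callable is a hypothetical stand-in.

    extract_keywords: LLM call that reduces the query to search keywords.
    search_articles:  returns dicts with "url" and "text" for matching pages.
    generate:         LLM call that writes an answer or quiz from the text.
    """
    # Step 1: turn the natural-language query into search keywords.
    keywords = extract_keywords(query)
    # Step 2: retrieve article content for those keywords.
    articles = search_articles(keywords)
    if not articles:
        # Nothing retrieved, so decline rather than let the model guess.
        return {"text": "Can't answer this", "sources": []}
    # Step 3: generate the response from the retrieved text only.
    context = "\n\n".join(a["text"] for a in articles)
    text = generate(query, context)
    # Always return links to the source articles with the response.
    return {"text": text, "sources": [a["url"] for a in articles]}
```

Passing the model and search calls in as parameters keeps the sketch independent of any particular client library and makes the flow easy to test with stubs.
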

How/what will we learn?

1. Do people want to interact with Wikipedia knowledge in a more conversational way?

In 2023, conversational AI was still new, and our ChatGPT experiment did not surface a large appetite among consumers to interact with Wikipedia conversationally. Since then, AI-powered chatbots have become a more routine fixture online, including on social media and messaging apps. Facebook, WhatsApp, Instagram, and Twitter/X all now have an AI-powered general-knowledge chatbot.

It is still unclear to what degree people expect, want, or trust these types of experiences to get encyclopedic information; and whether it is possible for a chat experience to reflect Wikipedia’s values of neutrality and information integrity. We believe experimenting to learn about this via an off-wiki AI application will be less risky than starting with an experimental application on-wiki, but still deliver valuable insights about consumer experience and behavior.

2. Do people want to interact with Wikipedia knowledge in a fun/gamified way?

Games and quizzes have become popular ways for information/media platforms like The New York Times to attract new audiences. Through a partnership with Kahoot! (a quiz platform), Wikipedia has had a Kahoot! channel since November 2023. In the past year, 1.9M players have played a Wikipedia quiz there, indicating that there is appetite for off-platform gamified knowledge experiences. Testing new concepts for gamified knowledge experiences off-platform could yield insights to guide investment in games on or off the platform. The WMF Android App team has prototyped a Wikipedia game experience, but more exploration of how we might deliver gamified experiences (e.g., quizzes on knowledge topics specified by users) would be useful for developing games on- or off-wiki.

3. Can a Wikipedia experience on Discord increase engagement with Wikipedia among younger audiences?

Discord is home to many highly active Wikipedians (the community-run Wikipedia Discord server has over 7,000 members), but it is also home to many other niche communities that are popular among young people, such as gaming and anime/K-Pop fandoms. We are interested in seeing whether a Wikipedia bot could spread organically among these communities, and whether this might encourage more active engagement with, and contribution to, Wikipedia from younger generations.

4. How will we ensure that the bot doesn't deliver incorrect, misleading, or harmful information?

We are planning several rounds of internal evaluation before releasing the bot publicly on Discord. As with our previous work on the Wikipedia plugin for ChatGPT, this will include testing for specific known jailbreak use-cases and failure conditions to minimize incorrect, misleading, or harmful output. We are aiming for a 75-90% acceptable answer rate, with the following output considered acceptable:

Acceptable:
  • "Can't answer this" response (when there is insufficient content on Wikipedia to answer the question, or when the question is not relevant to Wikipedia – e.g., a matter of opinion)
  • Relevant and correct information (provides nuance, clearly states when there is a lack of information or ambiguity in the information on Wikipedia)
  • Irrelevant but correct information (provides information that is not relevant to the user's query – e.g., summarizes from a different but similarly-named article – but does so faithfully and transparently, linking to source article)
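
As a rough illustration of how the acceptable answer rate could be computed from a set of evaluated responses, consider the sketch below. The label strings and function names are hypothetical, as is the assumption that each response receives exactly one label. Any label outside the three acceptable categories listed above (for example an incorrect or harmful answer) counts as unacceptable.

```python
from collections import Counter
from typing import Iterable

# Hypothetical label names for the three acceptable categories listed above.
ACCEPTABLE_LABELS = {"cannot_answer", "relevant_correct", "irrelevant_correct"}

# Lower bound of the 75-90% target stated in the text.
TARGET_MIN_RATE = 0.75


def acceptable_rate(labels: Iterable[str]) -> float:
    """Return the fraction of evaluated responses with an acceptable label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no evaluated responses")
    acceptable = sum(n for label, n in counts.items() if label in ACCEPTABLE_LABELS)
    return acceptable / total


def meets_target(labels: Iterable[str], minimum: float = TARGET_MIN_RATE) -> bool:
    """Check whether an evaluation round reaches the minimum acceptable rate."""
    return acceptable_rate(labels) >= minimum
```

For example, a round with three `relevant_correct` responses and one response labeled `incorrect` gives a rate of 3/4 = 0.75, which just meets the lower bound.
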

5. What data will we be collecting?

We will be relying on Discord archives to log queries and answers for quality assessment purposes. Users will be able to mark bad responses, which will help us understand perceived quality.

How to stay updated on insights from this experiment

As usual, we will be sharing updates on this and other Future Audiences experiments during our monthly open community calls. Please sign up here if you’d like to be notified of upcoming calls.

If you have any further questions or input, please get in touch on the talk page!