GLAM CSI/Wikimania 2024

From Meta, a Wikimedia project coordination wiki

Global GLAM Meetup

Afternoon team leaders at the Global GLAM Meetup at Wikimania 2024. (Left to right, Andrew Lih, Jamie Flood, Katarzyna Makowska, and Angie Cervellera)

At the Global GLAM Meetup on August 12, 2024, more than 30 GLAM professionals and community members gathered to discuss working with cultural and heritage content on Wikimedia projects. The afternoon session was led by Andrew Lih, Jamie Flood, Katarzyna Makowska, and Angie Cervellera at the Silesian Museum in Katowice, Poland.

The afternoon consisted of two activities that engaged participants in generating feedback on the current state of Wikimedia contribution, and brainstorming possible ways of overcoming obstacles.

Presentation of GLAM CSI results

The afternoon started with Andrew Lih giving a quick look at the preliminary results of the GLAM CSI survey that was performed in the first half of 2024. (Google Slides link)

Some key takeaways that were presented included:

  • For survey respondents, Wikimedia Commons and Wikidata were the most engaged-with Wikimedia projects, with Wikipedia being third.
  • When asked "Which tool(s) (or scripts) do you use in contributing to Wikimedia projects?", respondents named a mix of officially supported tools and community-built tools (e.g. those by Magnus Manske). The top-ranked tools included:
    • Wikidata Query Service
    • QuickStatements
    • Commons Upload Wizard
    • OpenRefine
    • Program and Events Dashboard
    • Cat-a-lot gadget

Feedback on linked open data workflow tools

We introduced the Linked Open Data workflow (Wikidata:Linked open data workflow) document to the attendees, which was new to many folks. After quickly explaining the various phases of this classic "Extract-Transform-Load" type workflow, we challenged the participants to respond to the chart with experiences and recommendations.

Participants were asked to individually provide feedback on each of the six phases of the workflow using sticky notes. People were given 20-30 minutes to walk among the posters in the room and were encouraged to read the stickies left by others. They were asked to leave notes in three areas for each column:

  1. Edits or additions to the list of tools
  2. Positive experiences
  3. Challenges

An extra poster to hold any "Other workflow ideas" was also provided.

Some rough overall observations could be made right away:

  • Ingestion (3) had the most varied and numerous responses, including many additions of specialized tools for ingestion, such as those related to iNaturalist, BHL, or SourceMD.
  • For ingestion (3), there seems to be good capacity for creating specialized or custom tools, whether scripts or custom code.
  • A number of feedback notes pertained to using tools to revise, adjust, or clean up data after an initial load of content: VisualFileChange.js, QuickStatements, or Cat-a-lot. This reflects comments that the ingestion/upload tools may not always do all that the user wants, so multiple tools are needed to add all the relevant metadata. The result is closer to an iterative ELT (extract-load-transform) process than to a traditional ETL process.
  • The most feedback was on Reporting (6), with many concerns about the reliability and capability of measuring the impact of contributions, both within the Wikimedia ecosystem and, especially, externally.
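The difference between ETL and the iterative ELT pattern noted in the observations above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the field names, and the functions `transform`, `etl`, and `elt` are hypothetical, and no Wikimedia tool or API is called.

```python
# Hypothetical records standing in for a GLAM collection export.
raw_records = [
    {"title": " Portrait of a Lady ", "creator": "unknown"},
    {"title": "Harbour at dusk", "creator": " J. Smith"},
]


def transform(record):
    """Clean one record by trimming whitespace from every field."""
    return {key: value.strip() for key, value in record.items()}


def etl(records):
    """ETL: clean each record first, then load it into the store once."""
    store = []
    for record in records:
        store.append(transform(record))
    return store


def elt(records):
    """ELT: load the records as-is, then clean them up in a later pass."""
    store = [dict(record) for record in records]  # initial bulk load
    for index, record in enumerate(store):  # follow-up clean-up pass
        store[index] = transform(record)
    return store
```

In this toy case both routes end with the same cleaned data. The difference is that in the ELT route the uncleaned records sit in the store until the follow-up pass runs, which is roughly the role that clean-up tools play after a bulk upload.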

Detailed notes

Transcribed from sticky notes in images above.

Poster columns (left to right): PREPARE, RECONCILE, INGEST, ANALYZE, RE-USE, REPORT

Edits/additions

  • video2commons
  • Maybe add GenAI for ChatGPT, like for SPARQL queries
  • Opportunity to use AI to make process easier
  • Mix'n'Match, I use this to add GLAM identifiers to items all the time
  • Graph Builder but for properties
  • Opportunity to invest in/use AI to make this process easier
  • Mix'n'match gadget - Magnus mix and match gadget which adds ALL the possible matches onto one of them
  • iNat2commons
  • iNat2wiki
  • SourceMD/Source Metadata
  • BHL2Wikidata
  • Commons mobile app for Android
  • Petscan, ACDC, Cat-a-lot good tools to update institutional upload
  • Scholia - Amazing visualizations - also of WikiProjects
  • author_strings (Magnus script) to change author name string to author while looking at the publication item
  • WLM and WLE tools
  • PetScan - Can tell you which of your images have FA Awards, featured images.
  • Please make reporting super simple so a non-Wikimedian can casually check the stats
  • Reporting impact, for Wikiprojects, not just GLAMs

Positives

  • OpenRefine provides easy experience of mass editing/normalization of data (using Python?)
  • OpenRefine is a very appreciated data tool
  • I LOVE OpenRefine - Works very well with librarians, not so well with museums, WHY?
  • Flickr2commons, Flickypedia - Flickr is a platform GLAMs know and license change is relatively simple!
  • I found this useful with small scale project
  • OPENREFINE is introduced at my organization as a wiki-tool but became a very appreciated data-tool
  • It's great for very data literate people... (a minority)
  • Petscan helped to find new articles in important themes
  • I Love OpenRefine, it's not just for wikimedia projects. Reconcile and interact with other websites, seen as pro-level
  • Positive OpenRefine - The entire NZ Thesis project runs off OpenRefine, it's amazing!
  • More identifiers gadget - empowers missing identifiers to be added to items via adding just the VIAF ID
  • Another disambiguator - amazing tool to ensure publication Wikidata items are linked to their author Wikidata items (yes!)
  • WikiShootMe - Great for motivating newbies to add images. Reveals incorrect Wikidata coordinates
  • Pattypan - Very customizable once I had experimented with it
  • Pattypan - Great because it offers a lot of guidance
  • Pattypan - Still the only bulk-upload tool that's easy to use (compared to OpenRefine)
  • OpenRefine - Also great because of excellent support through the forum
  • OpenRefine - We're starting to use it in GLAM Wiki collabs
  • OpenRefine - For uploading in general
  • Visual File Change - Saved my life by fixing an error after a bulk upload
  • Visual File Change for fixing my inevitable typos in Commons bulk upload
  • Reasonator - Impresses management! Looks much nicer than Wikidata.org
  • Cat-a-lot - Very useful and <unreadable> time
  • Wikidata Graph Tool - Useful visualization for non Wikidata people (Angryloki)
  • VisualFileChange.js - So easy to use and clean things up
  • Integraality is amazing. How else can I see how my Wikidata project contributed over time?
  • Integraality - Helped start Wikiproject Manuscripts
  • MediaWiki API - To track quality over time of target articles
  • Listeria - Great for building work lists. I use it all the time to draw together my Wikidata work and find notable people for WP
  • ISA - <unreadable> early 2024 with the help of new and developing with Wiki<unreadable> Africa
  • Template:Art Photo
  • Multilingual Metadata on Commons
  • Autopopulating Commons templates using SDC/Wikidata is great!
  • Specialized templates in general
  • Love infobox templates and all templates, really
  • Wikidata templates in en.wp infoboxes - they don't always get deleted now! :)
  • I LOVE Wikidata infobox. Especially for visibility in non-English languages
  • Image views are a very good way of selling WM engagement to a GLAM
  • Scholia Tool - This is amazing for disambiguation also for generating visualization and engagement
  • Metrics to illustrate reach and impact are essential to getting GLAM leaders to support the work
  • Every use in a wiki article is a win! We want to count them!

Challenges

  • Format conversion
  • Opportunity to use AI to make the process easier
  • Getting rights cleared
  • OpenRefine is daunting for newbies and many GLAM staff
  • OpenRefine would be more usable if web based
  • Please create a better tutorial for mix'n'match, technically as a manual instruction
  • Petscan - documentation?? If I get what I need out it's a miracle!
  • Challenges/reconcile - Petscan is great but it is very hard for me to learn to use it
  • OpenRefine - Sometimes reconciliation doesn't work and we have no idea why (because it's external service)
  • WikiShootMe - needs upgrades, like choosing "instance of" item created. Could be useful for GLAMS! We've used it with popular libraries
  • Wikidata and Commons data models are complex. It would be good to have tools with easy to use "wizards"
  • WikiShootMe +options +connect with events?
  • QuickStatements - Adding refs or qualifiers to existing items is hard and unpredictable
  • Pattypan - Persistent problems with post settings not being cleared when new batches started
  • Commons Upload Wizard - Modern version too slow and need long time to upload. <unreadable> is better
  • I wish there was an easy tool to pull images from our museum's API/collections online into Commons. Or any museum/collection
  • QuickStatements - how can we have a tool used for 50% of Wikidata edits break so often?!
  • I enrich NZThesis metadata on Wikidata by connecting authors to publications – I use Magnus's ORCIDator + SourceMD tools but they often break :( (Tamsin)
  • BHL to Wikidata - Is limited to 3-4 pubs at a time if they cite a lot of papers. But is a good alternative to SourceMD to get papers in. It's a Magnus tool so it breaks often
  • ISA Tool is proven difficult to use for new-comers because they need an account and know what SD is
  • SDC not complete in Pywikibot
  • Lots of SDC models are incomplete/missing community consensus
  • SDC is not added at upload (but as a subsequent edit)
  • Distributed game limitations
  • Obtaining a bot flag is complicated (Pywikibot)
  • Graph Commons - I don't know how it works
  • Wikipedia editors unfriendly to Wikidata
  • Translating labels (WD) and captions (Com)
  • Not easy to find things
  • Would like to know more about Wikidata Graph Builder
  • Commons - Clearer license for GLAMs, especially uploading process which is not planned for them (options are confusing)
  • Wikidata is hard to teach!
  • GLAMs are often interested in data and enrichments added to their data and files, but not easy to retrieve this.
  • How to know if a tool works or not?
  • I want to be told when one of my images is on the front page of Wikipedia
  • Measuring how integrated an uploaded data set is with rest of the platform (images or data)
  • Summarize how a data set has been enriched
  • WikiEdu dashboard - still has lots of problems, reports everything not just event related
  • There are so many statistics tools! They do slightly different things and it's impossible to remember their names. They're also hard to learn for GLAM partners (Who really need them)
  • Need for normal simple to use tool for cultural institutions to evaluate their event impact
  • Anecdotes and examples are also needed for leadership reports, to illustrate a point made in quantitative metrics
  • Linked Data Impact - needs to be defined before it can be measured
  • GLAMorous gives inconsistent results
  • Need for some tools to evaluate extra-wiki use of images on Commons
  • GLAM wiki dashboard - GLAMS have to wait very long time to be added
  • Report on impact of re-use outside Wikimedia projects
  • GLAMorgan doesn't notice when an image was added? Assumes it was always in the article?
  • WMF itself must also be aware of the impact of Wikidata and Commons outside Wikipedia
  • Impact and reliability
  • Stats for the community engagement with the new data is missing
  • Extracting/reporting on data enrichment beyond the initial platform is hard
  • How can the new Commons impact metrics dataset and APIs be integrated into existing dashboards and tools so they are more stable?
  • Improve dashboards - create subject specified dashboard

Other workflow comments

Transcription of the final poster with sticky notes.

  • Rights clearance triage helper
  • I've done two professional residencies and have never seen the LOD workflow before! so frustrating and doubles the need for clearer routes to GLAM wiki
  • Program and Events Dashboard - I would like to learn how to use it more
  • Extract enrichment to allow for roundtripping - is roundtripping beyond re-use and reporting?
  • Multilingual – Wikimedia is very good but we have the potential to be great
  • What about adding any other resources for communication to this page? e.g. telegram groups, meetups, etc?
  • I didn't know about the wikidata LOD workflow page and I don't know a lot of these tools so THANK YOU for sharing this info!!!
  • I'd love to see the Commons app used more - lots of potential as it's easy and app format familiar
  • What about adding a section for online training/resources for these tools?
  • An index of tools and purposes and platforms would be useful, including documentation
  • Some tools are very common or part of a platform; others are more hidden (I often only learned of them in real life meetings) Live meetings are the best way to get to know new tools, then good online documentation to use them is necessary
  • Wikibase cloud and specialist wikis
  • High/steep learning curve for many tools
  • A way to see types of licenses at a glance on a category page (we're actively looking for this! details even)

Ideating solutions to obstacles to contribution

Overview: This late afternoon session focused on ways to solve challenges that were identified in the GLAM CSI survey's "Barriers to Participation." Six major areas were highlighted, and explained to the attendees. Six posters were spread around the room with the prompts, from 1 to 6.

Participants were asked to choose one of the six areas, and to self-organize into groups and to ideate solutions around the question: "What obstacles do you face when contributing to Wikimedia projects with existing tooling?" The six areas from the GLAM CSI survey results that were presented:

  1. Technical and tooling issues
    • Tool reliability
    • Lack of good documentation
    • Metrics tools
    • Lack of standardization of tools and connections
  2. Data modeling and metadata challenges
    • Structured Data uncertainty
    • API usability
    • Metadata management and mapping of data sets
  3. Resource and capacity constraints
    • Time and skill needed to participate
    • Training and support
    • Lack of documentation and learning resources
  4. Community issues and cultural barriers
    • Notability and standards and community norms
    • Clash with community practices, undoing edits, nominations for deletion
    • Cultural sensitivity and ethical guidance
  5. Content integration and re-use
    • Use of content across Wikimedia
    • Workflow process improvements
    • "Round trip" integration
  6. Sustainable future
    • Support for future tool development
    • Road map unclear, hard to plan for GLAM wiki partnerships
    • Relationship of GLAM wiki communities with Wikimedia Foundation

Six posters with these prompts were set up around the room. People self-selected into one of the poster groups to work on answering the question:

How might we address and improve... (an area listed above)

The IDEO brainstorming method was employed: people wrote each idea on a sticky note, raised their hand with it, and announced it to the group. A local facilitator placed it on the paper chart. Groups were encouraged to go for quantity and to generate as many ideas as possible in 15 minutes. Afterwards, the groups were asked to cluster the raw ideas into thematic areas and to combine overlapping ideas where appropriate. Then people were asked to add a dot to up to FIVE ideas to show where interest was highest.

We then went around the room to share what each group found.

After that, each group was asked to select at least three major areas that stood out and add them to a priority matrix: high or low priority, versus hard or easy to do.

Solutions priority matrix

This is a summary of the stickies in the diagram above.

Ideas in the upper-right hand (EASY) quadrant are ideal for implementation, as they are easy, high priority tasks:

  • Examples (case studies) of impactful reuse (5)
  • Ways to encourage reuse (5)
  • GLAM Wish List (6)
  • Systematic tools assessment system by quality (status) and importance (usefulness) (1)
  • Training in tech documentation (6)
  • Understand the future of Structured Data on Commons in the movement (2)

In the middle-right area:

  • Start from project -> search tool (3)
  • Institutional "buddying up" for better understanding (4)
  • Training folks in tech documentation (including reporting bugs, submitting improvement requests, <unknown>, <unknown>, <unknown>, etc) (6)
  • International GLAM events (3, 4)
  • Recognize low hanging fruit (tools and projects) (3)
  • Support tool developers when there are major changes (1)
  • Step by step recipes for institutions with entry level tasks, 'cookbook' (3)

In the middle area:

  • Build tech capacity, not tools (6)
  • Case studies - What tool is most relevant to reach my goal (3)
  • Truly multi-lingual (5)

In the middle-left area:

  • Prioritize those tools that are easiest to use (3)
  • Impactful improvements to the platforms to make tooling easier (1)
  • Consistent branding (3)
  • Wider definition of GLAM and image flagging for use (5)
  • Charge GLAMs for tech services (6)
  • Wiki community training in local context (4)
  • Better onboarding, learning, training for partners to help them get started mapping data (2)
  • Layering/weighting data "official" vs contributed (5)

In the left (HARD) area:

  • GLAM council or consortium (6)
  • Language help: more tools for non-Roman alphabet (4)
  • An easy to handle surface for Wikidata queries (1)
  • Automated API refresh of GLAM-submitted data on Wikidata (2)
  • More technical support for Wikibase integrations (2)

Main session - Workshop on Documenting Wiki User Stories

Wikimania 2024 - GLAM CSI session with Ryan King

The main Wikimania session had roughly 20-25 attendees, and challenged audience members to write their own user stories.

Story templates were provided in Google Docs format that people could customize. Stories started in the session include:

Session: https://wikimania.eventyay.com/2024/talk/UBRGFS/