Celtic Knot Conference 2020/Submissions/Search Support for Minority Languages
- Title
- Search Support for Minority Languages
- Type of submission
- short presentation + Q&A session
- Author of the submission
- Trey Jones / TJones (WMF)
- Language of presentation
- English
- E-mail address
- tjones@wikimedia.org
- Country of origin
- U.S.
- Affiliation
- Wikimedia Foundation Search Platform Team
- Personal homepage or blog
- My posts on Wikimedia Foundation blogs—new blog, old blog, older blog
- Abstract
- A brief overview of language-specific processing in search—tokenization, normalization, stemming, and stop words—using primarily English and Irish as examples, with additional examples from other languages that offer unique challenges—segmentation, transliteration, complex orthography, and a lack of software support—such as Chinese, Serbian, Khmer, and Mirandese.
- What will attendees take away from this session?
- Attendees will have a better understanding of the language-specific processing that goes into search, hopefully with an eye towards collaboration with the WMF Search Platform Team to improve search in the languages they use.
- Theme of session
- Language technology
- Slides or further information
- Slides and notes are on Commons.
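The four processing steps named in the abstract (tokenization, normalization, stemming, and stop words) can be illustrated with a minimal sketch. This is a toy example for English only, not the analysis chain used on Wikimedia wikis: the stop word list, the suffix list, and the function names `tokenize`, `normalize`, `stem`, and `analyze` are all assumptions made for illustration.

```python
import re
import unicodedata

# Toy resources, assumed for illustration only; real analyzers use
# much larger, language-specific lists and rules.
STOP_WORDS = {"the", "a", "an", "of", "in", "and"}
SUFFIXES = ("ing", "s")


def tokenize(text):
    """Split text into word tokens (works only for space-delimited scripts)."""
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", text)


def normalize(token):
    """Lowercase the token and strip combining diacritics."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def stem(token):
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Run tokenization, normalization, stop word removal, and stemming."""
    tokens = (normalize(t) for t in tokenize(text))
    return [stem(t) for t in tokens if t not in STOP_WORDS]


print(analyze("The Cafés of Dublin"))
```

Here `analyze("The Cafés of Dublin")` yields `['cafe', 'dublin']`: "The" and "of" are dropped as stop words, "Cafés" is lowercased and loses its accent, and the plural "s" is stripped. The regular-expression tokenizer is exactly the kind of step that fails for languages written without spaces, which is where the segmentation challenges mentioned in the abstract come in.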
Interested attendees
If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).
- —M@sssly✉ 13:41, 25 June 2020 (UTC)
- --Psubhashish (talk) 04:30, 3 July 2020 (UTC)
- --Maria zaos (talk) 11:01, 10 July 2020 (UTC)
- VIGNERON * discut. 12:46, 10 July 2020 (UTC)
Friendly space: Because we want to provide a great experience for all participants and foster collaboration, please keep these few guidelines in mind: let’s be respectful to each other, encourage participation and a positive atmosphere, be mindful of how our actions impact others, and feel free to ask for help at any time. Friendly Space Policy in full.