Talk:Community Wishlist/Wishes/Improved autosuggest in default search

From Meta, a Wikimedia project coordination wiki
This page is for discussions related to the Community Wishlist/Wishes/Improved autosuggest in default search page.

Fuzzy matching

@Slowking Man This is a fuzzy matching problem. The options for fuzzy matching are all described in mediawikiwiki:Help:CirrusSearch#Words,_phrases,_and_modifiers. If you add a tilde, like this: https://en.wikipedia.org/w/index.php?search=Australlite~&title=Special:Search&profile=advanced&fulltext=1&ns0=1 you immediately get the result you were looking for. However, it also drastically increases false matches when the query has multiple words, because it also affects the allowed proximity between those words. I'm not sure if anyone has ever looked into making fuzzy matching the default, but I suspect so.

Maybe an interesting idea would be, for searches with 0 results, to give the user the option to repeat the search with fuzzy matching turned up? --TheDJ (talk • contribs) 09:46, 21 August 2024 (UTC)
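A rough sketch of the zero-results idea above, for illustration only: when a search returns nothing, build a "retry with fuzzy matching" link by appending a tilde to the word. The URL parameters are copied from the example link in this thread; the function name `fuzzy_retry_link` and the `result_count` argument are hypothetical and not part of MediaWiki or CirrusSearch. It handles a single word only, since the tilde behaves differently with multi-word queries.

```python
from urllib.parse import urlencode

# Base URL and parameters taken from the example link in this thread.
BASE_URL = "https://en.wikipedia.org/w/index.php"


def fuzzy_retry_link(word, result_count):
    """Return a fuzzy-search URL to offer the user, or None.

    Hypothetical helper: a link is only produced when the original
    search found nothing (result_count == 0).
    """
    if result_count != 0:
        return None
    params = {
        "search": word + "~",  # trailing tilde requests fuzzy matching
        "title": "Special:Search",
        "profile": "advanced",
        "fulltext": "1",
        "ns0": "1",
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(fuzzy_retry_link("Australlite", 0))
```

The link is only offered, not followed automatically, so the reader decides whether the looser matching is worth the extra false positives.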

I think a better title for this wish would be 'Search should be better at handling typos'. That is easier to understand for most people and gives a good summary of the intent. --TheDJ (talk • contribs) 09:49, 21 August 2024 (UTC)
Yeah, thanks. My immediate writing often benefits from some later editing. --Slowking Man (talk) 22:31, 23 August 2024 (UTC)
My focus here is on the readers: the vast public that Wikimedia exists to serve. To people who are not computer nerd types (yours truly being definitely among said nerd types) the term "fuzzy matching" is gibberish and they have no idea what CirrusSearch or "search modifiers" are or that you can put funny-looking symbols into the text box to make it do things. They want "the computer to just work". They don't want to think about it. The job of people who wish to serve them is to "make it just work". (Could you fix an automobile or aircraft on your own? "The flaperons are acting weird, sometimes they stick and don't move at all, can you check that out?" Analogy here, aircraft controls : MW search function, or more broadly MW altogether)
Various desktop and server Linux distributions (one being Fedora Linux) ship with "helpers" that identify mistyped command names in the command-line interface (e.g. "remdir" instead of "rmdir"). I think it should not be too difficult to address the simplest problem case: exactly one page exists whose title is at an edit distance of 1 from the search query string. That is a good starting point, and any work to tackle more complicated cases can build on it.
I think a good criterion for "public-facing" stuff like the search bar that's on every page is: how useful is X to the Average Jane reader, and how can X be improved in that regard? The people who come immediately to my mind are the "not computer people" in my family. Good usability testing subjects. If requested, I'd be glad to corral some of them into trying various features and reporting their feedback. (Note: I don't use ping unless requested. I don't want to annoy anyone. Echo has a "subscription" feature for discussions, so I don't want to presume others' desires.) --Slowking Man (talk) 22:31, 23 August 2024 (UTC)
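To make the "simplest problem case" above concrete, here is a minimal sketch that checks whether exactly one title in a list is within an edit distance of 1 from the query (one inserted, deleted, or substituted character). It is not how CirrusSearch or its autocomplete works; the function names and the in-memory list of titles are assumptions made purely for illustration, and the case-insensitive comparison is a choice made here, not something taken from the proposal.

```python
def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion,
    or substitution (Levenshtein distance <= 1)."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution: everything after the mismatch must agree.
        return a[i + 1:] == b[i + 1:]
    # One extra character in b: drop it and compare the rest.
    return a[i:] == b[i + 1:]


def suggest_title(query, titles):
    """Return the single title within one edit of the query, else None.

    Hypothetical helper: `titles` stands in for whatever list of page
    titles is available; comparison ignores case (an assumption).
    """
    q = query.casefold()
    matches = [t for t in titles if within_one_edit(q, t.casefold())]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(suggest_title("Australlite", ["Australite", "Austria"]))
    print(within_one_edit("remdir", "rmdir"))
```

Scanning every title like this would not scale to a full wiki; the point is only to show what "edit distance of 1 with a single candidate" means in practice.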
Agree. I also want to express support for this proposal; I think it's very important. Moreover, I think fuzzy matching and potential typo detection should also be used when there are many results, e.g. by displaying one or a few fuzzy-matched results in between the regular results. Search should also suggest corrections if it finds articles and/or categories with a different but similar title. However, it seems something like that already exists to some extent, with "Did you mean:" shown below the search bar for such queries. So the proposal should probably be clearer about what should change and how, maybe including a list of examples. Prototyperspective (talk) 13:07, 3 November 2024 (UTC)
For the "Australlite" example, the correct result is the second one in the autocomplete, which seems reasonable. I'm sure there are other cases where it doesn't work as well; it would be great to know of more examples of where this breaks down. Due to the way autocompletion is implemented, using edit distance isn't really feasible, at least not without changing the technical stack behind it. We might be able to make other kinds of improvements. GLederrey (WMF) (talk) 16:23, 27 November 2024 (UTC)