Wikidata:Property proposal/common misspellings


common misspellings


   Under discussion
Description: common misspellings of this form
Represents: misspelled word
Data type: String
Domain: form
Example 1: L3280-F1 → "fuscia" (incorrect for fuchsia (L3280))
Example 2: L3280-F1 → "fuschia" (incorrect for fuchsia (L3280))
Example 3: L36116-F1 → "abbonnemang" (incorrect for abonnemang (L36116))
Example 4: L36116-F1 → "abbonemang" (incorrect for abonnemang (L36116))
See also: Wikidata:Property proposal/correct form

Motivation

This makes it easy to create, for example, a spell checker that recommends a correction. See discussion here. --So9q (talk) 21:23, 22 March 2020 (UTC)
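As a rough illustration of the spell-checker use case, here is a minimal sketch. It assumes the misspellings have already been exported from the forms into a plain mapping; the `COMMON_MISSPELLINGS` table and the `suggest_correction` helper are hypothetical names, and the entries are just the four examples from this proposal.

```python
# Hypothetical export of the proposed property: misspelling -> correct form.
# The entries are the four examples given in this proposal.
COMMON_MISSPELLINGS = {
    "fuscia": "fuchsia",
    "fuschia": "fuchsia",
    "abbonnemang": "abonnemang",
    "abbonemang": "abonnemang",
}


def suggest_correction(word, misspellings=COMMON_MISSPELLINGS):
    """Return the correct form if `word` is a recorded misspelling, else None."""
    return misspellings.get(word.lower())


if __name__ == "__main__":
    for candidate in ("fuscia", "abbonemang", "fuchsia"):
        print(candidate, "->", suggest_correction(candidate))
```

A word that is not recorded as a misspelling simply yields None, so a checker built this way only flags forms that someone has explicitly entered.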

Discussion

Support I support this proposal in this form (more in the linked discussion), on the condition that we have an applicable definition of a common misspelling. --Lexicolover (talk) 12:17, 24 March 2020 (UTC)

See discussion here: Wikidata_talk:Lexicographical_data#Common_misspellings_data--So9q (talk) 19:55, 24 March 2020 (UTC)

Neutral we need something to solve this problem, but I'm not sure a simple property is the simplest solution here. A broader system for all sorts of variants would be more difficult but better in the long run, as correct/incorrect spelling is often not a binary situation (see "colour"/"color" in English; correctness is contextual here). Cheers, VIGNERON (talk) 20:44, 25 March 2020 (UTC)

Neutral: what is your definition of "common"? It sounds a bit arbitrary... Nomen ad hoc (talk) 07:30, 26 March 2020 (UTC).

@Nomen ad hoc: that point can easily be defined objectively by frequency. If a misspelling is over a threshold, let's say 5%, then it's "common". We can use a tool like Google Books Ngram Viewer to see the frequency. We can also rely on sources: dictionaries (especially the descriptivist ones) often give the common misspellings. Cheers, VIGNERON (talk) 08:55, 26 March 2020 (UTC)
@Nomen ad hoc: see proposed definition here: Wikidata_talk:Lexicographical_data#Common_misspellings_data--So9q (talk) 10:41, 26 March 2020 (UTC)
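The frequency threshold suggested above could be checked roughly as follows. This is only a sketch under stated assumptions: the comment does not say what the 5% is measured against, so the code assumes it is the misspelling's share of all occurrences of the word (misspelled plus correct). The function names are hypothetical, and the counts would have to come from a corpus of your choosing.

```python
def misspelling_share(misspelling_count, correct_count):
    """Share of occurrences that use the misspelling (assumed definition)."""
    total = misspelling_count + correct_count
    if total == 0:
        return 0.0  # no data: treat the share as zero
    return misspelling_count / total


def is_common(misspelling_count, correct_count, threshold=0.05):
    """True if the misspelling's share is over the threshold (5% by default)."""
    return misspelling_share(misspelling_count, correct_count) > threshold


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(is_common(misspelling_count=6, correct_count=94))
    print(is_common(misspelling_count=1, correct_count=99))
```

Keeping the threshold as a parameter makes it easy to try other cut-offs if 5% turns out to be too strict or too loose for a given language.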

Support Finn Årup Nielsen (fnielsen) (talk) 11:25, 26 March 2020 (UTC)

  • Comment I preferred the initial version of this proposal [1] or the earlier proposal (correct form) using form datatype. --- Jura 15:49, 26 March 2020 (UTC)
    • Actually, the direction of the earlier proposal seems preferable (correct form). If the form is only known as a misspelling, "grammatical feature" could include that too. If it's also something else, the "grammatical feature" would just include that "something else". --- Jura 13:28, 1 April 2020 (UTC)