Wikidata:Property proposal/refers to


refers to

Originally proposed at Wikidata:Property proposal/Generic

   Withdrawn

Motivation

There are currently 38,710 instances of term (Q1969448). Many terms refer to something specific. It would be nice to have a property to express such relationships.

Perhaps this new property should also apply to lexemes? I am not very familiar with lexemes. I guess you could also model this as two statements: LX → item for this sense (P5137) → QX, plus QX → refers to → QY.
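The two-statement model suggested above (a lexeme sense pointing at a term item QX, and QX pointing at the thing QY it denotes) can be sketched as a toy triple list. This is only an illustration under stated assumptions: LX, QX and QY are the placeholders from the comment, the sense ID "LX-S1" and both helper functions are hypothetical, and "refers to" is keyed by its label because the proposed property never received an ID. Nothing here calls the Wikidata API.

```python
# Toy in-memory statements; LX, QX and QY are placeholders, not real entity IDs.
P5137 = "item for this sense (P5137)"
REFERS_TO = "refers to (proposed, no property ID)"

statements = [
    ("LX-S1", P5137, "QX"),   # a sense of lexeme LX points at the term item QX
    ("QX", REFERS_TO, "QY"),  # the term item points at what the term denotes
]


def objects(subject, prop):
    """Return the objects of all statements with this subject and property."""
    return [o for s, p, o in statements if s == subject and p == prop]


def referents_of_sense(sense_id):
    """Follow sense -> term item -> referent and collect the final items."""
    result = []
    for term_item in objects(sense_id, P5137):
        result.extend(objects(term_item, REFERS_TO))
    return result


print(referents_of_sense("LX-S1"))
```

Reaching QY takes two hops in this model, whereas a sense linked directly to QY through item for this sense (P5137) would need only one.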

--Push-f (talk) 07:58, 5 November 2022 (UTC)

Discussion

  •  Oppose. The more than 30,000 instances of term (Q1969448) and many similar statements are probably a leftover from the time before the lexeme database was established, and are in many cases specific to the English language.

An example: between 2016 and 2021, the item curriculum (Q207137) was stated to be an instance of (P31) both Latin phrase (Q3062294) and sequence (Q20937557) (of learning), which confuses two different subjects. The English word "curriculum" may be a Latin phrase, but the concept of a curriculum (an educational plan) is not. Wikidata serves many languages besides English, and while the Swedish word "läroplan" (which you will find among the many labels listed) does mean curriculum, "läroplan" is most definitely not a Latin phrase.

That confusing claim was deleted last year, but many similar cases remain, and they are harder to detect when labels are missing for many languages. As an alias of map–territory relation (Q1963130) puts it, "the word is not the thing". There is a relation between a word (in any language) and the concept or "thing" it represents, but it is not an identity relation (except in the case of word (Q8171), where the word "word" is what it represents, and in English only).

I appreciate that you bring up the lexeme database in this context, because that is where statements about specific words or phrases in a language belong. Take turnip (Q7856056) as an example; it has a label and description in English only, as it originates with an article in English Wikipedia. But the article isn't about turnips (in whatever sense of the word); it's about the word "turnip" itself, written from a linguistic rather than botanical perspective.

In many respects, the WP article is like a disambiguation page, providing references to other articles dealing with the different real-world items implied by this word. In Wikidata, disambiguation page items are treated differently from all other items because each is limited to a single language. Yet this article isn't recognized as a disambiguation page; it's a "terminology" article. How many of those are there in WP? Has anyone thought up a plan for how to deal with them in Wikidata?

In my opinion, statements referring to the written or spoken elements of specific languages don't belong in Wikibase main itemspace at all; they belong in the lexeme database. There is a lexeme turnip (L312484), which is an English noun. It currently has one sense defined, referring to the item turnip (Q3916957), which looks correct to me. But if the same word is also used for Brassica rapa (Q3384), rutabaga (Q158464), Pachyrhizus erosus (Q517283) and radish (Q7224565) (even if just occasionally or in a limited geographical area), additional senses should be defined for that lexeme, the linguistic statements could be moved there, and references made back to the corresponding items.
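As a rough sketch of the arrangement described above, the lexeme turnip (L312484) could carry one sense per meaning, each linked to an item through item for this sense (P5137), with a reverse index for the references back from items. The entity IDs come from the comment; the `Lexeme` and `Sense` classes, the `add_sense` and `senses_by_item` helpers, and the sense numbering beyond the one existing sense are assumptions made for illustration, not the WikibaseLexeme API.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    sense_id: str
    item_for_this_sense: str  # value of P5137


@dataclass
class Lexeme:
    lexeme_id: str
    lemma: str
    language: str
    senses: list = field(default_factory=list)

    def add_sense(self, item_id):
        """Append a sense linked to item_id; the numbering here is illustrative."""
        sense = Sense(f"{self.lexeme_id}-S{len(self.senses) + 1}", item_id)
        self.senses.append(sense)
        return sense


def senses_by_item(lexemes):
    """Build the reverse index: item ID -> list of sense IDs that point at it."""
    index = {}
    for lexeme in lexemes:
        for sense in lexeme.senses:
            index.setdefault(sense.item_for_this_sense, []).append(sense.sense_id)
    return index


turnip = Lexeme("L312484", "turnip", "English")
turnip.add_sense("Q3916957")  # the one sense said to exist already
# Hypothetical additional senses for the other uses named in the comment:
for item_id in ("Q3384", "Q158464", "Q517283", "Q7224565"):
    turnip.add_sense(item_id)

index = senses_by_item([turnip])
print(index["Q3384"])
```

With this layout the linguistic statements live on the lexeme, and an item page only needs the reverse lookup to list the words that can denote it.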

A Swedish word for Brassica rapa (Q3384) is rova (L32887). Just like the English word "turnip", "rova" has multiple meanings in Swedish; they just haven't been added to the lexeme as different senses yet. Those additional senses are pocket watch (Q849813) and falling (Q333495) (or fall (Q11620540)), which differ from the common turnip in that they aren't edible. But fall (Q11620540) is usually called fall (L35716) in Swedish, a word that also means legal case (Q2334719) or simply case (L3910) in English, another meaning of which is suitcase (Q200814), and the chain goes on...

As the majority of those 30,000 terms should probably move to the lexeme database, where the property item for this sense (P5137) already exists, I see no reason to duplicate that property in Wikidata just to be able to make identical statements elsewhere. We need to find a systematic plan for those "terminology" WP articles instead. --SM5POR (talk) 00:20, 6 November 2022 (UTC)

Thanks for your thorough response :) I withdraw my proposal. I do think that the current implementation of Wikidata lexemes leaves much to be desired, in particular:
  • Related lexemes are not automatically linked on data item pages.
  • Lexemes do not show up in the autocompletion search.
I think I'll try to raise these points at Wikidata talk:Lexicographical data and on the Phabricator issue tracker for WikibaseLexeme (Q28925815). --Push-f (talk) 04:59, 6 November 2022 (UTC)