Wikidata:Property proposal/said to be the same as (lexeme)
said to be the same as lexeme
Originally proposed at Wikidata:Property proposal/Lexemes
Description | some source considers this lexeme to be the same lexeme or a spelling variant of a different lexeme |
---|---|
Data type | Lexeme |
Domain | lexeme |
Example 1 | Zerealie (L612704) → Cerealie (L612705) |
Example 2 | Cerealie (L612705) → Zerealie (L612704) |
Example 3 | curb (L16605) → kerb (L721633) |
Example 4 | Handlung (L450459) → Handlung (L744620) |
Motivation
We usually model spelling variants as separate forms of the same lexeme: Geografie/Geographie. I just stumbled upon this: some dictionaries might consider a pair of lexemes as spelling variants while others don't. In these cases I think we should depict the lowest common denominator. So Duden considers Duden:Zerealie and Duden:Cerealie as two distinct lexemes that share the same meaning, while DWDS considers one to be a spelling variant of the other: DWDS: Zerealie. I propose to go with the source that has the finer modelling and connect the two lexemes with this new property, which is roughly equivalent to said to be the same as (P460) and should always have a statement supported by (P3680) qualifier and a source. The property should also be symmetric.
If you have a better idea for the label, or if you have other examples, please leave a comment – Loominade (talk) 09:13, 11 October 2022 (UTC)
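The symmetry requirement can be illustrated with a minimal sketch. It is not part of the proposal: the `Statement` type, the `supported_by` field (standing in for the statement supported by (P3680) qualifier) and the `missing_inverses` helper are hypothetical names, and the source label on the curb/kerb statement is a placeholder. The lexeme IDs are the ones from the examples above.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One 'said to be the same as lexeme' statement (hypothetical model)."""
    subject: str       # lexeme ID the statement is placed on
    value: str         # lexeme ID the statement points to
    supported_by: str  # stand-in for the P3680 qualifier value


def missing_inverses(statements):
    """Return the (subject, value) pairs that would have to be added
    so that every statement has a matching reverse statement."""
    pairs = {(s.subject, s.value) for s in statements}
    return sorted((v, s) for (s, v) in pairs if (v, s) not in pairs)


statements = [
    Statement("L612704", "L612705", "DWDS"),  # Zerealie -> Cerealie
    Statement("L612705", "L612704", "DWDS"),  # Cerealie -> Zerealie
    Statement("L16605", "L721633", "placeholder source"),  # curb -> kerb
]

# Only the curb/kerb pair lacks its reverse statement.
print(missing_inverses(statements))
```

In this sketch the Zerealie/Cerealie pair is already symmetric (Examples 1 and 2), while curb → kerb (Example 3) is reported as needing the reverse statement kerb → curb.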
Discussion
- Comment This would not only be useful for pure spelling variants (like Geogra(f|ph)ie), but also for other variants within the same standard. For Norwegian Bokmål I have been modelling this by adding synonym (P5973) for each sense (partly the reason why Bokmål has the highest synonym count by far), but that gets cumbersome when there are many senses. One extreme example for Norwegian Bokmål is hjemmelaget (L588053)/heimelaga (L588054)/heimelagd (L588055)/heimelaget (L588056)/hjemmelaga (L588057)/hjemmelagd (L588058); they all mean exactly the same thing (and share an identifier in the official Bokmål dictionary), but have different lemmas and inflection paradigms, so adding them all as different forms on the same lexeme would not be a good solution in my opinion. So a property like this would be a welcome addition in my opinion; but it wouldn't solve the problem with synonyms completely, because what if there was another word with a completely different derivation that meant the same thing? I'd still have to add all of them as synonyms to the other word… 🤔 Jon Harald Søby (talk) 12:50, 14 October 2022 (UTC)
- I think your example would not be a spelling variant. But I'm not sure, as I am unfamiliar with Bokmål - Loominade (talk) 08:33, 18 October 2022 (UTC)
- I changed the proposal to not exclusively cover spelling variants but also lexemes that some sources consider to be identical lexemes in general, independently from whether they are spelled the same -- Loominade (talk) 11:08, 9 December 2022 (UTC)
- I could support a "spelling variant" property (as previously proposed) but I'm opposed to a generic property like this, because it's unclear what a given statement is supposed to represent. It could be spelling variants, it could be synonyms, it could be related words (e.g. different parts of speech) under the same headword in a dictionary, it could be unrelated words (e.g. different etymologies) under the same headword in a dictionary, or something else entirely... and if we don't know what it means, how can we use the data for anything? (I know we have said to be the same as (P460) but it's a problematic property and we shouldn't copy it).
How would we decide that a source considers two lexemes to be identical? Listing them under the same identifier doesn't mean the source considers them to be identical, only that the source has decided, for whatever reason, to cover both under the same entry, and we can already describe that for identifiers using the qualifier identifier shared with lexeme (P9531). "Spelling variant", on the other hand, is something that is often quite clearly stated by a source, whether it has separate entries for them or not.
Also: I don't think "said to be" in the label is necessary; we have references, statement supported by (P3680) and statement disputed by (P1310) to describe who says what. - Nikki (talk) 14:48, 21 December 2022 (UTC)
- Support ArthurPSmith (talk) 17:51, 17 October 2022 (UTC)
- Support UWashPrincipalCataloger (talk) 19:12, 20 November 2022 (UTC)
- Both of these support votes were made when the proposal was "spelling variant", not the current "said to be the same as lexeme" proposal. @ArthurPSmith, UWashPrincipalCataloger: you should check whether you still agree with the proposal now that it has changed. - Nikki (talk) 14:48, 21 December 2022 (UTC)
- The new label seems fine to me. ArthurPSmith (talk) 16:42, 21 December 2022 (UTC)
- Yes, I still support. UWashPrincipalCataloger (talk) 23:47, 21 December 2022 (UTC)
- Comment A third example was missing, so I added what I think is a correct English one. See https://www.merriam-webster.com/dictionary/curb and https://www.dictionary.com/browse/curb. UWashPrincipalCataloger (talk) 22:50, 26 October 2022 (UTC)
- Weak oppose I don’t see why we wouldn’t use/repurpose said to be the same as (P460) (presumably with relaxed allowed-entity-types constraint (Q52004125)). ―BlaueBlüte (talk) 06:18, 1 February 2023 (UTC)
- @BlaueBlüte: Because that property can only reference Q-items, not lexemes. It is the same reason we have identifier shared with (P4070) and identifier shared with lexeme (P9531) -- Loominade (talk) 08:29, 1 February 2023 (UTC)
- Oh, I see, Wikidata lacks a union data type of item and lexeme. We should probably consider establishing a subproperty of (P1647) relationship then. Switching to Support. ―BlaueBlüte (talk) 08:56, 1 February 2023 (UTC)
Done said to be the same as lexeme (P11577) Midleading (talk) 04:57, 11 February 2023 (UTC)