Shortcut: WD:PP/L

Wikidata:Property proposal/Lexemes

This page is for the proposal of new properties.

Before proposing a property

  1. Search if the property already exists.
  2. Search if the property has already been proposed.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
  6. Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.

Creating the property

  1. Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the creation of the proposal, by a property creator or an administrator.
  3. See property creation policy.

Wikibase lexeme

   Under discussion
Description: relationship between similar Javanese lexemes across the language's registers (social variants), mainly the ngoko (Q12500634) register (plain Javanese), the krama (Q12492493) register (high/polite Javanese), and the madya (Q13091955) register (middle Javanese)
Data type: Lexeme
Domain: lexeme senses, in particular forms with spelling alternatives
Example 1: kowé/kowe/ꦏꦺꦴꦮꦺ (L2328) in the "ngoko" register and sampéyan/sampeyan/ꦱꦩ꧀ꦥꦺꦪꦤ꧀ (L1322036) in the "krama" register both mean "you" but belong to different social registers: the former is considered casual, the latter more formal and polite. For reference, see the online Javanese dictionary at https://www.sastra.org/leksikon (tick the "kata utuh" checkbox when searching to exclude partial matches). For more information on ngoko/krama, see the introduction to this Javanese-English dictionary: https://www.sastra.org/bahasa-dan-budaya/kamus-dan-leksikon/1703-javanese-english-dictionary-horne-1974-1968, especially sections 4.1, "Organization of the Entries", and 5, "SOCIAL STYLES". See also: en.wp, https://jv.wiktionary.org/wiki/Wikisastra:Tabel_krama-ngoko (jv.wikt)
Example 2 (update 18 August): gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
Example 3 (update 18 August): endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)

Motivation

I'm planning to add more Javanese lexemes, but many words exist in several registers, and using synonym (P5973) for them is not correct: although they have the same meaning, they have different usage, and there are also many synonyms within the same register (for example, "you" has 4 or more synonyms in "ngoko" and 3 or more different words in "krama"). A dedicated property would make it possible to search and query the relationships between registers. As you can see from the links provided above, the relationship between these registers is not one-to-one. While the "ngoko" form is considered the default, not all "ngoko" words have a "krama" equivalent (only about 1000 without affixation, many more with affixation), and even fewer have a "madya" or other-register ("krama inggil", etc.) equivalent. Some "krama" words are equivalent to several "ngoko" words, because they are not true synonyms but rather substitution words for different social contexts. Therefore this property should support multiple relationships. For example:

"you"
  • ngoko: kowe, (synonym: ko'ên, kohên, kowên)
  • madya: samang, andika, (synonym: dika)
  • krama: sampeyan, (synonym: bênampeyan, bênangpeyan)
  • krama inggil: panjênêngan, (synonym: nandalêm, paduka)
"to say, to tell"
  • ngoko: kandha, (synonym: clathu, ngomong, kêcap, wara, gotèk, cluluk, wuwus, etc.)
  • krama: criyos, sanjang, (synonym: sajang, wicantên, etc.)
  • krama andhap: matur
  • krama inggil: andika, ngêndika, (synonym: unandika)
  • kawi: angling
(things related to hand / "tangan")
  • ngoko: tangan, krama inggil: asta (a simple noun pair), but the verbs get complicated:
  • krama inggil: ngasta (ng- + asta) serves as a substitution for ngoko: 1 nyambut gawe (to work), 2 nggawa (to bring, take, carry), 3 nandang (to do), 4 nyekel (to hold, grasp, handle), 5 mulang (to teach)

Bennylin (talk) 18:23, 9 August 2024 (UTC)[reply]

Update 18 August

Just to make it clearer, on behalf of Javanese speakers, we would like to request 5 new properties:

The first and foremost reason is that most Javanese dictionaries (monolingual, bilingual jv-id, jv-en, jv-nl) separate Javanese lexemes into mainly these 5 registers and link them to their counterparts seamlessly. Secondly, the currently available property (synonym (P5973)) doesn't fit our need for specific linking from one lexeme to another. Besides, synonymy in Javanese is called dasanama (lit. ten names), which is distinct from register (Jv: unggah-ungguh). In the future, I believe these 5 new properties would make it much easier to "transform" words, phrases, and sentences from one register to another (e.g. via WikiFunctions or other tools).

I've given in the form above two new examples:

  • mountain: gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
    • L680638-S1, instead of having property "synonym: L45622-S1", should instead have property "krama variations: L45622-S1"
    • Likewise L45622-S1, instead of having property "synonym: L680638-S1", should instead have property "ngoko variations: L680638-S1"
    • both lexemes could have the following synonyms: ancala, indra, endra, ancala, ardi; ardya, arga, asalingga, awukir, aldaka, hyang parwata, imandri, himawan, himawat, nala, cala, dri, tambana, wanawasa, wukir, wukira, parsa of parswa, parasu, parswa = paraswa, praswa, parwaka, par(of pwar)wata, prawata, parja, pradesa, pra(of prê)bata, par(of pêr)bata, par(of pêr)bwata, par(of pêr)byata, padaka, jambangan, mahahimawan, mahendra, mèru, malaya, gana, gunungan, giri, gori, girindra, girinata, gorata, giriwara, gêgêr, basulingga, byata, ngasrama. These all mean "mountain" in Javanese.
  • head: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)
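
As a rough illustration of the linking described in the "mountain" example, the following Python sketch stores the register link on both senses, so that each side points to its counterpart under a register-specific label. The labels "ngoko variations" and "krama variations" are the working names used in this proposal, not existing properties, and `REGISTER_OF` and `link_counterparts` are hypothetical names introduced only for this sketch.

```python
from collections import defaultdict

# Register of each sense, taken from the "mountain" example in this proposal.
REGISTER_OF = {
    "L680638-S1": "ngoko",  # gunung
    "L45622-S1": "krama",   # redi
}


def link_counterparts(statements, sense_a, sense_b):
    """Store the register link in both directions.

    Each sense gets a statement labelled with the register of the sense it
    points to, e.g. the ngoko sense gets "krama variations: <krama sense>".
    """
    statements[sense_a].add((f"{REGISTER_OF[sense_b]} variations", sense_b))
    statements[sense_b].add((f"{REGISTER_OF[sense_a]} variations", sense_a))


# A set per sense, so one sense can hold several counterparts.
statements = defaultdict(set)
link_counterparts(statements, "L680638-S1", "L45622-S1")

for sense, claims in sorted(statements.items()):
    for prop, target in sorted(claims):
        print(f"{sense}  {prop}: {target}")
```

Keeping a set of statements per sense leaves room for the one-to-two and two-to-one cases, while ordinary synonyms would stay on synonym (P5973).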

Discussion

They're incorrect
So, you see, many synonyms of endhas/sirah/mastaka (head) belong to the ngoko register, the krama register, or both, but none of them is paired as _the_ register variant of a member of the triplet endhas/sirah/mastaka. Therefore we need dedicated properties to store these values. Most relations are one-to-one, a few are one-to-two or two-to-one, but none are one-to-many. Bennylin (talk) 11:04, 23 August 2024 (UTC)[reply]
@Mahir256, would you like to give your opinion? Regards, ZI Jony (Talk) 18:31, 16 September 2024 (UTC)[reply]

‎Dwelly entry ID

   Under discussion
Description: identifier for an entry in the Scottish Gaelic dictionary compiled by Edward Dwelly, as hosted on faclair.com and dwelly.info
Data type: External identifier
Domain: lexeme
Allowed values: [0-9A-F]{32}
Example 1: uisge (L8297) → A35FE50DA8851697BBE614BF50FBFEC5
Example 2: feusag (L308158) → 3A3797D93208BE668A3B8F286B3A6AA0
Example 3: bainne (L1080895) → 9F0005B2A2D8BF5E1921F1BE6C48DCD1
Example 4: leabaidh (L312378) → 4BA49BF33B14A11715179AA328302669
Example 5: barail (L312372) → E89628169E2C07EB5641F1A89D3C4B61
Example 6: aon (L727347) → 7291043974FB286C78C07317D3AA1C74
Source: external reference URL
Planned use: Add Dwelly entry identifiers to Scottish Gaelic lexemes
Number of IDs in source: 77769
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://www.faclair.com/ViewDictionaryEntry.aspx?ID=$1
See also: Am Faclair Beag ID (P12315)
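
For illustration, here is a minimal Python sketch of how the allowed-values pattern and the formatter URL listed above fit together: an identifier is checked against `[0-9A-F]{32}` and then substituted for `$1`. The function name `dwelly_url` is hypothetical; the pattern, the formatter URL, and the sample identifier come from this proposal.

```python
import re

# Pattern and formatter URL as given in the proposal.
DWELLY_ID = re.compile(r"[0-9A-F]{32}")
FORMATTER_URL = "https://www.faclair.com/ViewDictionaryEntry.aspx?ID=$1"


def dwelly_url(entry_id: str) -> str:
    """Return the entry URL, or raise ValueError if the ID is malformed."""
    # fullmatch requires the whole string to match, not just a prefix.
    if not DWELLY_ID.fullmatch(entry_id):
        raise ValueError(f"not a valid Dwelly entry ID: {entry_id!r}")
    return FORMATTER_URL.replace("$1", entry_id)


# Example 1 from the proposal: uisge (L8297).
print(dwelly_url("A35FE50DA8851697BBE614BF50FBFEC5"))
```

Note that the pattern as proposed accepts only uppercase hexadecimal digits, so a lowercase copy of an identifier would be rejected by this check.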

Motivation

This is one of two dictionaries hosted on faclair.com/dwelly.info, the other being the source used with ‎Am Faclair Beag ID (P12315). (Note the differing formatter URL from P12315, which distinguishes an entry in one dictionary from the other dictionary.) Mahir256 (talk) 18:35, 30 September 2024 (UTC)[reply]

Discussion

‎Indo-Tibetan Lexical Resource ID

   Under discussion
Description: identifier for a Sanskrit lexeme in the Indo-Tibetan Lexical Resource (ITLR)
Represents: Indo-Tibetan Lexical Resource (Q129502277)
Data type: External identifier
Example 1: झर (L1137922) → 498768
Example 2: अज (L1368075) → 38131
Example 3: यकृत् (L1368084) → 34156
Formatter URL: https://www.itlr.net/hwid:$1

Motivation

Indo-Tibetan Lexical Resource (Q129502277) is a small termbase consisting of Indic vocabulary relevant to Tibetan Buddhist texts which would be useful to link to Sanskrit lexemes. -عُثمان (talk) 23:51, 30 September 2024 (UTC)[reply]

Discussion

‎A digital concordance of the R̥gveda ID

   Under discussion
Description: entry for a Sanskrit lexeme in Lubotsky’s concordance of the R̥gveda
Represents: A digital concordance of the R̥gveda (Q127123052)
Data type: External identifier
Example 1: अपि (L747428) → 1413
Example 2: अन्ध (L929039) → 1229
Example 3: रेणु (L1132860) → 22137
Formatter URL: https://dictionaries.brillonline.com/search#dictionary=rvconcordance&id=rvc-$1

Motivation

This property is proposed for linking to Sanskrit lexemes attested in the R̥gveda. -عُثمان (talk) 00:03, 1 October 2024 (UTC)[reply]

Discussion

Wikibase form

Wikibase sense

Other

has kanji reading

Description: phonetic reading or pronunciation of the kanji
Data type: String
Domain: instances of sinogram (Q17300291)
Example 1: (Q3594955) → よん
  Qualifiers:
    subject lexeme (P6254): /よん (L625228)
    sinogram reading pattern (P5244): kun'yomi (Q1147749)
Example 2: (Q3594955)
  Qualifiers:
    subject lexeme (P6254): / (L641752)
    sinogram reading pattern (P5244): on'yomi (Q718498)
Example 3: (Q3594998) → うみ
  Qualifiers:
    subject lexeme (P6254): /うみ (L5120)
    sinogram reading pattern (P5244): kun'yomi (Q1147749)
Example 4: (Q3594998) → カイ
  Qualifiers:
    subject lexeme (P6254): no value
    sinogram reading pattern (P5244): on'yomi (Q718498)
See also: sinogram reading pattern (P5244)

Motivation

In Japanese, Chinese characters can be read with different vocalisations. With lexemes we currently only cover those sounds that make up actual words. See the examples /よん (L625228) and / (L641752), where forms that use the kanji have a sinogram reading pattern (P5244) statement.

Sometimes, however, readings don't make up real words but are merely affixes that can be used in compounds. We currently clutter these readings under a lexeme that happens to have the same kanji representation. But those usually have a different etymology and external IDs that don't apply to the reading. These readings also sometimes don't share the same senses.

I want to split all these lexemes so that every lexeme represents only a single reading. Readings that do not constitute words would be deleted in the process, but I'd strive to preserve them, and I think the sinogram entity is the right place for that. –Shisma (talk) 10:00, 27 August 2024 (UTC)[reply]

I'm merely interested in Japanese, but am not a speaker of it. If I said something horribly wrong here, please correct me. –Shisma (talk) 10:18, 27 August 2024 (UTC)[reply]

Should we transliterate on'yomi readings to katakana? – Shisma (talk) 11:45, 27 August 2024 (UTC)[reply]

Indeed, in kanji dictionaries published in Japan, on'yomi (Q718498) readings are usually written in katakana (Q82946). --Okkn (talk) 01:37, 28 August 2024 (UTC)[reply]
updated –Shisma (talk) 14:14, 28 August 2024 (UTC)[reply]
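
For anyone preparing on'yomi values in bulk, a minimal Python sketch of the hiragana-to-katakana transliteration discussed above could look like this. It relies on the fact that the Unicode hiragana letters U+3041 to U+3096 sit exactly 0x60 code points below their katakana counterparts; the function name `hiragana_to_katakana` is hypothetical, and the sketch is not part of any existing Wikidata tooling.

```python
def hiragana_to_katakana(text: str) -> str:
    """Shift hiragana letters to katakana; leave other characters as they are."""
    converted = []
    for ch in text:
        if "\u3041" <= ch <= "\u3096":
            # Hiragana and katakana blocks are laid out in parallel,
            # offset by 0x60 code points.
            converted.append(chr(ord(ch) + 0x60))
        else:
            converted.append(ch)
    return "".join(converted)


# An on'yomi reading typed in hiragana, converted for storage in katakana.
print(hiragana_to_katakana("かい"))
```

Text that is already in katakana passes through unchanged, so the function can be applied to a mixed list of readings.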

Discussion

@Duesentrieb, Afaz, Was a bee, Deryck Chan, NMaia, Okkn: pinging everybody involved with the proposal of sinogram reading pattern (P5244) –Shisma (talk) 10:16, 27 August 2024 (UTC)[reply]

Bravefoot
Okkn (talk)
Camillu87
User:Araisyohei
Tris T7 TT me
higa4
Mcampany
Mzaki
NMaia
Siramatu
Mochimap
Spinster
Haansn08
Shisma(UTC)
Syunsyunminmin
Yirba

Notified participants of WikiProject Japan

Name in Ukrainian (uk) – (Please translate this into English.)

   Under discussion
Description: difficulty of word by the level of JLPT
Represents: Japanese-Language Proficiency Test (Q1071147)
Data type: Lexeme
Domain: Japanese lexemes
Allowed values: N1, N2, N3, N4, N5
Example 1: JLPT level → N3
Example 2: JLPT level → N3
Example 3: JLPT level → N1
Source: https://en.wiktionary.org/wiki/Appendix:JLPT
Expected completeness: eventually complete (Q21873974)
Single-value constraint: yes
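
To make the two constraints above concrete, here is a small Python sketch that checks a list of values against the allowed set (N1 to N5) and the single-value constraint. The names `ALLOWED_LEVELS` and `check_jlpt_values` are hypothetical, and the sketch says nothing about which data type the property should finally use.

```python
# Allowed values from the proposal.
ALLOWED_LEVELS = {"N1", "N2", "N3", "N4", "N5"}


def check_jlpt_values(values):
    """Return a list of constraint problems for one lexeme's JLPT values."""
    problems = []
    # Single-value constraint: at most one level per lexeme.
    if len(values) > 1:
        problems.append("single-value constraint violated")
    # Allowed-values constraint: every value must be one of N1..N5.
    for value in values:
        if value not in ALLOWED_LEVELS:
            problems.append(f"value not allowed: {value}")
    return problems


print(check_jlpt_values(["N3"]))
print(check_jlpt_values(["N3", "N6"]))
```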

Motivation

JLPT is the standard test of Japanese knowledge for non-native speakers. Many resources for learning Japanese indicate which level certain material belongs to (N5 is the lowest, N1 the highest), and learners orient themselves by this data. It seems significant enough to be included in the Wikilexemes schema. English Wiktionary already has an Appendix where you can find Japanese words by their JLPT level. Bicolino34 (talk) 19:13, 29 September 2024 (UTC)[reply]

Discussion
