Wikidata:Lexicographical data/Documentation/Languages/cmn

Mandarin
language
Subclass of: Chinese
Short name: паўночнакітайская
Located in the administrative territorial entity: Hong Kong, Macau
Replaces: Middle Chinese
Linguistic typology: subject–verb–object, isolating language
Writing system: sinograms
Ethnologue language status: 1 National
Related category: Category:Mandarin pronunciation
Stack Exchange tag: https://linguistics.stackexchange.com/tags/mandarin

The Mandarin Chinese lexicographic guideline (Q9192) is a community guideline for building a consistent dictionary of the Mandarin Chinese language using Wikidata lexemes.

This draft was initially modeled on the Japanese guideline and still needs to be adapted to the Chinese language and to Wikidata usage.

Context

Mandarin is spoken in the People's Republic of China (Q148), Taiwan (Q865), and Singapore (Q334).

It replaced Middle Chinese (Q2016252) and many regional languages.

Dialects:

Writing systems: sinograms (Q8201), pinyin (Q42222). Alternative transcriptions can be derived automatically from Hanyu Pinyin.
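
Hanyu Pinyin written with tone marks can be converted mechanically into other notations. As a minimal sketch, assuming the input is a single syllable using the standard tone diacritics, the following Python function derives the tone-number notation. The function name `to_numbered` and the use of `5` for the neutral tone are illustrative choices, not something this guideline prescribes.

```python
import unicodedata

# Combining diacritics used for the four Mandarin tones in Hanyu Pinyin.
TONE_MARKS = {
    "\u0304": "1",  # macron
    "\u0301": "2",  # acute accent
    "\u030c": "3",  # caron
    "\u0300": "4",  # grave accent
}


def to_numbered(syllable: str) -> str:
    """Convert one tone-marked pinyin syllable to tone-number notation.

    Assumption: a syllable without a tone mark is neutral and gets "5".
    """
    # NFD splits each accented vowel into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "5"
    kept = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps the diaeresis of "ü", which is not a tone mark
    # NFC recombines "u" + diaeresis back into a single "ü".
    return unicodedata.normalize("NFC", "".join(kept)) + tone
```

For instance, `to_numbered("hàn")` gives `han4` and `to_numbered("lǜ")` gives `lü4`. Conversion to other systems would need a syllable mapping table in addition to this tone handling.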

Lexical category

Part of speech

Various categorizations exist, but for the sake of collaboration we start by adopting the following taxonomy; please follow it. Future tools will allow a category to be split into finer subcategories if the community agrees.

Non-words

Lemma

Language code

Language code for lemmas in simplified and traditional Chinese characters

The following is proposed by User:Rdrg109:

Ideally, all lexemes in Standard Mandarin (Q727694) should have lemmas with the language codes zh-hans (for the written form in simplified Chinese characters) and zh-hant (for the written form in traditional characters). Such lexemes shouldn't use the language code zh as it is not clear whether it refers to the simplified or traditional form.

Some users have previously suggested that zh should be used when the zh-hans and zh-hant forms are identical. The problem with this approach is that inexperienced users who are unaware of zh-hans and zh-hant might add a zh lemma using either simplified or traditional characters. When that happens, the assumption that zh means both forms are identical no longer always holds.

Examples

Proposed by others in Wikidata:Lexicographical data/Best practices:

  • If there are multiple scripts in which a language is generally written, it is desirable for the lemma to contain a representation for each script.
    • Where a correspondence in representation exists between multiple related scripts, repeating that correspondence may not be necessary.
      • For those Mandarin lexemes which have not been affected by character simplification, a single lemma with code 'zh' suffices.
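
The two proposals above differ only in how they treat lexemes whose simplified and traditional forms coincide. The following Python sketch illustrates that difference. The function `build_lemmas`, its `collapse_identical` flag, and the dictionary shape (a map keyed by language code) are hypothetical and serve only as an illustration; they are not an existing Wikidata tool or API.

```python
def build_lemmas(simplified: str, traditional: str,
                 collapse_identical: bool = False) -> dict:
    """Return a lemma map keyed by language code (illustrative shape only).

    collapse_identical=False: always emit zh-hans and zh-hant,
        as in the proposal by User:Rdrg109.
    collapse_identical=True: emit a single zh lemma when both forms
        are the same, as in the Best practices page.
    """
    if collapse_identical and simplified == traditional:
        return {"zh": {"language": "zh", "value": simplified}}
    return {
        "zh-hans": {"language": "zh-hans", "value": simplified},
        "zh-hant": {"language": "zh-hant", "value": traditional},
    }


# A lexeme affected by character simplification gets two lemmas either way.
print(build_lemmas("语言", "語言"))

# A lexeme with identical forms is where the two conventions diverge.
print(build_lemmas("人", "人"))
print(build_lemmas("人", "人", collapse_identical=True))
```

Under the first convention a reader can always tell which script a lemma is written in; under the second, a lone zh lemma is only reliable if every contributor applies the rule consistently.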

Statements

For lexemes

instance of (P31)

For senses

For forms

Form

Grammatical feature

SPARQL queries

Lexical Masks

See Wikidata:Lexical Masks.

Tools

See also