Wikidata:Lexicographical data/Best practices

This page is a translated version of the page Wikidata:Lexicographical data/Best practices and the translation is 45% complete.

Here we document the best practices that various contributors have put in place over time, most often after describing them in the Telegram group and elsewhere.

Should there be a lexeme for this?

  • There should be evidence that a lexeme exists in a language at the time that lexeme is created in Wikidata.
    • The better documented a language is in general, the more the above should be treated as a requirement rather than merely a best practice.
      • As a result, for languages like English, French, Spanish, Mandarin, Russian, and Arabic—that are supported by nation-states and that, by virtue of being used to communicate all sorts of information among very large groups of people, are expected to have diverse vocabularies—this should be taken as obligatory regardless of one's fluency in that language.
      • For less well-documented languages like Breton, Sindhi, Acehnese, and Guarani, this remains merely a strong recommendation: once a resource is found for that language, an attempt should be made to use it as evidence for as many existing lexemes in that language as possible.
      • For even less well-documented languages like Skolt Sami, Igbo, Angika, and Cia-Cia, this is much less binding even as a recommendation—especially when you are a native speaker of that language and can thus vouch for the use of a particular lexeme in your language community.
    • Evidence of a lexeme's existence can be indicated in different ways:
  • In general, while individual words that aren't merely inflections of other words might warrant lexemes, non-idiomatic phrases typically do not warrant them, since they may be treated as the sum of their parts.
    • This does not necessarily discount the addition of non-idiomatic meaning senses to lexemes which do have idiomatic meanings, however, and which have those idiomatic meanings as senses already.

Lemma

  • The lemma of a lexeme should ideally be the representation of that lexeme that is provided in a dictionary. What representation this is will generally depend on the lexeme's language and lexical category.
    • Take Indo-European languages: for nouns and adjectives, this may reflect some combination of nominative case, singular number, and masculine gender; for verbs, this may be the infinitival or verbal noun form.
    • Other languages may present lemmata differently, for which a non-exhaustive list is given below:
      • An Arabic verb generally uses the masculine third-person singular perfect active indicative as a lemma ('كَتَبَ' for 'to write').
      • A Korean verb generally uses the verb stem followed by the dedicated citation suffix '-다' ('가다' for 'to go').
      • An isiZulu verb generally uses the verb stem on its own, including the final vowel 'a' ('shaywa' for 'to be struck').
  • If several scripts are in current use for a language, the lemma should preferably contain a representation for each script.
    • Where the representation is the same across several related scripts, repeating it may not be necessary.
      • For those Mandarin lexemes which were not affected by character simplification, a single lemma with code 'zh' suffices.
      • For those Esperanto lexemes which do not change under 'hsistemo' or 'xsistemo', a single lemma with code 'eo' suffices.
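
The rule above about coinciding representations can be sketched in a few lines of Python. This is only an illustration, not an existing tool: the function name `collapse_lemmas` is hypothetical, and the variant codes used in the examples ('zh-hans', 'zh-hant', 'eo-hsistemo', 'eo-xsistemo') are assumed to be the codes under which the separate representations would otherwise be stored.

```python
def collapse_lemmas(base_code, variants):
    """Return the lemma representations to store for one lexeme.

    `variants` maps a script or orthography code to a representation.
    If every variant has the same representation, a single entry under
    `base_code` is enough; otherwise every variant is kept.
    """
    distinct = set(variants.values())
    if len(distinct) == 1:
        return {base_code: next(iter(distinct))}
    return dict(variants)


# A Mandarin lexeme unaffected by simplification: one 'zh' lemma suffices.
print(collapse_lemmas("zh", {"zh-hans": "人", "zh-hant": "人"}))

# A Mandarin lexeme whose character was simplified: both are kept.
print(collapse_lemmas("zh", {"zh-hans": "马", "zh-hant": "馬"}))

# An Esperanto lexeme with no accented letters: one 'eo' lemma suffices.
print(collapse_lemmas("eo", {"eo": "domo", "eo-hsistemo": "domo", "eo-xsistemo": "domo"}))
```

The comparison is deliberately strict string equality, so representations that differ in any character are treated as distinct and are all kept.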

Lexical categories

  • In general, an instance of (P31) value on a lexeme should be more specific than the lexeme's lexical category.

Lexeme statements

Derivations

Forms

Much of the section on whether there should be a lexeme may also apply here.

To help establish the existence and use of a lexeme, at least one form should be referenced, for example on a usage example (P5831) statement qualified with subject form (P5830) [the form in question], or on another statement (described by source (P1343), attested as (P7855), and attested in (P5323) are other possible properties). The goal is to have every form attested or referenced with at least one date, and preferably with several dates that are years apart.
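
As a rough illustration of the check described above, the following Python sketch lists the forms of a lexeme that no referenced usage example (P5831) statement points to through a subject form (P5830) qualifier. It assumes the lexeme is available as a dictionary shaped like the Wikibase JSON export ("forms", "claims", "qualifiers", "references"); the function names and the sample IDs 'L1-F1' and 'L1-F2' are hypothetical, and the other properties mentioned above are not covered.

```python
P_USAGE_EXAMPLE = "P5831"  # usage example
P_SUBJECT_FORM = "P5830"   # subject form (used as a qualifier)


def referenced_form_ids(lexeme):
    """Collect IDs of forms named by a referenced usage example statement."""
    found = set()
    for statement in lexeme.get("claims", {}).get(P_USAGE_EXAMPLE, []):
        if not statement.get("references"):
            continue  # an example without a reference is not counted
        for snak in statement.get("qualifiers", {}).get(P_SUBJECT_FORM, []):
            value = snak.get("datavalue", {}).get("value", {})
            if "id" in value:
                found.add(value["id"])
    return found


def forms_lacking_examples(lexeme):
    """Return the sorted IDs of forms with no referenced usage example."""
    all_ids = {form["id"] for form in lexeme.get("forms", [])}
    return sorted(all_ids - referenced_form_ids(lexeme))


# Hypothetical lexeme with two forms, only one of which has an example.
sample = {
    "forms": [{"id": "L1-F1"}, {"id": "L1-F2"}],
    "claims": {
        "P5831": [
            {
                "qualifiers": {
                    "P5830": [{"datavalue": {"value": {"id": "L1-F1"}}}]
                },
                "references": [{"snaks": {}}],
            }
        ]
    },
}
print(forms_lacking_examples(sample))  # ['L1-F2']
```

A fuller check would also look at the dates given in each reference, since the stated goal involves dated attestations.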

Senses

Translations