Wikidata:Lexicographical data/Focus languages/Requirements

The following set of criteria will be used to guide our decision making, but should not be regarded as excluding potential candidates. In fact, go ahead and submit your application - it is then our task to make sure that the set of selected languages fulfills as many of the criteria given here as possible. Also, the criteria are not necessarily equally weighted, and are not in any particular order.

Potential group requirements

  1. The focus languages can demonstrate interest from an existing group of Wikimedia contributors, either formal or informal, who are native to that language and culture, to seed the language community.
  2. The group can show believable potential to grow their community.
  3. Wikipedia and Wiktionary in the focus languages are diverse regarding size and maturity.
  4. The focus languages should have a currently under-represented Wikipedia language edition. However, if there is already a very active and well-covered Wikipedia and/or Wiktionary in that language, it should be clearly shown that the community is open to using baseline content from Abstract Wikipedia for its missing articles and/or to using lexicographic data from Wikidata in Wiktionary (e.g. the given Wikipedia uses the article placeholder extension).
  5. The group has at least two members who can communicate in English and who are willing to be the long-term communication facilitators.
  6. It would help if coding skills are available in the group.
  7. Long-term interest. Keep in mind that Abstract Wikipedia is a project that will be in initial development at least until 2023, and that some of the results and benefits we aim for will not become 'real' until then. The seed group should be able to argue that it can sustain itself for such a time period.

Potential language requirements

  1. We would like each of the focus languages to be from a different language family.
  2. We would like at least one, and preferably two, of the focus languages to use at least one non-Latin script widely.
  3. At least one of the selected languages uses a right-to-left script widely.
  4. At least one of the languages uses more than one script (examples).
  5. We would like the focus languages to have native language communities of more than three million speakers.
  6. These native speakers are not generally comfortable with another, well-represented language, i.e. supporting these languages would really reach people who currently have no alternative access to knowledge.
  7. Some focus languages are used in Sub-Saharan Africa, the Indian subcontinent, or Central Asia.
  8. The people speaking the language have access to Wikipedia.
  9. The language has official support from a government, academic, or similar institution.
  10. At least one of the focus languages should be morphologically rich: the language agglutinates, has discontinuous morphology, or uses compounds regularly.
  11. Ideally, at least one of the focus languages is ergative-absolutive in some or all of its conjugations.
  12. Ideally, at least one of the focus languages uses evidentials.
  13. Ideally, at least one of the focus languages covers a number of dialects or languages, possibly themselves standardized.


Please feel free to discuss and edit this list.