Wikidata:Requests for comment/How to make new languages enabled on Wikidata
An editor has requested the community to provide input on "How to make new languages enabled on Wikidata" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- to add a new language to monolingual text properties, follow Help:Monolingual text languages
- to add a new label language, add it to MediaWiki --Pasleim (talk) 21:23, 11 January 2017 (UTC)
Explanation
I posted this RfC following this request on Phabricator, but T59342 and T146707 are also on the same subject. The problem is: how should we add languages that are not enabled on Wikidata? The Commission de toponymie du Québec, for example, adds the native-language names of places when they are known, but most of these languages, like Mohawk (Q13339), Wyandot (Q1185119), Atikamekw (Q56590) or Innu-aimun (Q13351), are not supported by Wikidata. The main problem is that if these languages are not enabled on Wikidata, I can't add a statement on a monolingual property, such as stating that Trois-Rivières (Q44012) is called Ok'entondie in Wyandot (Q1185119). But I also believe there should be some rules for adding new languages to Wikidata.
I also see that Wikidata:Requests for comment/Labels and descriptions in language variants seems to be the only other RfC about languages in Wikidata, so we don't have many RfCs on this subject.
Proposal
I think the best solution would be to create a light structure like Wikidata:Property proposal; then we could notify the developers so that they enable the language.
The basics would probably be:
- The ISO code of the language
- The autonym (the name that speakers of the language use to designate it)
- The reason the language needs to be added (e.g. a database was found that uses this language)
--Fralambert (talk) 03:04, 16 November 2016 (UTC)
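The three fields proposed above could be captured in a small record. The following is only an illustrative sketch: the class name LanguageRequest, its field names, and the check that the code is two or three lowercase letters (the shape of ISO 639-1, 639-2 and 639-3 codes) are assumptions made for illustration and are not part of any existing Wikidata process.

```python
import re
from dataclasses import dataclass
from typing import List

# ISO 639-1 codes have two letters; ISO 639-2 and 639-3 codes have three.
# This only checks the shape of the code, not whether it is assigned.
_ISO_639_SHAPE = re.compile(r"[a-z]{2,3}")


@dataclass(frozen=True)
class LanguageRequest:
    """Hypothetical record for one language request."""

    iso_code: str  # ISO code of the language
    autonym: str   # name of the language in the language itself
    reason: str    # why the language is needed

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none found."""
        found = []
        if not _ISO_639_SHAPE.fullmatch(self.iso_code):
            found.append("ISO code should be 2 or 3 lowercase letters")
        if not self.autonym.strip():
            found.append("autonym is missing")
        if not self.reason.strip():
            found.append("reason is missing")
        return found
```

A request would be considered complete in this sketch when problems() returns an empty list; whether the language is actually accepted would still be a community decision.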
Discussion
- WMDE came up with Help:Monolingual_text_languages, but I'm not sure if they are actually following it. It seems to get confused with meta:Requests_for_new_languages, especially by people part of that other process. Maybe WMDE could clarify. @Lea Lacroix (WMDE):
--- Jura 08:28, 16 November 2016 (UTC)
- That is the right thing and it should be followed. --Lydia Pintscher (WMDE) (talk) 09:21, 16 November 2016 (UTC)
- This seems to be a good way, but maybe there is a lack of advertising. None of the three language requests I cited above use Help:Monolingual_text_languages. --Fralambert (talk) 12:55, 16 November 2016 (UTC)
- What's not entirely clear is what should happen for language codes for labels (and descriptions). Sample: en-gb, en-ca ... Should they follow the same Help:Monolingual_text_languages? @Lydia Pintscher (WMDE):.
--- Jura 16:50, 16 November 2016 (UTC)
- Yeah. Though they are more tricky for us. --Lydia Pintscher (WMDE) (talk) 17:06, 16 November 2016 (UTC)
- Well, I want to add all the vernacular names of plants in VASCAN ID (P1745). I would probably need en-ca and even fr-ca, since the Canadian names chosen in this database are not the same as those used in the United States for English and in France for French. The best approach is probably that the user who asks for a national variant has to explain why creating it would be useful. --Fralambert (talk) 04:48, 17 November 2016 (UTC)
- "en-ca" already exists. "fr-ca" would be good to have.
--- Jura 06:40, 17 November 2016 (UTC)
- @Lydia Pintscher (WMDE): When I asked this before, Adrian told me that the requirements for labels are "quite different", although I have still not seen anything saying what the requirements are. There are two separate sets of additional codes, one set can be used for labels and monolingual text, the other can only be used for monolingual text. If the criteria for labels and monolingual text are the same, then the help page should be made more generic and the codes which are currently only available for monolingual text should be made available for labels. If the criteria are different, it would be useful to know what they are. - Nikki (talk) 17:25, 29 November 2016 (UTC)
- @Thiemo Mättig (WMDE): Can you tell more about it? --Lydia Pintscher (WMDE) (talk) 16:09, 30 November 2016 (UTC)
- I'm not sure I fully understand the question. Here are the basic principles, as far as I'm able to explain them:
- Monolingual text values are about content, while labels, descriptions and aliases are about the interface.
- There is no technical reason to limit monolingual text languages. Technically, we are free to add whatever we want: dead languages, even language codes that are not in an official standard. However, we want the community to approve each addition. We add a new language to getMonolingualTextLanguages in WikibaseRepo when the community really wants to use it in monolingual text values.
- In contrast, adding additional languages for labels and descriptions is something we really want to avoid because of the negative effects it has. We use MediaWiki core's wmgExtraLanguageNames setting for this, which has weird effects like being able to select this language in your settings but not having any localization except for the language name itself, which is always shown in its own language and never translated. We did it in a few cases, but the goal must always be to add this to MediaWiki core and TranslateWiki to be able to translate everything, not only a few labels.
- I hope this explains a few things. --Thiemo Mättig (WMDE) 18:19, 8 December 2016 (UTC)
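The distinction described above, together with the earlier remark that one set of extra codes can be used for labels and monolingual text while another can only be used for monolingual text, can be pictured as two lists of language codes. The sketch below is only a toy model: the set names, their contents and the function are hypothetical and do not reproduce the actual WikibaseRepo or MediaWiki configuration. The codes "wya" and "moh" are assumed here to be the ISO 639-3 codes for Wyandot and Mohawk.

```python
# Hypothetical example data; the real lists live in the software configuration.
INTERFACE_LANGUAGES = {"en", "fr", "de", "en-ca"}

# Hypothetical extra codes approved for monolingual text values only.
EXTRA_MONOLINGUAL_LANGUAGES = {"wya", "moh"}


def usable_for(code: str) -> set:
    """Return where a language code could be used in this toy model."""
    uses = set()
    if code in INTERFACE_LANGUAGES:
        # In this toy model, a label language is also accepted for monolingual text.
        uses |= {"label", "monolingual text"}
    elif code in EXTRA_MONOLINGUAL_LANGUAGES:
        # Content-only language: no interface localization is implied.
        uses.add("monolingual text")
    return uses
```

In this model, a code that appears in neither set yields an empty result, which corresponds to the situation that prompted this RfC: the statement cannot be entered until the language is added somewhere.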
- @Thiemo Mättig (WMDE): Thanks for the information! The problem here is that it's not clear to users how to get new languages added. We have a page explaining the process and criteria for new monolingual text languages (although it's not very visible) but there is nothing for label languages. From what you said, it seems like there is a reason to have separate criteria for labels and monolingual text, so we should create a page similar to Help:Monolingual text languages but for label languages. Could you help with that? I can try drafting a page but I would need help with some of the details. The main things I don't know are:
- Which languages will be considered? A valid non-private en:IETF language tag (BCP 47 code) seems like the bare minimum, but are there any additional restrictions? What about country variants, script variants, dialect or orthography variants, ancient/extinct languages, constructed languages?
- What needs to be done before the language will be added? You mentioned adding it to MediaWiki core and TranslateWiki - how does a user go about doing that? It looks like Incubator has information about TranslateWiki (here) but how much needs to be translated for it to count? Is there anything else?
- What information does the user need to provide? The language code would be the most obvious thing. What else? Language name in the language itself? Language name in English? Script? Fallback languages?
- - Nikki (talk) 12:04, 9 December 2016 (UTC)
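The "valid non-private IETF language tag" minimum mentioned above could be checked roughly as follows. This is a deliberately simplified sketch: the pattern covers only the language, script, region and variant subtags of BCP 47, ignores extensions and grandfathered tags, rejects private use only in the form of an "x" subtag, and does not look anything up in the IANA subtag registry. The function name is hypothetical.

```python
import re

# Simplified shape of a BCP 47 language tag (no extensions, no grandfathered tags).
_TAG = re.compile(
    r"""
    [a-z]{2,3}                                # primary language subtag
    (?:-[a-z]{4})?                            # optional script subtag
    (?:-(?:[a-z]{2}|[0-9]{3}))?               # optional region subtag
    (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*  # optional variant subtags
    """,
    re.VERBOSE | re.IGNORECASE,
)


def looks_like_public_language_tag(tag: str) -> bool:
    """Return True if the tag has a plausible non-private BCP 47 shape."""
    subtags = tag.lower().split("-")
    if "x" in subtags:
        # The singleton "x" introduces private-use subtags.
        return False
    return _TAG.fullmatch(tag) is not None
```

A check like this only answers whether a code is well-formed; the other questions in the list (which variants are considered, how much translation is required) remain policy questions.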
- I'm not entirely sure if the initial assumption above ("Monolingual text values are about content, while labels, descriptions and aliases are about the interface.") is how Wikidata currently works.
- Ideally, I think we could have statements that include names of things in any language and do away with labels and aliases, but currently this is not done that way (and maybe not even possible given the limitations of applicable properties). So I don't think it's going to happen anytime soon. The result is that labels are there to supply content in specific languages, e.g. the name of a city in England in German. To do this, we need to make codes for these languages available for labels and aliases. For this, the GUI to edit Wikidata (and MediaWiki sites in general) can be in English or whatever, but it needn't be the same.
--- Jura 12:45, 9 December 2016 (UTC)
- @Jura1: We do have official name (P1448) as well as a few other monolingual properties for exactly this reason. One of the (many) reasons labels are a special thing in the Wikibase data model, and not just statements, is because they are about searching and labeling an entity in the context of the interface the entity is used in. This interface is typically the MediaWiki software. We can go ahead and add "extra" languages to Wikidata before they are added to MediaWiki, and we already did that, but the way we are doing this right now is a bad hack with all kinds of unsolved problems, as hinted above. We need a clean, semantically correct solution first before we can open this for more additions. Currently the cleanest solution is adding the language to MediaWiki. --Thiemo Mättig (WMDE) 17:10, 12 December 2016 (UTC)
- @Jura1: The language in the labels and aliases is generally the name that appears in a given wiki, so if the wiki. Monolingual properties like native label (P1705) are generally more useful, since you can add the name in a specific language, as I did for fr:Pointe-du-Lac (you can see the result in the infobox at "Nom local"). --Fralambert (talk) 23:03, 12 December 2016 (UTC)