Topic on User talk:Harmonia Amanda

Kaganer (talkcontribs)

Please explain your revert. "Yakovlev" is not an independent surname; it is only a transliteration of the Russian surname "Яковлев".

Harmonia Amanda (talkcontribs)

There are people born in Latin-script countries who bear this name. The reference for P31 even states that it's an American name. The Latin-script version of the name is clearly based on the Russian name, but it has become a name in its own right. The Russian name would be transliterated differently in some Latin-script languages, but the American version of the name would stay the same. They are two different names, one derived from the other.

To be clearer: Wikidata creates an item for each distinct string of a name; a Latin-script version and a Cyrillic version (or hangeul, or kanji, or…) are by definition not the same string and should therefore be on separate items. Russian people bear Yakovlev (Q21450308) "Яковлев", the Cyrillic name (which should be by far the most used) and American people (who most probably are of Russian descent) bear Yakovlev (Q37559986) "Yakovlev" or Jakovlev (Q42293799) "Jakovlev" or any other transliteration-which-then-became-a-real-surname.

Kaganer (talkcontribs)

Three questions:

  1. Where is this algorithm described?
  2. On Wikimedia Commons all categories are named in English, and all people with this surname may be collected into a single category. Only one Wikidata item can be linked to the Commons category. How do we choose?
  3. The Russian surname "Яковлев" can be given an English label, and the Latin-script "Yakovlev" can be given a Russian label. How do we distinguish them?
Harmonia Amanda (talkcontribs)

1. There is a WikiProject about names; all this was decided years ago, and there are help pages, scripts, etc.

2. Commons choices are Commons choices, and should be asked about there. I guess that if the category title is in English the correct sitelink would be the Latin-script one, but that's a guess, not an answer.

3. Labels should always be in the language of the label (Russian in Russian, French in French, Japanese in Japanese, etc.), because people whose basic phone supports only their own writing system need to be able to read them. I don't have devanagari installed by default on my work computer but we do have names in devanagari on Wikidata.

The label is then the most frequent transliteration of the name in that language. Other transliterations are added as aliases, since most of the time you'll have different transliteration systems coexisting.

The description makes it clear what the item is about. On Yakovlev (Q21450308) (the Cyrillic name), all languages not using Cyrillic have a description like "family name (Яковлев)" (in French "nom de famille (Яковлев)", etc.). On Yakovlev (Q37559986) (the Latin-script name), all descriptions in languages not using Latin script follow this pattern: "фамилия - Yakovlev" (in Russian). So it should always be clear what the item is about, and if you are working on names, there are scripts to add in one click all labels, descriptions and aliases based on native label (P1705).
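
To make the description rule above concrete, here is a minimal Python sketch. It is not one of the actual WikiProject scripts: the function name `describe`, the small language-to-script table, and the plain descriptions used when a language shares the item's script are all assumptions made for illustration. Only the three disambiguated patterns are taken from the examples quoted above.

```python
# Hypothetical sketch; none of these names come from a real Wikidata script.

# Writing system assumed for each language code (tiny illustrative table).
LANGUAGE_SCRIPT = {"en": "Latn", "fr": "Latn", "ru": "Cyrl"}

# Patterns quoted in the discussion; "{name}" is the native label (P1705).
DISAMBIGUATED = {
    "en": "family name ({name})",
    "fr": "nom de famille ({name})",
    "ru": "фамилия - {name}",
}

# Assumption: a language that shares the item's script gets a plain description.
PLAIN = {"en": "family name", "fr": "nom de famille", "ru": "фамилия"}


def describe(native_label: str, native_script: str) -> dict[str, str]:
    """Build one description per language for a family-name item."""
    descriptions = {}
    for lang, script in LANGUAGE_SCRIPT.items():
        if script == native_script:
            descriptions[lang] = PLAIN[lang]
        else:
            # Other scripts: show the native string so the item is unambiguous.
            descriptions[lang] = DISAMBIGUATED[lang].format(name=native_label)
    return descriptions


cyrillic_item = describe("Яковлев", "Cyrl")  # corresponds to Q21450308
latin_item = describe("Yakovlev", "Latn")    # corresponds to Q37559986
```

With this sketch, the English description of the Cyrillic item carries the string "Яковлев" and the Russian description of the Latin-script item carries "Yakovlev", so the two items stay distinguishable even when their labels coincide.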

Infovarius (talkcontribs)

3. The other choice (more appropriate from my point of view) is to join all (frequent?) variants in one label, like "Yakovlev/Jakovlev".

Harmonia Amanda (talkcontribs)

Except that "Jakovlev" is not an English transliteration for Яковлев? It's an Italian one? Why would it be on the English label?

Infovarius (talkcontribs)

Ok, "Yakovleff" then

Harmonia Amanda (talkcontribs)

It's an old transliteration from the nineteenth century, so it is useful as an alias for someone working with old translations of books (for example on Wikisource), but nobody would transliterate that way nowadays… I'm really not convinced.

Harmonia Amanda (talkcontribs)

Ok, I've looked at Commons. It seems that names are added automatically based on the English label of the name; this means that if the English label of Яковлев changed, all people bearing Яковлев would be put in a different category from Yakovlev (which has a really small chance of happening between English and Russian, but there are other languages for which different transliteration systems coexist). I would say the category is about the Latin-script string in this case, since it's the only one not at risk of changing.

But there are technical ways to deal with the choice Commons made to be exclusively in English. The most obvious would be to create a template at the top of every name category stating:

"This category concerns people named 'Yakovlev' and 'Яковлев'"

We should also add related names, like "Jakovlev", in another section. And we should probably state the writing systems explicitly ("'Yakovlev' (Latin script) and 'Яковлев' (Cyrillic)"), because for other examples it's not so clear:

"This category concerns people named 'Han' (Latin script), '韩' (Simplified Chinese), '韓' (Traditional Chinese), '한' (Hangul), '伴' (Kanji), and '坂' (Kanji)"

By the way 伴 in Japanese can be pronounced Tomo, Tomono, Tomori, Ban and Han.
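
As a rough illustration of the category notice proposed above, the following Python sketch builds the sentence from a list of (name, writing system) pairs. The function `category_notice` is hypothetical; a real Commons template would be written in wikitext or Lua and would read the pairs from Wikidata, which this sketch does not attempt.

```python
def category_notice(variants: list[tuple[str, str]]) -> str:
    """Render the proposed notice from (name, writing system) pairs."""
    if not variants:
        raise ValueError("at least one name variant is required")
    parts = [f"'{name}' ({script})" for name, script in variants]
    if len(parts) == 1:
        listing = parts[0]
    elif len(parts) == 2:
        listing = " and ".join(parts)
    else:
        # Three or more variants: comma-separated, ", and" before the last.
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"This category concerns people named {listing}"


notice = category_notice([("Yakovlev", "Latin script"), ("Яковлев", "Cyrillic")])
```

Here `notice` is the two-script sentence quoted earlier; passing the six "Han" variants produces the longer form.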

It would be a system similar to the one existing on the French Wikisource, where we do use Wikidata to classify authors, and where we want Чехов to be under T (Tchekhov), but a possible American Chekhov to be under C.
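
The sorting behaviour described above can be sketched in Python as follows. This is only an assumption about how such a lookup might work, not the actual French Wikisource code: the `NameItem` class, the `sort_letter` function and the label values are illustrative, with "Tchekhov" taken from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class NameItem:
    """Hypothetical stand-in for a Wikidata family-name item."""
    native_label: str
    labels: dict[str, str] = field(default_factory=dict)  # language code -> label


def sort_letter(item: NameItem, lang: str) -> str:
    """First letter of the label in `lang`, falling back to the native label."""
    label = item.labels.get(lang, item.native_label)
    return label[0].upper()


# Illustrative labels: the Cyrillic item has a French transliteration,
# while the Latin-script item keeps the same string in every language.
chekhov_cyrillic = NameItem("Чехов", {"fr": "Tchekhov", "en": "Chekhov"})
chekhov_latin = NameItem("Chekhov", {"fr": "Chekhov", "en": "Chekhov"})
```

With separate items, `sort_letter(chekhov_cyrillic, "fr")` gives T while `sort_letter(chekhov_latin, "fr")` gives C, which is the distinction the two-item model is meant to preserve.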

Wikidata should be able to deal with language-to-language combinations.

Reply to "Yakovlev vs Яковлев"