Wikidata talk:WikiProject Names

Question about the family name/given name of an author using a pseudonym

Hello, kind and very French-speaking contributors...

It might be good to internationalize this project a little, if you want people other than French speakers to take part ;)

A question of principle: for a person known under a pseudonym, do we enter as family name/given name those of the pseudonym or those of the real name?

  • e.g. "George Sand"
  • another example: Laure Conan (Q3218752) - is her given name "Marie-Louise-Félicité" or "Laure" or both? - the same goes for the family name...

Kind regards, --Hsarrazin (talk) 07:21, 10 September 2014 (UTC)

For the second one, "Marie-Louise Félicité Angers, known as Laure Conan", I would put three: Marie-Louise (Q18012396), Félicité (Q18012403), Laure (Q3218740) --- Jura 17:36, 10 September 2014 (UTC)

Derivative question/corollary

Ash Crow
Dereckson
Harmonia Amanda
Hsarrazin
Jura
Чаховіч Уладзіслаў
Sascha
Joxemai
Place Clichy
Branthecan
Azertus
ToJack
Jon Harald Søby
PKM
Pmt
Sight Contamination
MaksOttoVonStirlitz
BeatrixBelibaste
ajayi adeniyi
Moebeus
Dcflyer
Looniverse
Notified participants of WikiProject Names

See Andrea H. Japp (Q2846340): she is an author much better known under her pseudonym than under her real name. There are 3 possibilities for the given names:

  1. given name (P735) contains all given names, and each one is qualified with "birth name", "pseudonym", etc.
  2. pseudonym (P742) and birth name (P1477) each have given name (P735) as a qualifier
  3. both solutions, as long as queries do not search in qualifiers

So, what is your advice? --Hsarrazin (talk) 19:11, 5 October 2014 (UTC)

I'd add them primarily in given name (P735) to make sure it can be found when checking P735 on the item. --- Jura 16:55, 12 October 2014 (UTC)
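
The difference between the first two options can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries are a simplified in-memory model, not the Wikibase JSON format, the "role" qualifier key is invented for the example, and the `Q_GIVEN_...` identifiers are placeholders rather than real items. It shows why a lookup that only reads main P735 statements finds nothing under option 2.

```python
# Hypothetical, simplified model of statements; not the real Wikibase format.
# "Q_GIVEN_PSEUDO" and "Q_GIVEN_BIRTH" are placeholder IDs, not real items.

# Option 1: every given name is a main P735 statement, tagged by a qualifier.
option_1 = {
    "P735": [
        {"value": "Q_GIVEN_PSEUDO", "qualifiers": {"role": "pseudonym"}},
        {"value": "Q_GIVEN_BIRTH", "qualifiers": {"role": "birth name"}},
    ]
}

# Option 2: P742 and P1477 are the main statements; P735 is only a qualifier.
option_2 = {
    "P742": [{"value": "pseudonym text",
              "qualifiers": {"P735": ["Q_GIVEN_PSEUDO"]}}],
    "P1477": [{"value": "birth name text",
               "qualifiers": {"P735": ["Q_GIVEN_BIRTH"]}}],
}


def main_given_names(item):
    """Return given names found as main P735 statements only."""
    return [s["value"] for s in item.get("P735", [])]


def all_given_names(item):
    """Return given names from main P735 statements and P735 qualifiers."""
    found = main_given_names(item)
    for statements in item.values():
        for s in statements:
            found.extend(s["qualifiers"].get("P735", []))
    return found


print(main_given_names(option_1))  # both names are visible
print(main_given_names(option_2))  # empty: names are hidden in qualifiers
print(all_given_names(option_2))   # found only when qualifiers are searched
```

Under these assumptions, option 3 (both) amounts to storing the data of `option_1` and `option_2` together, at the cost of duplication.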

Female surnames

Hello, there might be a small problem with female surnames in Czech and Slovak: surnames are usually changed by gender inflection (Q1124523) to a female variant. But in most cases there is no separate article about this variant of the surname, only in cases where this form can derive from several male surnames.

Typical examples
  • Novák -> Nováková
  • Fišer -> Fišerová
  • Malý -> Malá (adjective surnames)

A second problem is that these articles are usually marked as disambiguations, even if they contain an infobox, the origin of the name and a list of people. JAn Dudík (talk) 08:12, 10 September 2014 (UTC)

We could create items for the female form and link these with a new property to the basic form of the surname. --- Jura 17:50, 10 September 2014 (UTC)
Shall we go ahead and do that? ----- Jura 10:32, 17 September 2014 (UTC)
@JAn Dudík, Jura1: What is the consensus regarding this question? I personally believe that Sidorov/Sidorova are 2 forms of the same surname (see en:Sidorov) and we can use female form of label (P2521) to specify female label on any language that needs it. But if you decided that we need 2 separate items, I'm fine with this. --Ghuron (talk) 16:43, 25 January 2018 (UTC)

Historical persons

Another problem I see concerns some historical persons whose names have been translated into Latin or other languages (John Calvin (Q37577): Jan/Johannes/Juan/John/Jean Kalvín/Calvin/Kalvyn/Calvinus). Which form should be used? French? Latin? Some other? JAn Dudík (talk) 20:36, 11 September 2014 (UTC)

All of them, with the one the person actually used as preferred value. -Ash Crow (talk) 20:39, 14 September 2014 (UTC)

Property proposal(s)

Please see Wikidata:Property_proposal/Person#second_or_maternal_family_name_of_Spanish_name. --- Jura 06:11, 11 September 2014 (UTC)

Please comment there. ----- Jura 10:32, 17 September 2014 (UTC)

Fun list :)

Please see Wikidata:WikiProject Names/first names (1) ----- Jura 16:15, 14 September 2014 (UTC)

New report: items by label (same or different)

At User:Jura1/Person names, Ivan made a list of items used with P735 and P734 based on the labels they have in various languages using Roman script. --- Jura 16:34, 14 September 2014 (UTC)

I hope no one starts "working" on this list, separating "different" names and surnames, as they are not intrinsically different. --Infovarius (talk) 18:33, 14 September 2014 (UTC)
Well, they are different enough that the only sustainable way to work consistently for all given names and surnames is to have an item per variant, linked to each other with said to be the same as (P460). -Ash Crow (talk) 20:38, 14 September 2014 (UTC)
They are as different as a pear is from an apple. We have been separating them for some weeks now. The goal is to respect consistency, explain the links between given names and add some properties related to one but not another (Okki is willing to indicate the matching saints' celebration days, and this isn't a bad idea, as this is a property used in some given name infoboxes).
That also allows consistency for given names and surnames between property and item value.
Finally, that allows showing etymological links (surname Rose is named after first name Rose, named after the Rosa flower).
According to my watchlist, at least 5 users are working according to this methodology: Jura1, Okki, Harmonia Amanda, Ash Crow and me. --Dereckson (talk) 20:46, 14 September 2014 (UTC)
Looking at this list I see the next issue: some names have spelling variants, in Czech especially in vowel length, i/y, and ending letters (Justina/Justýna, Vasil/Vasyl, Tereza/Terezie/Teréza, Anastasie/Anastázie, Magdalena/Magdaléna...). Separating these names only because the most used variant is unique among other languages is not good, because the less common variant (older, archaic) is usually redirected to the main article. Creating a separate article for each variant of each name would be useless. JAn Dudík (talk) 05:57, 15 September 2014 (UTC)
Indeed, no need to create articles for these items. On Wikipedia, a single article can include lists based on various items linked through "P460". Some languages already do this quite thoroughly. --- Jura 19:54, 15 September 2014 (UTC)
Don't forget we are creating a general-purpose database, not a Wikipedia helper tool.
If the variants are really used, i.e. when there is at least one item that is an instance of Q5 with this given name, create an item.
The core issue is: what is a variant? For you, it could be a variant; for other people, two distinct given names. There is no objective criterion to separate them. So 'said to be the same' claims are the perfect solution for such cases. --Dereckson (talk) 20:17, 17 September 2014 (UTC)
So you are acquainted with the problem of transliteration duplicates at last, thanks to Jan Dudik. Look at my example below. Many of the groups of names in the list do not differ in Cyrillic languages. So you are prophets of a Latin-centric point of view... Infovarius (talk) 12:40, 17 September 2014 (UTC)
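
As an illustration of how one article could list several items linked through said to be the same as (P460), the sketch below groups name variants into clusters by following such links in both directions. It is a minimal example, assuming the links are available as plain pairs; the spellings come from the examples in this thread and are used as stand-ins for items, and `variant_groups` is a name invented here.

```python
from collections import defaultdict


def variant_groups(p460_links):
    """Group names into clusters connected by (undirected) P460 links."""
    neighbours = defaultdict(set)
    for a, b in p460_links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Example pairs only; they stand for P460 statements between variant items.
links = [("Tereza", "Terezie"), ("Terezie", "Teréza"), ("Vasil", "Vasyl")]
for group in variant_groups(links):
    print(group)
```

Note that "Tereza" and "Teréza" end up in the same cluster even though no statement links them directly, which is why each variant can keep its own item without losing the grouping.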

Father's name (patronymic) in Russian names

Hello,

Since I recently tried to add transliterated names from Russian items, I think there should also be a property for those... a lot of people can only be distinguished through their father's name. Ivanovitch (in French), Ivanovich (in English), etc. should be values to add to those (generally) Russian people.

Also, should there be items for the feminine version, or just a "feminine value" (Ivanovna), as those are used among siblings to give the father's name?

A specific property? An adaptation of the mother's name (in Spanish)? What do you think? --Hsarrazin (talk) 11:49, 16 September 2014 (UTC)

For readers of the above question, w:Eastern_Slavic_naming_customs#Patronymic gives an introduction. --- Jura 12:36, 16 September 2014 (UTC)
The 2nd surname in Spanish has the advantage that the property can use items already available as first surname (of the mother). Here the situation is somewhat different.
A possible solution could be to create a monolingual string property that could be used with (Russian) given names (sample "Ivan") to indicate "Ivanovich" and another property (Russian patronym?) to link to these "given name" items. ----- Jura 10:32, 17 September 2014 (UTC)
As patronymic is by definition father's name it is enough to have link to a name of the father. Infovarius (talk) 12:26, 17 September 2014 (UTC)
@Infovarius : for a Russian speaker, yes, obviously… but this would be very interesting for non-Russian speakers, I think :)
Jura1's proposal seems simple, but there are not a lot of values, and many errors, typos and differing transliterations could creep into monolingual string properties, whereas a value based on the original Russian value and then transliterated into other languages would provide an easy way to transliterate Russian names into other alphabets... :) --Hsarrazin (talk) 19:03, 5 October 2014 (UTC)

Constantin / Konstantin / Constantine merger at Q7111053

Somehow this merger mixes different given and family names we were trying to keep separate. How to fix it? --- Jura 19:27, 16 September 2014 (UTC)

We ask a sysop (@Ash Crow:) to undo the merge and we ask @Infovarius: to not do that again? --Harmonia Amanda (talk) 06:27, 17 September 2014 (UTC)
Now (with redirects) anyone can undo the merge, but let's discuss first (below). Infovarius (talk) 07:39, 17 September 2014 (UTC)
In the meantime, a bot updated all links pointing to the 2 redirects. This has gotten messy. We will need to do quite a lot of cleanup. --- Jura 10:46, 17 September 2014 (UTC)
Well, in my opinion, all of the above should have given name (P735) with ru:Константин as preferred value, and all the transliterations in given name (P735) as non-preferred values. And we should keep the Latin ones separate. Cases like these mainly concern historical figures like kings or popes; the merge makes no sense at all for contemporary figures. We can deal, albeit laboriously, with many different given names for the same person. We can't separate « Wilhelm » and « Guillaume » if they are merged, or worse, what do you suggest about Étienne (Q15727982) and Stéphane (Q3501543)? These are two French given names said to be the same as Στέφανος… I still prefer to separate those given names, and to include all the possible names on disputed items, with the value in the original language as preferred value. --Harmonia Amanda (talk) 20:07, 17 September 2014 (UTC)
I agree with Harmonia Amanda. -Ash Crow (talk) 10:51, 23 September 2014 (UTC)
Preferred values are an interesting idea, but they don't solve any problem. I believe we should think thoroughly about a global plan (including non-Latin languages, including future Wiktionary inclusion); otherwise we (mainly Jura1 now) are doing superfluous work at best and destroying hard-built interwiki links at worst by separating without thinking. I don't know the solution either, but if you continue with a Harmonia Amanda-like strategy we will end up with a bunch (up to 6000) of values in P735, each associated with a specific language. It'll be a complete mess and a huge technical problem (like now with the Germany item). --Infovarius (talk) 17:37, 16 October 2014 (UTC)
@Harmonia Amanda, Infovarius: Salut! May I ask why the Russian ru:Константин is used as the point of reference? The people mentioned would have had Latin, Greek, Armenian, maybe French, English or German as native language; they would be called Constantinus/Κωνσταντίνος/Constantine/Konstantin/Constantin, but are all called Константин in Russian. My first reflex would be to say that Constantinus (la)/Κωνσταντίνος (el)/Constantine (en)/Konstantin (de)/Constantin (fr)/Константин (ru, bg, sr...) are the same given name, but that does not seem to be the way things are done here. Maybe one should not trust one's first reflex :) Place Clichy (talk) 11:23, 19 November 2015 (UTC)

last part moved to New subject for Discussion

@Place Clichy:, your "first reflex" is my opinion. I consider these spellings to be the same name. The problem is that each Wikipedia can have a different number of articles devoted to this name: from 1 to dozens (English Wikipedia likes to create an article for each spelling, regardless of whether they are the same or not). And Wikidata needs to place sitelinks in some items and to link them with some statements. There is some modelling (you can read about it on the project page) but it is problematic, and in the case of Константин it creates more problems than it solves. --Infovarius (talk) 12:39, 20 November 2015 (UTC)
@Infovarius: The situation where a language (even en) has several articles for several transcriptions of the same name seems to be the exception rather than the norm. In some cases, these articles are almost empty and can be redirected without trouble. For this reason, I support having a single Wikidata item for different transcriptions of the same name, until at least one language has at least 2 articles for 2 different transcriptions. That way we can have en:Ivan (name)/de:Iwan (name)/ru:Иван/uk:Іван on the same item, which by the way is the current situation on Ivan (Q830350). Place Clichy (talk) 10:17, 23 November 2015 (UTC)
Unfortunately, for en-wiki it seems to be a rule to have plenty of articles for each variant of a name... So it leads to multiple items... --Infovarius (talk) 21:16, 22 September 2017 (UTC)

I propose to consider another example: Constantine (Q103314). Look at titles of sitelinks and choose one variant :) Infovarius (talk) 21:16, 22 September 2017 (UTC)

But now the current process has led to another solution: please welcome Konstantin (Q31362405) - a special item for the Cyrillic name! Now we can forget about all those Q7111053, Q19327451, Q5163687 and other Latin-centric variants and use one item for all Cyrillic names (Russian, Ukrainian, Belarusian, Kazakh, Serbian and many others). As for Latin labels for Q31362405, I propose using either "Константин" or "Konstantin/Constantin/Constantine..." to avoid ambiguity. I am now moving all relevant uses to this item. --Infovarius (talk) 21:16, 22 September 2017 (UTC)

Are you aware that these "transcriptions" can be made even from one Latin-script language to another? Look at the labels in different languages for Charles VI of France (Q160349)! We have many "Carl/Karl" names among the Swedish kings, and which item is used with the property for first name depends on the language preferences of the user who added the claim. Some say that we should use the same spelling as the person himself did. The problem with that is that it implies that the person could spell and was used to some kind of orthography (Q43091). Orthography was not introduced for the Swedish language until the 19th century, and it didn't become stable until the 1920s. And the Swedish tradition for the names of royalty is that the spelling is changed when they have been dead for some time. That tradition is applied even to foreign royalty. -- Innocent bystander (talk) 07:00, 23 September 2017 (UTC)

Nameguzzler and Beta labellister dead

This really makes it harder to clean up items. Are there any alternatives available ? --- Jura 08:27, 20 September 2014 (UTC)

Split between surname and given name

Some items still seem to mix the two. To make it easier to separate them, I made a property proposal at WD:PP/P#Family_name_identical_to_this_first_name --- Jura 08:27, 20 September 2014 (UTC)

Maiden name = real name = full name?

We have:

  1. birth name, string: no label (P513)
  2. birth name, monolingual text: birth name (P1477)

Used for:

  1. maiden name: Rodham (maiden name of Hillary Clinton)
  2. real name: Samuel Langhorne Clemens (pseudonym: Mark Twain)
  3. full name: William Jefferson Clinton (nick name: Bill Clinton)

--Kolja21 (talk) 14:30, 20 September 2014 (UTC)

or "full name at birth"? --- Jura 15:34, 20 September 2014 (UTC)
^^ what Jura said is how I have always worked with it, and that covers all the cases above I believe. I just wish for the better explanation of 513 -> 1477 which is a bit of a PITA in the format, and lack of explanation.  — billinghurst sDrewth 13:46, 21 September 2014 (UTC)

Wikidata:Status updates/Next

I left a note about the project there. --- Jura 09:44, 21 September 2014 (UTC)

Translation

Would you mind if I prepared/marked the page for translation? The idea comes from using {{TranslateThis}} and from adding to Status updates. Matěj Suchánek (talk) 11:28, 21 September 2014 (UTC)

It might be a bit early. The project has just started and I'm not sure if the current version has been re-read.
The most detailed explanation is still in the blog post (in French). ----- Jura 13:45, 21 September 2014 (UTC)

Cadet branches of noble families

Hello,

I am wondering what would be the best way to link a cadet branch of a noble family to the main tree. Should we use instance of (P31): cadet branch (Q2057658) with a qualifier of (P642), or create a whole new "cadet branch of" property? -Ash Crow (talk) 11:12, 23 September 2014 (UTC)

first name property should be Multilingual text, but not Item

See the following discussion: Property talk:P735#Mess --DixonD (talk) 16:38, 2 October 2014 (UTC)

1. "Multilingual text" would be another concept. 2. "Multilingual text" would mean you couldn't add Paul (Q4925623) as given name without specifying which language "Paul" is. The name might be of French origin, mostly used in Germany and still used by the Roman pope Paul I (Q103404). --Kolja21 (talk) 19:01, 2 October 2014 (UTC)
"Paul" is (was) used for Paul I (Q103404), but only in some languages, check the labels or Wikipedia links. This is actually a good reason to use multilingual text datatype: He should have "Paul" for English, "Paul" for French, "Paul" for German, but "Paulus" for Latin, "Pavel" for Czech, "Paweł" for Polish etc. Another person with the same name (e.g. Paul Cézanne (Q35548)) should probably have "Paul" in nearly all languages using Latin script (including Czech, Polish etc.) Check the labels and Wikipedia links again. How can this be solved through "item-based concept" without adding hundreds of statements for the pope, one for every language variant of his name?--Shlomo (talk) 06:00, 7 October 2014 (UTC)

Until someone explains to me better when a particular given name item should be added to a person's item, I'm going to remove all added given name items I see that do not match, in Ukrainian or any other language I know, the real given name of that person in that language. Like this or this --DixonD (talk) 14:35, 6 November 2014 (UTC)

Mario / Marius

Hello,
I met a problem while adding given name (P735) to persons named Marius (Q2159938) and Mario (Q3362622). I just inverted the two items: people who have Marius (Q2159938) are actually named Mario and people who have Mario (Q3362622) are actually named Marius. All this because of false labels in French: it is "Mario" on Marius (Q2159938) and "Marius" on Mario (Q3362622).
I have already added ~2000 claims about Mario and Marius names... What could be the best solution in order to solve this problem? Is it possible to set the current 'false' properties to every Mario and Marius and then to switch all the links and label from Marius (Q2159938) to Mario (Q3362622) (and vice versa)? The second solution would be to cancel all my edits about Mario and Marius and then to restart adding claims from the beginning. What do you think about that? Mathieudu68 talk 17:39, 6 October 2014 (UTC)

Hmm .. I was wondering if it was a Latin name version that made you add "Marius" to persons named "Mario". Anyways, don't worry.
You can switch them over by opening two sessions of Autolist2: one for Mario, one for Marius. --- Jura 17:44, 6 October 2014 (UTC)

For "Marius" in "Mario", something like:

http://tools.wmflabs.org/wikidata-todo/autolist2.php?find=Marius%20%&statementlist=&language=en&project=wikipedia&category=&depth=12&wdq=claim[735:3362622]&mode=undefined&chunk_size=360&find_label=1&find_langs=&mode_wdq=and&run=Run

to replace

-P735:Q3362622
P735:Q2159938

should work. --- Jura 17:51, 6 October 2014 (UTC)

I just started moving the Marios over to their item. --- Jura 18:10, 6 October 2014 (UTC)

Thank you so much Jura1 ! :) Mathieudu68 talk 18:19, 6 October 2014 (UTC)
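
For readers who want to see the logic of the exchange outside any tool, here is a minimal sketch of swapping the two values in one pass. It does not reproduce how Autolist2 works: the data is a plain dictionary, the person identifiers and `Q_OTHER` are hypothetical, and `swap_given_names` is a name invented for this example. Only the two item IDs and P735 come from the thread.

```python
MARIUS, MARIO = "Q2159938", "Q3362622"  # item IDs quoted in the thread


def swap_given_names(people, a=MARIUS, b=MARIO, prop="P735"):
    """Return a copy of `people` with values a and b exchanged on `prop`.

    Each value is mapped in a single pass, so a claim moved from a to b
    is not moved back again by a second replacement step.
    """
    mapping = {a: b, b: a}
    return {
        person: {
            p: [mapping.get(v, v) if p == prop else v for v in values]
            for p, values in claims.items()
        }
        for person, claims in people.items()
    }


# Hypothetical person IDs and claims, for illustration only.
people = {
    "person-1": {"P735": [MARIO]},
    "person-2": {"P735": [MARIUS]},
    "person-3": {"P735": ["Q_OTHER"], "P31": ["Q5"]},
}
fixed = swap_given_names(people)
print(fixed["person-1"]["P735"])  # now holds the Marius item ID
print(fixed["person-2"]["P735"])  # now holds the Mario item ID
```

The single-pass mapping matters when both replacements are run as separate batches: the second batch must only touch the items selected before the first batch ran, otherwise every claim ends up on the same item.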

AutoList 2 down, requests page to avoid to list information

Hi,

I've created Wikidata:WikiProject Names/To fill to record requests to fill in given names or surnames when AutoList 2 is down.

--Dereckson (talk) 02:52, 13 October 2014 (UTC)

Need for Discussion…

Notified participants of WikiProject Names

In fact, what Infovarius says above is exactly the concern I stressed while chatting with Harmonia Amanda and other contributors of this project on IRC some days ago… the problem of transliteration is NOT solved by this, interwiki links fail, and it is already developing into a monstrous Heath Robinson machine (usine à gaz in French).

Believe me, as regards databases, simplicity is the best solution… each time you duplicate information, each time you complicate the path, you add complexity, and Murphy's law is the result…

I do not think that we should continue this way before really examining what the goals of this project are:

  • interwiki-links ?
  • stats about names (spreading, history, etc.)
  • string manipulations (like sorting of names)
  • others ? (please add, I can't think of others for now)

And there could be several technical solutions, the one being used now not necessarily being the best:

  • maybe a multilingual-type property could be used? (I don't really understand the implications of that type of value - it's just a hypothesis)
  • maybe we could create a "top" value for each "group of names", to which all variants would be linked, which could then allow interwiki links? This would allow falling back to a "1-to-many" relation, instead of the "many-to-many" that is the core of the problem… - the choice of how this "top" item should be named and used is to be discussed…
  • maybe we still have to come up with another solution no one has thought of yet…

One of the biggest problems for now is that it is somewhat difficult to see how Wikidata will develop, technically, as many kinds of linking and querying are not possible yet… and trying to get around that problem leads to duplication of information in many ways… among them, the treatment of names… :( Here is my position: testing is right… to see what works and what doesn't… and we've seen that it raises many problems… For now, I think that we should R E S T, O B S E R V E, then T H I N K… and D I S C U S S… before launching into more action… :| --Hsarrazin (talk) 19:08, 16 October 2014 (UTC)

There seems to be a real need for discussion, now…

Here are my first reflections, after a sleepless night :) … feel free to add on this page or we can just move it to a sub-page here :) --Hsarrazin (talk) 07:12, 17 October 2014 (UTC)

I also thought about a "top value" for all name cognates; maybe it's a really good idea against this crazy splitting and mess in each person item. Wikipedias which have pages for several variants of the same name (like en-wiki) can also have one common article (like "Constantine and variants") which would be linked inside the top value. --Infovarius (talk) 03:54, 18 October 2014 (UTC)
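
To make the "1-to-many" versus "many-to-many" point concrete, the following sketch counts the links needed in each layout. It is a back-of-the-envelope illustration, assuming every variant is linked to every other one in the many-to-many case; `TOP_CONSTANTINE` is a hypothetical identifier and the function names are invented here. The spellings are those quoted earlier on this page.

```python
def pairwise_links(n_variants):
    """Undirected links needed if every variant is linked to every other one."""
    return n_variants * (n_variants - 1) // 2


def top_item_links(n_variants):
    """Links needed if each variant points to a single shared "top" item."""
    return n_variants


def variants_of(top, variant_to_top):
    """List the variants attached to a given top item, in sorted order."""
    return sorted(v for v, t in variant_to_top.items() if t == top)


# "TOP_CONSTANTINE" is a hypothetical ID for the shared top item.
spellings = ["Constantinus", "Κωνσταντίνος", "Constantine",
             "Konstantin", "Constantin", "Константин"]
variant_to_top = {name: "TOP_CONSTANTINE" for name in spellings}

print(pairwise_links(len(spellings)))   # 15
print(top_item_links(len(spellings)))   # 6
print(variants_of("TOP_CONSTANTINE", variant_to_top))
```

The gap widens quickly: the pairwise count grows with the square of the number of variants, while the top-item count grows linearly.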

Nikola

Notified participants of WikiProject Names

no label (Q18247339) and Nikola (Q15501913) were merged by Milicevic01 a few days after I worked with these items and let my bot add given name (P735): Nikola (Q15501913) and given name (P735): no label (Q18247339). I think this should be resolved by unmerging, but members of this project could do a better cleanup than me... Matěj Suchánek (talk) 14:56, 12 November 2014 (UTC)

I'm unmerging the items. Meanwhile, could you explain to Milicevic01 why the merge wasn't the optimal solution? --Dereckson (talk) 15:16, 12 November 2014 (UTC)
Er... could you indicate the exact given names of these two items? I can't distinguish between the two Nikola. --Dereckson (talk) 15:20, 12 November 2014 (UTC)
@Dereckson: I have just realized that my organisation wasn't good either. Here is a summary of my thoughts:
Male given name Female given name Unisex given name Not known
enwiki, bgwiki, srwiki, slwiki plwiki cswiki, skwiki, nowiki, hrwiki, ptwiki
Q18247339 (mostly my bot added backlinks) Q15501913 (mostly my bot added backlinks) to be created ?

Matěj Suchánek (talk) 18:37, 12 November 2014 (UTC)

@Matěj Suchánek: I can tell you that hrwiki says Nikola is a given name and later lists both male and female variations of the name, and points out in which countries it is used as a male name and in which as a female name, so I guess it is unisex? I also added slwiki --Milicevic01 (talk) 20:07, 12 November 2014 (UTC)
well, in this case, personally, I would think only one item, with unisex given name (Q3409032), since it is exactly the same name, so one for male and one for female is one too many :) --Hsarrazin (talk) 20:57, 12 November 2014 (UTC)
Like Hsarrazin, I've also a preference to use one item per first name variant, but keep one item for the use of the same firstname in different languages, so I concur with a merge too. --Dereckson (talk) 23:41, 12 November 2014 (UTC)
"Nikola" seems to be less clear-cut than "Jean", so it seems hard to formulate a way how some may be differentiated. --- Jura 07:47, 13 November 2014 (UTC)
They mention men in pl-article too. --Infovarius (talk) 08:35, 16 November 2014 (UTC)

name in native language (P1559)

Congratulations! We now have a monolingual property for names. I believe that this property will not be as controversial as the earlier given name (P735), so I propose paying more attention to the new property. --Infovarius (talk) 21:32, 16 November 2014 (UTC)


family (P53) scope

Please see Property_talk:P53#Other_families. --- Jura 07:51, 21 December 2014 (UTC)

Language for given name items

Please see Wikidata:Property_proposal/Generic#Language. --- Jura 07:51, 21 December 2014 (UTC)


Name items

There are some items that link to articles that describe both the use of a name as a given name and a family name (example: en:Alonso).

These articles are different from articles that just describe a surname and mention that it's being used as a first name too.

Currently, two ways are suggested for these:

(A) See the approach used in the examples Bruno (Q955175) or Patrick (Q4927850):

(B) Alternate solution:

  • (B) doesn't have the advantages and inconveniences of (A).

Maybe there are other ways to solve this. --- Jura 10:30, 21 December 2014 (UTC)

I would use (B) and add has part (P527) on the general item, with the family name and the given name, so we can still find these easily. --Harmonia Amanda (talk) 11:57, 21 December 2014 (UTC)
Sounds good. --- Jura 22:23, 29 December 2014 (UTC)

Wikidata:Project_chat#Statement_with_qualifier_.22applies_to_part.22_.28P:P518.29_.22Russian_Wikipedia.22

You might want to participate in the discussion, it's about the item "François". --- Jura 22:23, 29 December 2014 (UTC)


WikiProject Names in "Class Instance Analysis"

From Wikidata:Project_chat#Class_Instance_Analysis:

  • There are 40k+20k names: 40038 family name, 10320 given name, 5569 male given name, 4828 female given name.
    • Due to the good efforts of the WikiProject "Wikidata names", these items provide valuable information on names themselves, eg variations, male/female correspondences, etc.
    • This can probably be used for disambiguation or for generating language-specific name variants, but we have not investigated this topic

Nice :) --- Jura 16:40, 28 January 2015 (UTC)

Aliases for non-English characters

Generally, names of languages in Roman script will not have aliases. James (Q677191) would not have the alias "Jim", for example. However, in English it would seem to still make sense to do the normal aliasing to handle accented or other special characters not normally used in English. For example, José (Q2190619) would have the alias "Jose". In these situations, both accented and non-accented usage are widely used in references. Is there any reason this should not be permitted? Josh Baumgartner (talk) 19:11, 23 March 2015 (UTC)

The alias for mere ease of finding the accented version seems fine, but it probably shouldn't replace the un-accented version (item) entirely. --- Jura 19:17, 23 March 2015 (UTC)
Agreed, where there are both accented and non-accented names in use, both should have their own items. Josh Baumgartner (talk) 21:14, 27 March 2015 (UTC)
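
A possible way to generate such unaccented aliases is sketched below, using only Python's standard `unicodedata` module. It is a minimal example, not a description of any existing bot: `unaccented_alias` is a name invented here, and the approach only removes combining marks, so letters without a Unicode decomposition (such as "ł" or "ø") are left unchanged and would need a separate mapping.

```python
import unicodedata


def unaccented_alias(label):
    """Return `label` without combining marks, or None if nothing changes."""
    # NFD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    stripped = unicodedata.normalize("NFC", stripped)
    return stripped if stripped != label else None


for name in ["José", "James", "Paweł"]:
    print(name, "->", unaccented_alias(name))
```

Returning `None` when the label is already unaccented avoids adding an alias identical to the label; whether the unaccented form also deserves its own item, as discussed above, is a separate decision.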

Aliases for non-Roman script

Q4925477 with the label "John" (in English) has "John" as alias for ru and be.

Based on this, I set a similar alias for the more frequent names (I think I limited it to names with > 1000 uses). Shall I do the same for other languages? Samples: ja or zh? --- Jura 10:34, 15 April 2015 (UTC)


More given name (P735) than date of birth (P569) on P31:Q5

Finally! 1,825,881 compared to 1,821,876.

From the department of pointless statistics ;) --- Jura 17:30, 18 April 2015 (UTC)


Top-down approach

Notified participants of WikiProject Names

Following a bot request by @Sascha:, I finally tried to write down what could be such an approach (section at Wikidata:WikiProject_Names#How_to_clean_up_given_name_items_.28top-down_approach.29, previously empty).

Please suggest more. --- Jura 17:36, 21 April 2015 (UTC)


Japanese names

Notified participants of WikiProject Names

Where to start? They are currently filling up the newly created Wikidata:Database reports/Most linked disambiguation page items (mostly surnames) and some of the constraint reports.

Personally, I had been avoiding them as there can be an issue with the sequence of family name/given name. Maybe this isn't so much an issue as English labels might be using the standard order. --- Jura 03:41, 6 May 2015 (UTC)

Looking at some of the names, there seem to be exceptions. Maybe it's mainly for persons who don't have an article on en.wikipedia.org. --- Jura 06:25, 13 May 2015 (UTC)
Among the items, there seem to be quite a few like Q6360446 that are marked as disambiguation, but only link to items for the given name. --- Jura 06:47, 19 May 2015 (UTC)
There are items like Wikimedia disambiguation page (Q4167410) that are disambiguations on Wikipedias, but they really refer to one name (Mochizuki/望月 in this case). So I think these items should have instance of (P31) => Wikimedia disambiguation page (Q4167410) removed. —Wylve (talk) 14:23, 19 May 2015 (UTC)
I fixed a few of those manually, but given the number of them (several hundred used on 12000+ items), it might be easier to replace Wikimedia disambiguation page (Q4167410) with given name (Q202444) for all Japanese names there. --- Jura 09:05, 21 May 2015 (UTC)
Yes, I tried to correct them manually but it seems like it's only names and not "real" disambiguation pages. We will have less work correcting given name (Q202444) to Wikimedia disambiguation page (Q4167410) than Wikimedia disambiguation page (Q4167410) to given name (Q202444), I think. --Harmonia Amanda (talk) 11:42, 21 May 2015 (UTC)
Ok, I went ahead and changed the following ones: Q10312977, Q3814192, Q8060712, Q3194472, Q2875045, Q7827665, Q376265, Q1906740, Q3199149, Q5752764, Q3566615, Q3453970, Q6381801, Q1618042, Q8049964, Q6455143, Q7496332, Q6849888, Q11738555, Q8062787, Q5770234, Q7385588, Q7705328, Q5996341, Q3532641, Q5389041, Q7677975, Q6387191, Q8056124, Q6444650, Q6381390, Q7820325, Q7706920, Q7446720, Q6378245, Q7681648, Q5752722, Q1609557, Q7678557, Q5675603, Q6311404, Q9286649, Q8056012, Q6314440, Q6381549, Q6380666, Q5508575, Q7706863, Q7385648, Q6862794, Q6782534, Q6378087, Q7827685, Q5771368, Q4803426, Q3519203, Q7820113, Q6381477, Q5771436, Q5770644, Q7496317, Q6381402, Q6378214, Q8049934, Q7827748, Q5770464, Q6381748, Q3189975, Q5771423, Q7678610, Q1666942, Q6782731, Q5752312, Q7827655, Q6883309, Q8062778, Q7674305, Q7496343, Q6884537, Q6455275, Q5675731, Q8049958, Q7688335, Q7674394, Q6782538, Q7827733, Q7674265, Q6392056, Q7677045, Q7496310, Q7415457, Q5770589, Q7677432, Q7385640, Q6410131, Q5772304, Q7688273, Q7674762, Q7497855, Q6389407, Q585483, Q8060684, Q6782282, Q6410145, Q8056328, Q6884482, Q6782694, Q6455199, Q5770184, Q7827727, Q7820193, Q6875091, Q6849876, Q6381484, Q4700946, Q5772471, Q6837942, Q6381730, Q8050007, Q7678034, Q7496321, Q5349231, Q6782402, Q8060695, Q8062705, Q5349988, Q7385612, Q8056310, Q8056300, Q7333107, Q7310136, Q6782815, Q5349260, Q6782681, Q8056298, Q8050125, Q7827662, Q5752434, Q8050046, Q5771411, Q8056182, Q7674397, Q6444686, Q8050113, Q7827743, Q925943, Q7681463, Q6381738, Q5102673, Q7677144, Q7674401, Q7506285, Q5530781, Q6883744, Q6383869, Q4701238, Q7505114, Q7336794, Q6418996, Q8061641, Q7960775, Q7827814, Q7705352, Q6378053, Q7638831, Q7428553, Q6883306, Q3856896, Q6413889, Q1741213, Q8062766, Q8062743, Q8049954, Q7695023, Q5560380, Q7678596, Q3482314, Q5508600, Q7402978, Q9071518, Q5771401, Q6419012, Q5752501, Q8049961, Q7827677, Q7705294, Q7688280, Q7688277, Q2365258, Q7496340, Q7397616, Q6381784, Q8050117, Q7850222, Q7678172, 
Q7677968, Q4817756, Q6883198, Q5752454, Q6837934, Q6837904, Q6782546, Q6447424, Q5772430, Q7820328, Q7818898, Q7705310, Q3200459, Q9355297, Q7674398, Q5100103, Q6782395, Q5509874, Q8062648, Q7827680, Q7688265, Q7499483, Q6381422, Q8056132, Q6434171, Q7813589, Q5349770, Q7678655, Q7677437, Q5752491, Q7496350, Q7403211, Q7397505, Q8060763, Q8050037, Q7830674, Q7830637, Q7705317, Q4830936, Q7505006, Q6405993, Q6883384, Q6837949, Q7635744, Q225187, Q248718, Q7379297, Q7335057. About 250 for 9000 items. --- Jura 04:04, 22 May 2015 (UTC)
It's now down to about 2000 (here). The Japanese ones I left out are some that are marked "family name" and "disambiguation". Maybe for these just a second item needs to be created. --- Jura 04:51, 22 May 2015 (UTC)
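The bulk change described above (swapping the disambiguation class for the given-name class on a list of items) can be sketched as a small batch generator. This is only an illustration, assuming the tab-separated V1 command syntax of QuickStatements in which a leading minus sign removes a statement; the function name `retype_commands` is hypothetical, and the tool's help page should be checked before running any batch.

```python
def retype_commands(qids, old_class="Q4167410", new_class="Q202444"):
    """Build QuickStatements-style lines that swap one P31 value for another.

    Assumption: V1 syntax, tab-separated, with "-" prefix meaning "remove".
    """
    lines = []
    for qid in qids:
        lines.append(f"-{qid}\tP31\t{old_class}")  # drop "Wikimedia disambiguation page"
        lines.append(f"{qid}\tP31\t{new_class}")   # add "given name"
    return "\n".join(lines)


# Two of the items listed in the thread, used here only as sample input.
batch = retype_commands(["Q10312977", "Q7335057"])
print(batch)
```

Generating the batch as text first makes it easy to review the list by eye before anything is submitted.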
The problem with some of these items is that they are purely transliterations. For example, Rinko (Q7335057) refers to all possible combinations of kanji that are pronounced "Rinko" when spoken. The en.wp site link on that item already shows that "Rinko" refers to both 凛子 and 倫子. These two are different names with different meanings that coincidentally have the same romanization and pronunciation. If these name items are used on person items, they would lead to inaccuracy. I suggest that we fix these items by looking at the ja label. If the ja label is unavailable, we need to manually separate the romanizations into actual names. —Wylve (talk) 06:07, 22 May 2015 (UTC)
Yes. For Special:Search/Yuriko given name this is nicely done (currently 3 variants). In these cases, the Japanese spelling should probably go into the label, not the description. For Yuriko, we might want to create an undifferentiated item. --- Jura 06:15, 22 May 2015 (UTC)
I'm not so sure about an undifferentiated item. To me, it doesn't have any use. It can be used to link the three differentiated items together, but that would not be a semantic relationship, but a linguistic one. —Wylve (talk) 07:52, 22 May 2015 (UTC)
For Wikidata contributors, it has the advantage that they could use that if they don't know which other one applies (people like me). It may also be that, for example, an American has the given name "Yuriko" without relating it to any Japanese-spelled name, or at least none that is publicly known. Most given name pages at English Wikipedia would probably also be linked from such an item. --- Jura 07:59, 22 May 2015 (UTC)
I see. Anyway, I'll try to differentiate some items in the coming days. Which property should I use to link the differentiated items to their undifferentiated counterpart? —Wylve (talk) 08:28, 22 May 2015 (UTC)
Not sure, which one would you use? P:P527/P:P361 and P:P460 come to my mind. What do you think of adding the Japanese spelling to the label (such as "ゆり子 (Yuriko)")? For any person, we could probably add the undifferentiated item in P735 as well. Maybe we should create a P31 value for the undifferentiated ones. --- Jura 08:38, 22 May 2015 (UTC)

About the property, I'm reluctant to use said to be the same as (P460), since it raises the question: if two items are said to be the same, then why aren't the items merged? If they can't be merged, then they can't be the same, only similar. I don't feel that has part (P527)/part of (P361) are suitable either, since the items do not form parts of each other. They are merely related in that they have the same pronunciation. As to the label, it would certainly be convenient for editors, but I feel that labels should be confined to their own language. Alternatively, editors can enable ja in their babel boxes. About the new P31 for undifferentiated items, we first need to separate items with disambiguation sitelinks that only disambiguate between names from those that also disambiguate between places with the same spelling. The former can either use transliteration (Q134550) or given name (Q202444), depending on whether the name is also used in other languages besides Japanese. The latter should stick with Wikimedia disambiguation page (Q4167410). —Wylve (talk) 09:06, 22 May 2015 (UTC)

So which property do you plan to use? "different from" could be another option.
In the recent plan for Wiktionary (for 2020?), there is a new datatype, one that would have just one label for all languages. For given names, something along these lines could be an option as well. The labels are already mostly identical. The romanization would just need to go into a statement. --- Jura 15:10, 22 May 2015 (UTC)
Maybe we should try to add a section about this to the project page. --- Jura 07:03, 23 May 2015 (UTC)
I've given it some thought and maybe we don't need a property linking them. They are merely synonyms and we don't collect those until we roll out Wiktionary support. —Wylve (talk) 03:31, 24 May 2015 (UTC)
I'm afraid that if we don't link them together, people are likely to merge them or ignore that they exist. I wouldn't count on Wiktionary support as nothing is really certain about it (if/when/how).
We could create a dedicated property for the Japanese names. If at some point there is a Wiktionary feature for that, it can easily be converted to that, the information already being structured. --- Jura 05:46, 24 May 2015 (UTC)
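The split discussed in this thread (one romanization covering several Japanese spellings) can be illustrated by grouping spellings under their romanization and flagging the groups with more than one member. This is a minimal sketch on hand-typed sample data taken from the examples above; the function name `ambiguous_romanizations` is hypothetical and nothing here queries Wikidata.

```python
from collections import defaultdict


def ambiguous_romanizations(names):
    """Return romanizations that map to more than one Japanese spelling.

    `names` is an iterable of (japanese_spelling, romanization) pairs.
    """
    by_roman = defaultdict(set)
    for spelling, roman in names:
        by_roman[roman].add(spelling)
    # Keep only romanizations shared by several distinct spellings.
    return {roman: spellings
            for roman, spellings in by_roman.items()
            if len(spellings) > 1}


# Sample pairs drawn from the thread: two kanji spellings read "Rinko".
sample = [("凛子", "Rinko"), ("倫子", "Rinko"), ("望月", "Mochizuki")]
needs_split = ambiguous_romanizations(sample)
print(needs_split)
```

Each flagged romanization would be a candidate for an undifferentiated item, with one differentiated item per spelling.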

Constraint violations/P735

The report finally has a manageable size. --- Jura 07:04, 23 May 2015 (UTC)

The bad news is that there are still 4000 items mixing given names with disambiguations and/or family names (list). --- Jura 14:36, 23 May 2015 (UTC)
It seems that most are unused (3200) or rarely used (450), just about 100 to 300 need work. --- Jura 16:01, 23 May 2015 (UTC)

Korean versus Chinese surnames

Hello! I've recently been editing records of Korean persons, adding surname fields and such. I'm finding inconsistency between how different names are handled on Wikidata/Wikipedia, but I don't know enough about names to know why. For instance, the name "Lee" has one record for the Chinese name (Q686223) and one for the Korean name (Q13498149) even though they both use the same Chinese character. However, others, such as the Korean surname "Heo", are combined with the Chinese name ("Xu") with which they share a character. So if I add a surname field to a Korean person named Heo, it shows their surname as Xu (they are different pronunciations of the same character: like Japanese, Korean uses Chinese characters but with its own pronunciations). I was wondering if it's okay for me to create, say, a new record for the Korean surname "Heo", or will I mess up some existing master plan? Again, I don't know the reasoning behind the way it's currently handled so I don't want to go messing around with things without checking first. So I'd love any advice on how to proceed in a case like this. Thanks so much! Shinyang-i (talk) 01:57, 28 May 2015 (UTC)

The surname part hasn't been worked on much yet.
There may be no reasoning behind it, except that people like me wouldn't see the difference (I can't read either language).
If the names are different, it seems reasonable to have separate items, this even if in some languages, two different names are spelled the same. --- Jura 03:05, 28 May 2015 (UTC)
Thanks for the response! I think I will go ahead and create the Korean surname records, and if later on the records are deemed unnecessary they can be merged or something like that. There aren't actually that many Korean surnames even in existence, and even fewer that are common. I will also make some Korean given name records. I made a few then worried maybe I shouldn't be doing so. Is that okay, too? I can't fill in much info but at least they'd exist. Is there a place where editors should list any name records they create, for the sake of organization? Thanks again! Shinyang-i (talk) 01:04, 29 May 2015 (UTC)
I suppose it depends on how you prefer to work.
I think it would be worth creating separate items for each of the names on en:List of Korean family names. A few may already have distinct items, but some might have mixed this up with given names and other things. Does the Reasonator: List of people with the Korean family name Lee make any sense? Note you can display it with lang=ko as well. Running Reasonator on the Korean surname Lee obviously does the same.
I would create the items beforehand even if the list was much longer. It takes some time to work out how, but it can be done with QuickStatements. The list could also be copied over to Wikidata. Not sure which Wikipedia on list of Korean family names (Q5934917) has the most complete one. Depending on how the items are defined, "TAB" can generate a live list: List of Korean surnames. If you need additional properties for these names, they can be requested at PP/T (it might take some time though).
For English given names, I started adding properties based on automated queries like this. It doesn't work well for English language surnames as querying for the last part of the label is slow. As Korean surnames come first, using language "ko" and the name might work as well. --- Jura 04:43, 29 May 2015 (UTC)
Thanks for the great response! You've given me a lot to think about. I'm actually primarily working on people records, and the name issue has come up as part of that. For now I'm going through all the Korean surnames on ko-wiki, because I'm finding a lot of them are already on Wikidata, they just have no English labels or descriptions. I'm not in a position to request properties, as I can't add much to the records besides bare bones, ha ha. I'll check out some of the tools you've listed! Thanks again! Shinyang-i (talk) 06:30, 29 May 2015 (UTC)
For storing transliterations, it might be worth creating new properties (similar to P:P1721). --- Jura 10:51, 29 May 2015 (UTC)
One is already about to be created: Wikidata:Property_proposal/Person#McCune-Reischauer. --- Jura 18:51, 29 May 2015 (UTC)
Oh excellent, that's something that never even crossed my mind. The other major romanization system is Revised Romanization. Those are the two "official" romanization systems. However, in reality, there are often a number of other romanizations that are actually used by Koreans for the spellings of names. I've added them to the "also known as" fields to assist in searching, but are those the kinds of things that should be part of the "official record"? Shinyang-i (talk) 16:31, 30 May 2015 (UTC)
Actually, in thinking further, they definitely should be. You can't really have a complete record for a person without including all widespread spellings of his/her name, especially when used professionally. Mc-R is designed to assist people in knowing how to pronounce words correctly, hence the various accent marks and such; it's not really used in other contexts these days. Revised Romanization is "official", but I've rarely seen it used in practice for anything but names of major cities. Person name romanizations are officially whatever they registered with the government shortly after birth, and rarely follow any standardized romanization system. Usage in the media can vary based on circumstance and market. Yet all of these romanizations are relevant for notable persons. Example: 김재중 is the name of a famous singer. McCune-Reischauer: Kim Chaejung (used by no one); Revised Romanization: Gim Jae-jung (used by no one); romanization by the subject himself (on his website, album covers, etc.): Kim Jae-joong/Jaejoong/Jae Joong; typical romanization by Korean media: Kim Jae-jung/Jaejung/Jae Jung; romanization for the Japanese market: Kim Jejung; official spelling on birth certificate: who knows. So I think three properties are needed: one for Mc-R, one for RR, and one for "other commonly-used romanizations" or something like that. The record would be very incomplete without the latter. Shinyang-i (talk) 17:28, 30 May 2015 (UTC)
Personally, I prefer them in a formatted way, but currently strings in properties can't be searched easily. We can work around this by bulk-copying all string values into aliases once in a while.
I gave some more input in the Mc-R discussion, maybe it will be made available. If you plan on using the other two properties, I suggest you make proposals for them as well. I can help you convert current aliases into property values. --- Jura 04:38, 31 May 2015 (UTC)
Thanks for the input! Right now, aliases are inconsistent, with some records having many and others none. I was wondering if you know, from a technical point of view, whether there is a way to have "Jaejoong", "Jae Joong", and "Jae-joong" all be "seen" the same way, as this is part of what makes the number of romanizations for Korean names spiral out of control so much. Same for "Kim Jaejoong" versus "Jaejoong Kim". You may have already addressed this issue, but I am not at all knowledgeable about the different data types I've seen mentioned, which is another reason I can't make any proposals yet. I have to learn about all that. I'll watch for the availability of the new Mc-R property and see how that works out. Thanks again, I'm learning a lot. Shinyang-i (talk) 05:40, 1 June 2015 (UTC)

Not sure. I wouldn't worry too much about GUI issues. It tends to change and be adapted based on our use of features. If you'd like to bulk edit the aliases, the following might help: http://quarry.wmflabs.org/query/3846 You can edit the output and use it for QuickStatements. I will eventually change it to include names without any aliases. You can run it yourself by creating a new query on "Quarry". --- Jura 07:48, 1 June 2015 (UTC)
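On the question of treating "Jaejoong", "Jae Joong" and "Jae-joong" as the same spelling: outside of any Wikidata feature, one way to compare aliases is to reduce each to a normalization key. The sketch below is only an illustration under stated assumptions; the helper names `romanization_key` and `full_name_key` are hypothetical, and the family name is assumed to be known already (for instance from the person's surname statement).

```python
import re


def romanization_key(name):
    """Collapse spaces, hyphens and case so spelling variants compare equal."""
    return re.sub(r"[\s\-]+", "", name).lower()


def full_name_key(name, family):
    """Order-insensitive key: (family name, normalized given name).

    Assumption: `family` appears as a separate token somewhere in `name`.
    """
    tokens = [t.lower() for t in re.split(r"[\s\-]+", name.strip()) if t]
    fam = family.lower()
    if fam in tokens:
        tokens.remove(fam)  # drop one occurrence, wherever it sits
    return fam, "".join(tokens)


variants = ["Jaejoong", "Jae Joong", "Jae-joong"]
print({romanization_key(v) for v in variants})
print(full_name_key("Kim Jaejoong", "Kim") == full_name_key("Jae-joong Kim", "Kim"))
```

Genuinely different romanizations such as "Jaejung" or "Jejung" still produce different keys, so they would remain separate aliases.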

McCune-Reischauer romanization (P1942) was finally created. --- Jura 08:54, 16 June 2015 (UTC)

Hispanic surnames

Just noticed that the proposal is still up at: Wikidata:Property_proposal/Person#second_or_maternal_family_name_of_hispanic_name. Maybe some of the active contributors want to comment. It seems to draw comments primarily from people who don't intend to add statements anyway.

Not sure what it would have to do with double barreled English surnames, but go figure .. --- Jura 04:19, 28 May 2015 (UTC)

second family name in Spanish name (P1950) was created. --- Jura 08:55, 16 June 2015 (UTC)

New datatype: monostring item

Notified participants of WikiProject Names

For names, it could be helpful to have a datatype that displays the same label for any language. Currently, we edit items to ensure that items like Q677191 have identical labels. Transliterations could go into statements. Descriptions and aliases could still be different for each language.

The current Wiktionary proposal (3rd or 4th proposal?) has something in this direction, but doesn't allow sitelinks. As the need isn't dependent on Wiktionary and Wiktionary implementation doesn't have a timeline, one could just define this now.

What do you think of it? --- Jura 05:10, 29 May 2015 (UTC)

I think that label will be different anyway. E.g. for Q677191 it can be James, Джеймс, or Ջեյմս. And all are valid. --Infovarius (talk) 21:20, 29 May 2015 (UTC)
In the current model yes, but only one version is the actual name, others are just transliterations. --- Jura 04:32, 30 May 2015 (UTC)


2 million items for people with P735 reached

I just updated the statistics. It's now at 2,000,204. --- Jura 22:45, 31 May 2015 (UTC)

Related projects

The discussions below gather all debates to date, sorted by theme, in an order going from the most general to the most specific: portal, links with other wikis, articles, article content.

You are all invited to take part in these debates.

Best regards. --Guy Courtois (talk) 21:03, 8 June 2015 (UTC)

Debates about the portal itself

  • Projet:Anthroponymie/Débat sur la sensibilisation de la communauté Wikipédia à ce portail (raising the Wikipedia community's awareness of this portal)
  • Projet:Anthroponymie/Débat sur le périmètre du portail (scope of the portal)
  • Projet:Anthroponymie/Débat sur le titre du portail (title of the portal)
  • Projet:Anthroponymie/Débat sur la palette du portail (the portal's navbox)
  • Projet:Anthroponymie/Débat sur la notification de projet (project notification)

Debates about links with other wikis

  • Projet:Anthroponymie/Débat sur les liens avec Wikidata (links with Wikidata)
  • Projet:Anthroponymie/Débat sur les liens avec le Wikitionnaire (links with Wiktionary)
  • Projet:Anthroponymie/Débat sur les liens avec les autres projets sur l'anthroponymie des Wikipédia dans d'autres langues (links with anthroponymy projects on Wikipedias in other languages)

Debates about the articles

  • Projet:Anthroponymie/Débat sur les possibles regroupements des surnoms, prénoms et noms de famille au sein d'un même article (possibly grouping nicknames, given names and family names within a single article)
  • Projet:Anthroponymie/Débat sur le regroupement des variantes (grouping variants)
  • Projet:Anthroponymie/Débat sur le choix entre "nom de famille" ou "patronyme" (choosing between "nom de famille" and "patronyme")
  • Projet:Anthroponymie/Débat sur les liens entre les articles de "noms de famille" et les articles de "famille" (links between "family name" articles and "family" articles)
  • Projet:Anthroponymie/Débat sur les pages d'homonymie (disambiguation pages)
  • Projet:Anthroponymie/Débat sur les titres des articles de famille (titles of family articles)
  • Projet:Anthroponymie/Débat sur les surnoms (nicknames)

Debates about article content

  • Projet:Anthroponymie/Débat sur la structure type d'un article (standard structure of an article)
  • Projet:Anthroponymie/Débat sur les infobox (infoboxes)
  • Projet:Anthroponymie/Débat sur les possibles contenus encyclopédiques dans les articles (possible encyclopedic content in articles)
  • Projet:Anthroponymie/Débat sur les sources (sources)
  • Projet:Anthroponymie/Débat sur la catégorisation des articles et le choix des portails (categorizing articles and choosing portals)

Use of named after (P138)

Hi,

I tried something: using named after (P138) on Kiefer Sutherland (Q103946) and Kiefer Ravena (Q6405173). Is this the right way?

Regards, VIGNERON (talk) 22:22, 23 August 2015 (UTC)

I had tried the same at Q18002970#P735. --- Jura 06:27, 24 August 2015 (UTC)
Looks good to me except that I would like to see a reference for stuff like this that isn't obvious. Joe Filceolaire (talk) 13:42, 24 August 2015 (UTC)

Language property, replacement of language of work or name (P407) (for names only)

Please see the proposal at Wikidata:Property_proposal/Term#language_of_name. Please help save it from the controversy about what property to apply to works (unrelated to our topic). --- Jura 06:27, 24 August 2015 (UTC)


Given names: plateau at 72%

No P735

Sitelink      Items      of total
en            245,188    19%
ru            118,265    40%
ja            116,446    44%
zh             97,169    68%
fr             70,335    15%
de             68,575    11%
pl             42,287    15%
es             41,077    16%
pt             31,559    19%
nl             27,348    15%
ar             25,147
fa             23,881
eo              4,637    12%
any           802,752    28%
all, but *    445,761    17%*

* excluding [ja,zh,ru,uk,ar,fa]

While the quality of the items for given names improves, the coverage of items for people stagnates at 72% (items with P31=Q5 with P735 compared to all items with P31=Q5).

How should we go about taking this further? There are about 800,000 items that still lack P735.

Personally, I skip Japanese names, but even these only account for about 100,000.

As there are some items that won't ever have P735, maybe we are currently way beyond 72%. Some have already been excluded with "novalue" (sample: Q734717#P735).

I will try to reduce the number of items with links to Esperanto; maybe it will help me come up with a solution for the remaining ones. --- Jura 14:34, 2 September 2015 (UTC)
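The coverage arithmetic above can be made explicit with a short sketch. It only uses the figures from the table (802,752 items without P735, reported as 28% of the total); since that percentage is rounded, the implied total is approximate. The count of items that will never have a given name is not known here, so the `never_applicable` value in the second call is a purely hypothetical placeholder, as are the function names.

```python
def implied_total(missing, missing_share):
    """Estimate the total item count from a missing count and its share."""
    return round(missing / missing_share)


def coverage(total, missing, never_applicable=0):
    """Share of items with P735 among items that could have one.

    `never_applicable` counts items (part of `missing`) that will never
    get a given name and are therefore dropped from the denominator.
    """
    return (total - missing) / (total - never_applicable)


total = implied_total(802_752, 0.28)      # about 2.87 million people items
print(round(coverage(total, 802_752), 2)) # the reported plateau, about 0.72
# Hypothetical: if 100,000 of the missing items could never have P735.
print(coverage(total, 802_752, never_applicable=100_000) > 0.72)
```

The second call shows why excluding non-applicable items would raise the effective coverage above the reported plateau.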

family names/disambiguations messed up

if someone has time, please see Special:Contributions/114.191.246.187, the usual wrong edits by SU: checks/reverts/improvements needed (<5 % correct edits, majority to be reverted). Holger1959 (talk) 10:43, 6 September 2015 (UTC)

nearly all checked now, only a few left, see the ones marked with "current". Holger1959 (talk) 14:29, 6 September 2015 (UTC)
It seems to be the same as the one that is being followed on the admin board. --- Jura 06:16, 9 September 2015 (UTC)

"family name" versus "surname"

Should the description of "instance of family name" be "family name" or "surname"? It seems like no label is standard and it would be nice if one was. "Surname" is ok, but the plain English "family name" conveys the information more clearly, especially where not all readers will have English as a first language. The other property is "given name", so they should match. --Richard Arthur Norton (1958- ) (talk) 22:34, 10 September 2015 (UTC)

In some countries the family name is first. Using surname can cause confusion in those cases so I prefer family name. Joe Filceolaire (talk) 22:25, 12 September 2015 (UTC)
Should we migrate all of them from "surname" to "family name"? --Richard Arthur Norton (1958- ) (talk) 21:22, 17 September 2015 (UTC)

Groupings of first-name variants: occurrence statistics; and representation

Notified participants of WikiProject Names

Project members may be interested in these tables, which include occurrence statistics for families of first-name variants (ie François / Franz / Frank / Francis etc), as well as queries for multiple individuals with those names sharing the same dates of birth and death, for investigation.

The lists are derived from this query tinyurl.com/pv64gfg (female names), and tinyurl.com/p6xc6yv (male names) on the new SPARQL query service -- be aware that the query for male names may or may not run within the time limit for queries -- it's right on the limit of what is possible.

The queries count how many people have given names that are said to be the same as (P460) each name.

Note that this means that there are many returns for each first-name family, with the 'best inbound connected' versions of the name occurring first.

It would be nice to produce a query where each different family of first-name variants occurred only once. It may be possible to do this, but the present organisation of items makes it a lot less easy than it could be.

I wonder if it would be worth introducing some new items to represent such families -- for example, a new item like "given name cognate to James or Jacob", or "set of given names cognate to James or Jacob", that could contain the grouping information all in one place, and make the querying much easier.

One would then have either

James (Q677191) .. instance of (P31) .. "given name cognate to James or Jacob" .. subclass of (P279) .. male given name (Q12308941)

or

James (Q677191) .. part of (P361) .. "set of given names cognate to James or Jacob" .. instance of (P31) .. "set of given names" .. subclass of (P279) .. "set of names" .. subclass of (P279) ..

I think the first structure would be neater.

Such name-groups could also be nested, for example

Jørgen (Q13409273) .. instance of (P31) .. "given name cognate to Jurgen" .. subclass of (P279) .. "given name cognate to Georg or Jürgen" .. subclass of (P279) .. male given name (Q12308941)

What do people think: would this be a useful step forward? Jheald (talk) 14:29, 12 September 2015 (UTC)

Hello Jheald,
in fact, you highlight a problem that was noticed and discussed long ago, but not acted upon, since there were other priorities.
I don't know exactly what "given name conjugate to James or Jacob" means (English is not my 1st language), but I guess it's something like the 4th solution I had proposed: a top item that would allow joining all forms, for stats, interwikis, etc. Is that right? If yes, then you have my full support, of course :) (see above discussions about Konstantin, and Need for discussion…). --Hsarrazin (talk) 14:50, 12 September 2015 (UTC)
Sorry, I meant to write "cognate to" rather than "conjugate to", (ie "prénom apparenté à Jacques ou Jacob" ?), as a possible name for a single item relating to Jacobus, Giacomo, Jacques, James, Jakub etc.
It would make stats a lot easier, because one could just count for ?name_family where ?item wdt:P735/wdt:P31 ?name_family and ?name_family wdt:P279+ wd:Q12308941. This would be far more efficient than the query I've given above; and would have the advantage that each family would only show up once.
It wouldn't solve all of the problems for interwiki links (eg there could still be "Bonnie and Clyde" issues), but I think it could certainly help ease some of them. Jheald (talk) 15:38, 12 September 2015 (UTC)
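The counting pattern described above (follow P735, then P31 to a name family, and keep families that sit under male given name via one or more P279 steps) can be simulated on a few in-memory records. This is a toy sketch, not a query against the real service: the family classes such as "cognate:James/Jacob" are the proposed, currently non-existent items, each class is assumed to have a single parent, and all names in the code are hypothetical.

```python
from collections import Counter

MALE_GIVEN_NAME = "Q12308941"

# name -> proposed family class (stands in for wdt:P31 on the name item)
instance_of = {
    "James": "cognate:James/Jacob",
    "Jacques": "cognate:James/Jacob",
    "Jørgen": "cognate:Jurgen",
}
# class -> parent class (stands in for wdt:P279; single parent assumed)
subclass_of = {
    "cognate:James/Jacob": MALE_GIVEN_NAME,
    "cognate:Jurgen": "cognate:Georg/Jürgen",
    "cognate:Georg/Jürgen": MALE_GIVEN_NAME,
}


def is_under(cls, root):
    """True if `cls` reaches `root` through one or more subclass steps."""
    seen = set()
    while cls in subclass_of and cls not in seen:
        seen.add(cls)  # guard against accidental cycles
        cls = subclass_of[cls]
        if cls == root:
            return True
    return False


def count_families(given_names, root=MALE_GIVEN_NAME):
    """Count people per name family, one given name per person."""
    return Counter(instance_of[n] for n in given_names
                   if n in instance_of and is_under(instance_of[n], root))


people = ["James", "Jacques", "James", "Jørgen"]
print(count_families(people))
```

Because every name points at exactly one family class, each family appears once in the result, which is the property the P460-based queries lack.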
  • That's an interesting set of queries and summaries.
For related given names, currently Reasonator gives an overview. Samples: James or Jørgen.
Given that it took quite some time to simplify the structure to reach the current one, I'd hesitate to make it more complicated again. If you feel a need for grouping names, maybe a new property should be used instead. --- Jura 14:55, 12 September 2015 (UTC)
One of the problems is that said to be the same as (P460) is rather hard to maintain in the current structure -- there is no "master copy" for each group of names, which is why there are different counts for different variants of the same name in the query tinyurl.com/pv64gfg.
The difficult bit has been establishing different items for each different individual name spelling, which has been a mammoth job that everybody in this project deserves a lot of kudos for. What I'm suggesting wouldn't undo that, but would build on it; and I think it ought to be a fairly simple one-off bot job to create. Jheald (talk) 15:50, 12 September 2015 (UTC)
Well, sitelinks found on these items and the mixup with disambiguations is another issue.
I'm not sure if you can establish a strictly hierarchical link between anything that is currently listed in P460 and I don't think adding it into P31 makes things easier.
As it's merely for statistics, you might want to do an extract of the items and analyze this directly. --- Jura 16:13, 12 September 2015 (UTC)
Here's a query for the families of first-names, sorted starting with the largest number of variants: tinyurl.com/ok5ke4q. (Sorry, don't yet know how to turn the numeric QIDs back into items). I would propose making a new item "given name cognate to ..." for each of these name-families. That should be straightforward enough. Yes, there would still be room for discussion as to whether Jurgen-names were or were not a separable sub-class of George-names; but at least we would then have a structure in which we could analyse such questions. Jheald (talk) 16:25, 12 September 2015 (UTC)
@Jheald: I did the translation from QIDs to labels in the query for {{PropertyForThisType}} (see Properties for the class <human> for an example).
Thanks, I'll look into it, but out of time for now. Jheald (talk) 16:44, 12 September 2015 (UTC)
Version of query with labels added tinyurl.com/n9pfqno. Jheald (talk) 10:02, 13 September 2015 (UTC)
And a (slightly baroque) query to return the female given-name families, with their constituent given-names listed by occurrence frequency, tinyurl.com/nvopozw. Times out when I try to run it for male names, so any rewrites to make it more efficient would be very welcome. Jheald (talk) 12:47, 13 September 2015 (UTC)
Also, if anyone (@Jura1: ?) can see how to get only the most frequent name to be printed out for each group, this would be good. Jheald (talk) 13:00, 13 September 2015 (UTC)

@Jheald: This is a case of classification of names, so yes, I'd say it's perfectly appropriate to use instance of (P31) and subclass of (P279). Using part of is not. author  TomT0m / talk page 16:16, 12 September 2015 (UTC)

Would you have some reference to support your POV about the proposed classification? "Conjugate/cognate of James"? --- Jura 16:37, 12 September 2015 (UTC)
en:Jacob (name), en:James (name) ? :-) We aren't exactly over-run with references for all the said to be the same as (P460) claims. Isn't usefulness of the grouping enough? Jheald (talk) 16:44, 12 September 2015 (UTC)
That's two pages at Wikipedia, not one .. --- Jura 16:47, 12 September 2015 (UTC)
So maybe we also create sub-classes "cognate to Jacob" and "cognate to James" for smaller sub-families of the bigger super-family. That's something we can be entirely free to do, or to add at any point, if eg we find we have particular groups of sitelinks we want to place together.
Besides, how often are any other occurrences of subclass of (P279) or instance of (P31) supported by references? (A query could answer that question :-) ) Jheald (talk) 16:59, 12 September 2015 (UTC)
If you want to map items for statistical purposes, why not create a separate property for it? --- Jura 17:06, 12 September 2015 (UTC)
Because dividing things into smaller groups is exactly what subclass of (P279) is for (as per User:TomT0m above); and the names then fit very neatly into an ontological chain. Let me turn it round: Why are you against refining male given name (Q12308941) in this way? Class refinement in this way is a very Wikidata-ish thing to do (it seems to me). Jheald (talk) 17:14, 12 September 2015 (UTC)
What you suggest is very Wiktionary-ish. Things for which Wikidata isn't ready yet.
I think it adds unneeded complexity.
Obviously, I'm not the only participant of this project, if Hsarrazin wants to maintain it ..
Just scroll two topics back on this page to see what awaits you. --- Jura 17:27, 12 September 2015 (UTC)
I certainly wouldn't want to break any maintenance systems.
But presumably it would be very easy to adapt them to look for incidence in the subtree of male given name (Q12308941), rather than a direct instance of (P31) ? Jheald (talk) 17:33, 12 September 2015 (UTC)
Can you clarify your need? Maybe we can find a better approach? For merely finding duplicates, I don't think P735 is of much use. --- Jura 17:37, 12 September 2015 (UTC)
"I think it adds unneeded complexity". Jura, you've added unneeded complexity already! Now with your infinite P460 webs I frequently don't know which name to choose, because tens of them have the same Cyrillic spelling... --Infovarius (talk) 14:27, 14 September 2015 (UTC)
  • I have started a page at Wikidata:WikiProject Names/given-name variants for statistics and links, which should help support discussion. So far I have put up a table of female given names with more than 10 variants, with links to Autolist -- these are quite revealing, to show which languages have articles for which names. (Though I am finding Autolist very slow to return labels and links at the moment). Jheald (talk) 14:48, 13 September 2015 (UTC)
    • It's a bit like the reports such as User:Jura1/Italian first names. --- Jura 14:53, 13 September 2015 (UTC)
      • Thanks, that's a nice report. I hadn't seen it. But, as I wrote above, the most interesting thing I think, for our discussions here, is to see which languages have articles for which names. Jheald (talk) 14:58, 13 September 2015 (UTC)
        • Another interesting search: names of the Apostles: tinyurl.com/pj9fkts. More Bible people here: tinyurl.com/pn968td (though fewer than there should be, perhaps because not all of the Bible is yet marked as part of the Bible: tinyurl.com/qya4o88). Jheald (talk) 17:47, 14 September 2015 (UTC)

Must read : internationalisation of names

Members of this project will be interested in this mail posted by Nemo bis on the Wikidata mailing list: https://lists.wikimedia.org/pipermail/wikidata/2015-September/007165.html and especially the master's thesis linked in it: http://ulir.ul.ie/handle/10344/3450, which discusses the way to model names in international systems such as the Interpol (Q8475) database and the most efficient models to capture the naming of people in different cultures. As it seems that right now we mostly have procedures for handling Western naming, this seems especially interesting :) author  TomT0m / talk page 09:03, 19 September 2015 (UTC)

If it's something WMF employees must do, maybe you should send this to a staff only list. --- Jura 11:40, 27 September 2015 (UTC)

Surnames on enwiki

On enwiki there are many pages like this one: en:Adedoyin. The corresponding Wikidata items, e.g. Adedoyin (Q16479299) have in many cases two claims: instance of (P31)=Wikimedia disambiguation page (Q4167410) and instance of (P31)=family name (Q101352). Is it okay in such cases to remove instance of (P31)=Wikimedia disambiguation page (Q4167410) at least when there is no sitelink to another project? --Pasleim (talk) 14:37, 21 September 2015 (UTC)

In ru-wiki surname articles are mostly disambigs too. --Infovarius (talk) 21:29, 21 September 2015 (UTC)
Infovarius, Pasleim: I think the point of a wikiproject is that we can propose policies and I think this would be a very good policy to propose.
Where a wikipedia page lists people with a family name and related articles named after that family name
- people using the family name as a given name, places named after people with that family name etc. -
then wikidata should treat that article as a family name article and the corresponding wikidata item should have
the statement <instance of:family name> and not <instance of:wikimedia disambiguation page>.
Support as proposer Joe Filceolaire (talk) 22:45, 21 September 2015 (UTC)
Support --Pasleim (talk) 16:06, 26 September 2015 (UTC)
Oppose lacks sample. --- Jura 11:39, 27 September 2015 (UTC)
Support --Hsarrazin (talk) 23:56, 7 October 2015 (UTC)
Support --Sascha (talk) 03:29, 15 October 2015 (UTC)

Discussion

There's already a category Category:Disambiguation pages with surname-holder lists (Q8379354). Would the corresponding item to create not be "Disambiguation page with surname-holder list" ? Or does that cause problems with sitelinks, if some wikis have an article on the name, other wikis have a list of name-holders, other wikis have a mix of the two?

Maybe what we need is two items: one for the name, one for the list of name-holders -- with some sitelinks linking to appropriate redirects on the client wikis ? Jheald (talk) 22:39, 7 October 2015 (UTC)

en:Adedoyin isn't a disambiguation page. You don't find it in en:Special:DisambiguationPages because it doesn't contain the tag __DISAMBIG__. --ValterVB (talk) 18:17, 8 October 2015 (UTC) PS. It is a list of persons with the same surname. --ValterVB (talk) 18:20, 8 October 2015 (UTC)

Hi everyone. Over the past few days I have tried to clean up this mass of mixed surname and disambiguation items. There were more than 30,000 of this kind; currently there are still 28,000 left (see SPARQL query). To do this, I did the following with these items:

  • If there's only one sitelink, I removed instance of (P31) = Wikimedia disambiguation page (Q4167410) and changed the descriptions to "family name" (and its equivalents in several other languages) according to the policy above.
  • If there's more than one sitelink, I cannot say whether the Wikipedia articles are real disambiguation pages or all refer to the name only. Therefore I created a new item with instance of (P31) = family name (Q101352) and removed this claim from the old item. I did not touch any sitelinks, to keep the status quo of the linked Wikipedia articles. It might be that some of these articles are about the surname only and are therefore not real disambiguation pages, but this has to be decided "by hand" and by someone who speaks that language.
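The two rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `cleanup_action` and the returned action strings are hypothetical, and items without any sitelink are not covered by the rules, so the sketch sends them to manual review.

```python
DISAMBIGUATION = "Q4167410"  # Wikimedia disambiguation page
FAMILY_NAME = "Q101352"      # family name


def cleanup_action(p31_values: set, sitelink_count: int) -> str:
    """Return the cleanup step for one item, following the two rules above."""
    if not {DISAMBIGUATION, FAMILY_NAME} <= p31_values:
        return "skip"  # not a mixed disambiguation/surname item
    if sitelink_count == 1:
        # Rule 1: drop the disambiguation claim, keep the item as family name.
        return "remove-disambiguation-claim"
    if sitelink_count > 1:
        # Rule 2: move the family-name claim to a new item, leave sitelinks alone.
        return "split-off-family-name-item"
    # Assumption: zero sitelinks is not described above, so flag it.
    return "needs-manual-review"
```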

This resolved the disambiguation <-> surname mess for these items. However, today Infovarius raised concerns about this, so I want to discuss my solution before proceeding. Yellowcard (talk) 00:24, 12 November 2015 (UTC)

Family names are probably now where given names were a year ago. I'm sure you could work in a similar way. A first step could be to implement the result of the proposal above. --- Jura 08:05, 12 November 2015 (UTC)
Hi, Jura1, what do you mean by "Family names are probably now where given names were a year ago"? Where were given names? Regarding your last sentence: I'm implementing the proposal above for all items with exactly one sitelink. For items with more sitelinks, the articles in the various languages are usually quite different from each other: some refer only to the name, some also refer to places or terms. Therefore I don't change the sitelinks in items with more than one sitelink unless they are German and English, so that I can decide. Yellowcard (talk) 09:16, 12 November 2015 (UTC)
Have a look at the column with "[4]" (mixed given name items) at Wikidata:WikiProject_Names#Statistics. We spent quite some time trying to clean them up one by one, but a more top-down approach might work better.
With what is left after the above (removing P31:disambiguation from family names without disambiguation category at enwiki), you could just create new items with P31:family name and remove that P31 from the old items. Both items could be interlinked with the "different from"-property. --- Jura 14:20, 12 November 2015 (UTC)
@Jura1: Alright, sounds good. However, there seems to be a certain lack of consensus with Infovarius. Example: Roerich (Q2161495). I removed the claim "is a family name", as at least the German article is also about an asteroid and the English one is furthermore related to a museum and the en:Roerich Pact. So I created no label (Q21452679) and added "is a family name" as a statement. I just got reverted by Infovarius (he merged the items; Roerich (Q2161495) now again contains both claims, which is very obviously wrong). At the beginning he claimed I wasn't willing to discuss; now, however, it seems he does not take part in this discussion and reverts my changes instead. How to proceed? Yellowcard (talk) 16:52, 12 November 2015 (UTC)
I think several of us have had a similar discussion with him. As disambiguations are about titles with the same string sequence, I'm not sure if Cyrillic sitelinks should be on items with non-Cyrillic page titles. Obviously, disambiguation pages can include all sorts of things and we could be tempted to add properties for anything mentioned there. For given names, it's made clear that we won't use disambiguations in P735. Eventually items for disambiguations might even be replaced by a software feature. We do need to find a solution for names in different scripts (see possible solutions further down on this page), but this shouldn't have an impact on handling items for "Smith", "Jones", "Williams" etc. --- Jura 17:07, 12 November 2015 (UTC)
It seems to me that the problem is that ru.wikipedia is partly messy. Unfortunately, this messiness then tends to spill over to Wikidata. --Leyo 22:46, 12 November 2015 (UTC) PS. See ru:Рерих (51 unreviewed changes since April!) or ru:Служебная:Статистика проверок for more evidence.
The problem with unreviewed changes is specific for this surname (it has some problematic persons). --Infovarius (talk) 13:27, 16 November 2015 (UTC)
Given that some of the structure at WD comes from KrBot, I would be inclined to assume that they are highly structured :) .. it might just be that they are structured in a way different from other wikis (possibly preferring one sentence on a disambiguation page over one sentence in a stub). Besides, their context might get them problems other wikis don't have when writing "the sandwich is made of .. and can be eaten" --- Jura 07:33, 13 November 2015 (UTC)
OK, so I will continue my cleanup work, then. The idea of adding a statement with different from (P1889) to both items to express the difference seems to be a very good one. @Leyo: This is, by the way, a problem that occurred to me while preparing the automatic adding of family names to soccer players, so I want to solve it as well as possible before starting with the other job. I'm going to get back to you as soon as I'm done with this one. :) Regards, Yellowcard (talk) 12:59, 13 November 2015 (UTC)
Yes, Jura, you've guessed right that a page with one trivial sentence in ru-wiki rather gets "disambig" status (of course, if there are several pages with similar titles) than stub status. So most of the pages about names/surnames have some "disambig-like" template. I see that Template:surname in en-wiki looks the same as the Russian equivalent and is on pages with the same content, but (formally) it is not a disambig template while the Russian equivalent is. So I simply don't want to separate en/ru articles only because of the formal existence of the __DISAMBIG__ word. @Yellowcard:, look what you've done to Carradine (Q1044840): the de/en/ru pages are all about surnames only and all contain only persons, so why is it not a surname after all?? --Infovarius (talk) 13:27, 16 November 2015 (UTC)
Infovarius, it simply does not make any sense at all to have one item with the two claims "is a disambiguation page" and "is a surname". As I have explained various times now, these two things are completely different: a Wikimedia disambiguation page only exists in the Wikimedia world, while a surname exists out there in the real world. So it is clear that there have to be two separate items. That's what I did: I separated both, so there is now Carradine (Q1044840) for the Wikimedia disambiguation and Carradine (Q21482744) for the surname. The next question is which item the sitelinks fit better. I left it as it is; if you can decide that all the Wikipedia articles are about the surname only (I cannot), just move the sitelinks from Carradine (Q1044840) to Carradine (Q21482744). Yellowcard (talk) 14:04, 16 November 2015 (UTC)
I don't need the claim "is a disambiguation" - maybe we should go this way and simply remove all such redundant claims? But we should teach some bots not to add it again... In this case, if I move all sitelinks to the second item, what would it mean that an empty item "is a disambig"? --Infovarius (talk) 18:23, 16 November 2015 (UTC)
If all the linked Wikipedia articles are related to the surname, moving the sitelinks to the surname item would be the best solution. But we can only do this if all articles are related to the surname only. With articles in different languages, this becomes difficult to find out. But that would be the next step, then. Regarding the disambiguation items without sitelinks: I think there is no need for them. Probably they should be nominated for deletion (or redirect to the surname item)? Yellowcard (talk) 19:35, 16 November 2015 (UTC)
In such cases I don't understand the creation of a new item and just merge them back. Infovarius (talk) 10:05, 17 November 2015 (UTC)
  • @Yellowcard:, if you create a new item for surname then you should change values in existing items for persons with such surname too. Otherwise they will have claims "have surname <disambig>" which you are trying to avoid. --Infovarius (talk) 10:35, 18 November 2015 (UTC)
    • I think it could be done in batches. Sample (1) create all new items, (2) remove P31 from old items, (3) reset descriptions on old items, (4) move all old uses to new items. Constraint reports (or queries) can help identify (4). --- Jura 10:47, 18 November 2015 (UTC)
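The four batch steps could be sketched as generated command lines. This is a hedged illustration only: it assumes the tab-separated command format of QuickStatements (version 1) as commonly described (`CREATE`, `LAST`, `Len`/`Den` for labels and descriptions, a leading `-` to remove a statement), which should be checked against the tool's documentation before use. The function names `split_batch` and `move_uses_batch` are hypothetical.

```python
def split_batch(old_qid: str, label: str) -> list:
    """Steps (1)-(3): create the surname item, then clean the old item."""
    return [
        "CREATE",                                         # (1) new item
        f'LAST\tLen\t"{label}"',                          #     English label
        'LAST\tDen\t"family name"',                       #     English description
        "LAST\tP31\tQ101352",                             #     instance of family name
        f"-{old_qid}\tP31\tQ101352",                      # (2) remove P31 from old item
        f'{old_qid}\tDen\t"Wikimedia disambiguation page"',  # (3) reset description
    ]


def move_uses_batch(person_qids: list, old_qid: str, new_qid: str) -> list:
    """Step (4): repoint family name (P734) from the old item to the new one.

    Run as a second batch, once the QID of the newly created item is known.
    """
    lines = []
    for person in person_qids:
        lines.append(f"-{person}\tP734\t{old_qid}")
        lines.append(f"{person}\tP734\t{new_qid}")
    return lines
```

Splitting step (4) into its own batch reflects the fact that the new item's QID only exists after the first batch has run.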
Exactly, and that's what I originally planned. My cleanup work with the disambiguations is just something that popped up during my preparations. Please give me a little bit of time, I'm doing my best. Help is welcome, by the way. We have about 16,000 of these items left. Yellowcard (talk) 15:50, 18 November 2015 (UTC)
@Yellowcard: Well, in my watchlist I see items that were already mostly cleaned, such as Kelly (Q928249), which just needed some descriptions deleted and was already used as a family name, abruptly transformed into a disambiguation page (when Kelly (Q257429) already existed!) in order to create no label (Q21507150). So what now? Are we supposed to merge Kelly (Q928249) and Kelly (Q257429)? And move sitelinks and such? From what I see, the simplest way would be to delete the new item and finish cleaning Kelly (Q928249). Less work for everyone.
You know I agree with separating disambig/family names. But I really don't agree with creating errors along the way or creating empty new items when existing ones just needed one or two edits to be clean. --Harmonia Amanda (talk) 11:31, 19 November 2015 (UTC)
I agree that the temporary situation might not be optimal, but compared to the earlier one, I think it's minor. It's really easier to fix 20000 descriptions at once, rather than one-by-one. --- Jura 11:58, 19 November 2015 (UTC)
Oh? Thousands of edits to repair the system? I cleaned Kelly (Q928249) (less than 30 seconds' work). I will delete no label (Q21507150), which doesn't contribute anything (except problems). Don't break items that are already mostly clean and correctly used. @Yellowcard: should create disambiguation items and let the old items be family names, because dab pages are rarely, if ever, used in other items, whereas family names can be used many times. That would mean less work. Leaving items with false descriptions creates a huge mess. When s/he "corrects" only the P31 without also correcting the description, s/he ensures the item will be misused. If s/he really wants to continue like that, there should be guidelines: create a new dab item instead of a family name one if the item is already in use, correct the description so as not to confuse people in the meantime, and don't ever create a third item when a dab one and a family name one already exist! These empty third items will have to be deleted later; that's just stupid and confuses matters. --Harmonia Amanda (talk) 12:29, 19 November 2015 (UTC)
Well, if there is already a separate item for the family name, there is no need to create a second one (step 1 in my comment of 10:47, 18 November 2015). I don't quite see how step 2 increases the messy situation. --- Jura 12:55, 19 November 2015 (UTC)
@Jura1: I agree with you but it's not what @Yellowcard: is doing! --Harmonia Amanda (talk) 14:34, 19 November 2015 (UTC)
If there are a few duplicates, we can merge them afterwards. Works fairly well with QuickStatements. --- Jura 14:36, 19 November 2015 (UTC)
Hi Harmonia Amanda, thanks for your input. I'm confused by your statement that I create a new surname item when one already exists; this shouldn't be happening, and I run a SPARQL query for every single item to check whether there is already an existing surname item. Can you give me an example where this happened? I want to prevent this in any case. Regarding Kelly (Q928249): There were two disambiguation items. This is a rare and even messier case that I had not taken into account until now: I didn't expect that there could be more than one disambiguation item with the same label. In future, I will run another SPARQL query to also look for another disambiguation item. This issue shouldn't pop up again.
Regarding the other point (this discussion has become a little messy, so I'll try to separate the different arguments): Sometimes it will be more reasonable to keep the existing item as the surname item (remove the disambiguation claim) and create the new item for the disambiguation. This is good when the item is used multiple times on Wikidata, but it could cause problems regarding the Wikipedia sitelinks, as the linked articles (intended as disambiguation pages, not necessarily related to surnames only) are then connected to a surname item. This could be completely wrong. In any case, it is much more difficult to decide about the latter situation for lack of language skills. Therefore I planned to do it my way, keeping the sitelinks with the disambiguation item. The resulting constraint violations will be fixed in a second step: I will take ALL claims that have family name (P734) → disambiguation page and will repair these statements with the correct surname item. It is, as Jura said, much easier to do it in two batches than to check it in parallel. I'm going to be done with all this in less than two weeks. However, I have stopped my work for now and am waiting for your input. My approach might cause a few more edits than the other way, but on the other hand it will prevent errors from persisting after completion of the second step. Your suggestion could cause hard-to-detect errors regarding the sitelinks. Thanks again, Yellowcard (talk) 16:56, 19 November 2015 (UTC)
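The second step described here (finding family name (P734) claims that point at a disambiguation item) might look roughly like the sketch below. It assumes that the disambiguation item and the surname item were linked with different from (P1889), as suggested earlier in this thread; the names `surname_violation_query` and `plan_repairs` are hypothetical, and the query text is untested against the live service.

```python
from collections import defaultdict


def surname_violation_query() -> str:
    """SPARQL text: persons whose P734 value is a disambiguation item."""
    return """
SELECT ?person ?dab ?surname WHERE {
  ?person wdt:P734 ?dab .
  ?dab wdt:P31 wd:Q4167410 .
  OPTIONAL { ?dab wdt:P1889 ?surname . ?surname wdt:P31 wd:Q101352 . }
}
""".strip()


def plan_repairs(rows):
    """Split result rows (person, dab, surname or None) into two lists.

    A claim is repaired automatically only when exactly one surname item
    is linked from the disambiguation item; everything else is left for
    manual review.
    """
    candidates = defaultdict(set)
    for person, dab, surname in rows:
        if surname is not None:
            candidates[(person, dab)].add(surname)
        else:
            candidates[(person, dab)]  # register the claim without a candidate
    repairs, manual = [], []
    for (person, dab), surnames in sorted(candidates.items()):
        if len(surnames) == 1:
            repairs.append((person, dab, next(iter(surnames))))
        else:
            manual.append((person, dab))
    return repairs, manual
```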
@Yellowcard: No, the second Kelly item wasn't a disambiguation page; it was a family name item with some wrong descriptions. And you are creating items without the correct descriptions ("nom de famille" in French, "family name" in English, etc.), which would facilitate the work. I assume you create them with a gadget or some script, not manually, so you should add that. And of course it would help with the errors, because you can't have two items with the same label/description. When exactly do you plan to correct the sitelinks you break? For Russell (Q1158262) you left all the sitelinks on the disambiguation item when they should be with the family name. And you also left the aliases "*** (family name)" on what are now dab items. It's a mess.
You could:
  • Create a new family name, with labels, descriptions and aliases
  • Create a list of pages whose sitelinks should be verified (because I really don't want to go through several thousand disambig pages we had already mostly cleaned, just because you couldn't deal with the sitelinks)
  • Correct the descriptions and aliases: if now it's a dab page then the description should reflect that. Use DataDrainer to empty them of wrong descriptions and add after the correct ones.
Kelly isn't the exception. You want another one? Russell (Q1158262) was working as a family name and just needed a little cleanup (and has all the correct interwikis for a family name). Template:Q218032 was the disambig page. no label (Q21507870) is your new, useless family name. If I continue to poke into your edits, how many more items to delete will I find? --Harmonia Amanda (talk) 17:15, 19 November 2015 (UTC)

Per the example below that I wrote before noticing this discussion, if we have different items for the same family name using <instance of:family name>, <instance of:wikimedia disambiguation page> and also <instance of:family>, then I suggest linking them all together using P:P460. The reasoning behind this is that their topic is... the same family name, the rest being formatting. Place Clichy (talk) 12:23, 19 November 2015 (UTC)

@Harmonia Amanda: Let's work through your points one by one. "No the second Kelly item wasn't a disambiguation page, it was a family name page with some wrong descriptions." Can you give me the specific ID, please? Kelly (Q928249) was the typical disambiguation/surname mix, Kelly (Q257429) was a disambiguation page. Where is the item "Kelly" that was a surname? Yellowcard (talk) 17:31, 19 November 2015 (UTC)
Even if you make reasonable efforts, you can't avoid duplicates entirely. Labels and descriptions aren't necessarily complete and up-to-date. You might want to attempt to standardize some of the existing descriptions for family names. For given names, I had done a few queries with quarry. --- Jura 09:50, 21 November 2015 (UTC)
I add standardized descriptions in various languages (I took them from LabelLister) when I clean up the disambiguation/surname mixes. However, I'm only looking at the claims the item contains. If there's instance of (P31) = family name (Q101352), I consider it a surname, and if there is instance of (P31) = Wikimedia disambiguation page (Q4167410), I consider it an item for a disambiguation page. The descriptions are totally unreliable and are often mixed up (e.g. German "Begriffsklärungsseite" → disambiguation page and English "surname").
However, and this bothers me, Infovarius has evidently started to revert some of my work without responding on his talk page or giving any reason in the edit summary. So some items I had cleaned up are back in the disambiguation/surname-mix state. Annoying. Yellowcard (talk) 17:12, 21 November 2015 (UTC)
Yellowcard, you haven't answered my reply about merging, so I concluded that you agree with that. I am just correcting your redundant creations when there are no actual disambiguations and the item is about a surname only. And it's annoying that Jura, and now you, don't understand that you are creating problems for Cyrillic languages and are continuing to do that. --Infovarius (talk) 12:36, 22 November 2015 (UTC)

etymology

@Jura1: you reverted both my changes to the paragraph on etymology.

As the discussion on the proposed property referenced in that paragraph makes clear, adding etymologies in a structured way will not happen until Wiktionary is datafied. The current wording is misleading and should be changed. I think my wording was better.

My other change was to show what we can do using named after (P138) and the limits of that property. I think it was accurate and useful. Please don't revert stuff that improves the content. Joe Filceolaire (talk) 21:56, 7 October 2015 (UTC)

The proposal doesn't really exclude anything. New proposals can be made.
Do you have anything to support your claim that in one year (maybe more) Wiktionary will bring this in a structured way?
Do you intend to contribute to this project in one way or the other? We could use help building the items. --- Jura 22:00, 7 October 2015 (UTC)

Cyrillic - values for given name (P735)


How about an approach like: Q21103659? --- Jura 17:02, 14 October 2015 (UTC)

Another approach could be: Q21104340 --- Jura 18:36, 14 October 2015 (UTC)

@Jura1: Both appear to have the potential to be an awful lot of work. I'm really sorry I don't have a better idea. Your personal feelings? Jared Preston (talk) 19:05, 14 October 2015 (UTC)
I do not have a complete view of the P735 system, so I will simply mention some disadvantages. Q21103659 looks strange, because English and Chinese do not use Cyrillic characters; these languages have standard transliterations for names. The phrase "Don't use as single value in P735 for people born in the United States." from Q21104340 looks strange too, because human names cross countries in the general case. — Ivan A. Krestinin (talk) 19:34, 14 October 2015 (UTC)
A small issue we have now is that an Italian could be named "Ivano" and this would probably get rendered as "Иван" in Russian, but in English it's likely to remain "Ivano". Thus we have an item for "Ivano" with the same label in all languages with Latin script.
A Russian named "Иван" might have his name rendered as "Ivan" in English, but as "Iwan" in German. There is some debate about which item to use in these cases. Q830350 isn't entirely suitable. --- Jura 19:49, 14 October 2015 (UTC)
I understand the problem, but it is broader. Take, for example, the German name "Hanns". This name has at least 2 variants of Russian transliteration: Ханс and Ганс. Different German persons have the same German name but different spellings of the name in Russian sources. The claim <given name (P735)> Hanns (Q16276296) (with double Russian naming) is true of course, but it is inaccurate. So I am afraid that the property given name (P735) cannot be fully consistent in principle. — Ivan A. Krestinin (talk) 20:35, 14 October 2015 (UTC)
You have been warned. Now probably you should create a separate item for every single person's first name and every single person's surname, since the combination of transcriptions and translations in other languages can be different for every single person. Enjoy ;) --Shlomo (talk) 21:03, 14 October 2015 (UTC)
  • @Ivan A. Krestinin:: the item should get two values in P735: one for "Hans", one for the transliteration. --- Jura 07:13, 15 October 2015 (UTC)
... one for transliteration into each non-Latin language (>1000). --Infovarius (talk) 17:54, 15 October 2015 (UTC)
 :) hopefully, the same Hans wasn't described in all those languages --- Jura 14:39, 18 October 2015 (UTC)
  • @Jared Preston:: Q21103659 might be a bit less maintenance (depending on the bot), the datatype suggested above would make it easy. As Ivan mentioned, Q21103659 can look strange in Chinese, but it makes it clear what the name is. Transliterations would still be available. Q21104340 is closer to what is currently done, but has the disadvantages of that (a transliteration is displayed, but that might not be the one to look for in this specific case). With both, P1705 should be accurate. --- Jura 14:39, 18 October 2015 (UTC)
  • Shall we go with Q21104340? --- Jura 14:39, 18 October 2015 (UTC)
    It is not fair or symmetric to transliterate Cyrillic names into Latin script but not to transliterate Latin names into Cyrillic script. --Infovarius (talk) 15:29, 19 October 2015 (UTC)
    Both approaches can be made to be symmetric. --- Jura 06:18, 20 October 2015 (UTC)

Indeed it's fairer and circumvents the problem with different transliterations when we would use Cyrillic names. --Kopiersperre (talk) 14:37, 10 November 2015 (UTC)

The situation where a language (even en) has several articles for several transcriptions of the same name seems to be the exception rather than the norm. In some cases, these articles are almost empty and can be redirected without trouble. For this reason, I support having a single Wikidata item for different transcriptions of the same name, until at least one language has at least 2 articles for 2 different transcriptions. That way we can have en:Ivan (name)/de:Iwan (name)/ru:Иван/uk:Іван on the same item, which by the way is the current situation on Ivan (Q830350).
Remember, the purpose of Wikidata is also (some would say initially) to provide inter-language links between Wikipedia articles, purpose which cannot be served if these articles are linked to standalone Wikidata items (with zero or very few other Wikipedia links), whereas there are other existing Wikipedia articles that could very well be linked. Place Clichy (talk) 10:17, 23 November 2015 (UTC)
We already have Q19688630. So your suggestion is to merge "Etienne" and "Stéphane" and choose no special approach for other scripts, e.g. Cyrillic? --- Jura 15:38, 23 November 2015 (UTC)
  • Just to be sure: do you guys understand that more than one language uses the Cyrillic script (with different transliteration rules), and that even within one language there can be several traditional transliterations of a name and several systems of transliteration rules? For now Wikidata looks definitely unusable for Cyrillic languages. Artem.komisarenko (talk) 15:11, 14 January 2017 (UTC)

Atsuko

How to model the given name Atsuko? In Japanese, there are ~13 different ways of writing it.

Should Q9161797 be split into 13 separate items with labels like {ja:敦子, en:Atsuko}, {ja:篤子, en:Atsuko}, {ja:惇子, en:Atsuko}, etc.? We already have different items for {en:Sara, ja:サラ}, {en:Sára, ja:サラ}; and it’s hard to argue why 篤子/惇子 should be modeled different from Sara/Sarah. Thoughts?

If we split names that have different spellings in other languages, what would be the English description string for these items? With Sara/Sarah, the Japanese description indicates the Latin spelling for disambiguation, see for example Q603701. --Sascha (talk) 04:13, 15 October 2015 (UTC)

  • There was some discussion with Wylve further up on this page. Have a look at Kenichi. --- Jura 07:13, 15 October 2015 (UTC)

Merging and unmerging

Apparently, at the other end of Wikidata, Andrawaag and Sebotic have similar issues with items about genes: Mailing list: Data model explanation and protection.

Most participants of WikiProject Names seem philosophical about such occasional hiccups. Maybe there are some things that tend to work that we can share with them. --- Jura 11:41, 29 October 2015 (UTC)

It seems they figured it out in the meantime. --- Jura 00:39, 31 October 2015 (UTC)


Roman names

How shall we map Roman names? praenomen, nomen, cognomen, agnomen are currently partially mapped to P735/P734. Shall we create separate properties for these? --- Jura 11:37, 5 November 2015 (UTC)

Proposals are at Wikidata:Property_proposal/Term#Roman_names. --- Jura 12:13, 10 November 2015 (UTC)

Family/Family names/disambiguation pages

Hello, bonjour,

I would like to have the opinion of the project on the best way to link the following three items covering closely related subjects:

Currently Q1499207 (dab) and the empty Q21507177 (family name) are connected with a P:P1889 (different from) while Q21000667 (family) only has a P:P910 link to the category item Q10024382.

I believe that in such a case one option could be to link all three using P:P460, because for the reader these items cover pretty much the same subject; the nuance between the three is only the format in which each Wikipedia project would present the same information. Theotokis is just an example; the question arises whenever we have Wikidata items for a family name, the family with this name, and the corresponding dab page, which is often the most frequent type of page found on Wikipedia for a family name.

What do you think? Place Clichy (talk) 11:17, 19 November 2015 (UTC)

The type of the items is different (P31). Personally, I use P460 for items with the same or a similar type. To link disambiguations that tend to get mixed-up, I use different from (P1889). Not sure what the optimal link between family and family name items would be.
BTW, in the above samples en:Theotokis should probably go on Q21507177. If the name was "Dupont", one could imagine having hundreds of items for Q8436. --- Jura 13:01, 19 November 2015 (UTC)
@Jura1: I certainly would not support linking en:Theotokis to Q21507177, which has no other interwiki link. Remember, the purpose of Wikidata is also (some would say initially) to provide inter-language links between Wikipedia articles, purpose which cannot be served if these articles are linked to standalone Wikidata items, whereas there are other existing Wikipedia articles that could very well be linked. Place Clichy (talk) 10:17, 23 November 2015 (UTC)
We can't just add sitelinks to more or less related items just to generate more interwikis at Wikipedia. If links to related items are considered desirable, these need to be defined locally. Template:Interwikis from P460 (Q21529474) can help. --- Jura 11:34, 23 November 2015 (UTC)

Once more: Russian name

What can the project propose in the following case? We have a person, Sergey Markin (Q19910005), with the name "Сергей". When I want to add a name property, I see a bunch of items with that label: Сергей, Сергей, Сергей, Сергей, Сергей, Сергей and also the similar Sergius, Szergej (Q20188306). Almost all of the items are empty (without sitelinks). Which should I choose? This mess was created by participants of the project and is considered by some of them an advantage. In my opinion, it's a disaster. --Infovarius (talk) 12:46, 22 November 2015 (UTC)

I think we need to come to a conclusion with Wikidata_talk:WikiProject_Names#Cyrillic_-_values_for_given_name_.28P735.29. Maybe you could support one or the other solutions or propose another one. --- Jura 12:59, 22 November 2015 (UTC)
Unfortunately, I don't see any of these solutions as ideal, and currently I can't propose a better one. I'll try to start thinking about this by creating a list of requirements for the names model in Wikidata. Please give me a few days to do it (in parallel with real life). --Infovarius (talk) 13:06, 25 November 2015 (UTC)

Need help with list

Can someone help me with this list? I need a list of names where the Latvian (lv) label is not equal to the English (en) label. Some time ago I and some others added different labels in Latvian to foreign names (our language rules require this). However, this is not how it should be here in Wikidata. --Papuass (talk) 11:25, 23 November 2015 (UTC)
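One possible way to build such a list offline is sketched below in Python. It assumes the name items have already been exported as entity JSON, where each label is stored under `labels[<language code>]["value"]`. The function names, the placeholder ids and the sample labels are made up for illustration; this is not an existing tool.

```python
def labels_differ(entity, lang_a="lv", lang_b="en"):
    """Return True when both labels exist and their text differs."""
    labels = entity.get("labels", {})
    a = labels.get(lang_a, {}).get("value")
    b = labels.get(lang_b, {}).get("value")
    return a is not None and b is not None and a != b


def names_with_differing_labels(entities, lang_a="lv", lang_b="en"):
    """Collect (id, label_a, label_b) for entities whose two labels differ."""
    return [
        (e["id"], e["labels"][lang_a]["value"], e["labels"][lang_b]["value"])
        for e in entities
        if labels_differ(e, lang_a, lang_b)
    ]


# Made-up sample data; the ids are placeholders, not real items.
sample = [
    {"id": "Q-A", "labels": {"en": {"language": "en", "value": "John"},
                             "lv": {"language": "lv", "value": "Džons"}}},
    {"id": "Q-B", "labels": {"en": {"language": "en", "value": "Anna"},
                             "lv": {"language": "lv", "value": "Anna"}}},
    {"id": "Q-C", "labels": {"en": {"language": "en", "value": "Paul"}}},
]

for row in names_with_differing_labels(sample):
    print(row)
```

Items missing either label are skipped rather than reported, since a missing label is a different cleanup task from a mismatched one.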


Template:Interwikis from P460 (Q21529474)

This template at plwiki generates additional interwikis for names based on items linked in said to be the same as (P460): sample at pl:Paweł.

It only adds one interwiki for each wiki. The template uses the "interwiki" function in the module at Module:Patches (Q21529482) by Paweł Ziemian. --- Jura 12:39, 23 November 2015 (UTC)

I imported part of the module to Q21533309 and it can be tested at Wikidata:Sandbox/3 with values at Q15397819#P460. --- Jura 15:56, 23 November 2015 (UTC)
"One interwiki for each wiki" still not good solution. In sandbox there are awful things like bg:Юрий connected with ru:Джордж rather than ru:Юрий. --Infovarius (talk) 13:58, 24 November 2015 (UTC)
The question is if ru:Джордж is a reasonable match to "Sandbox/3", not if it's the closest match to "bg:Юрий". --- Jura 18:29, 24 November 2015 (UTC)
I added an "old-style" interwiki link to the sandbox. The module looks for them before adding automatic links. Paweł Ziemian (talk) 20:24, 24 November 2015 (UTC)
There is also Module:Interwiki (Q20819069) by Innocent bystander. The difference seems to be that it takes interwikis just from one item listed in P460. --- Jura 10:56, 25 November 2015 (UTC)
It takes interwikis from only one single item, yes. It takes them from the first "best statement", i.e. the first preferred-ranked statement if one exists, otherwise the first normal-ranked statement, but never any "deprecated" statement. It also allows you to choose the item used yourself via the parameter qid. -- Innocent bystander (talk) 11:57, 25 November 2015 (UTC)
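The selection rule described just above can be sketched in Python as follows. This is only an illustration of the rule as stated in the comment, not the Lua code of Module:Interwiki. The statement shape (a dict with `rank` and `target`) is a simplified assumption, and the second target id is a placeholder.

```python
def best_statement(statements):
    """First preferred-ranked statement, else first normal-ranked one.

    Deprecated statements are never returned.
    """
    for wanted in ("preferred", "normal"):
        for statement in statements:
            if statement.get("rank", "normal") == wanted:
                return statement
    return None


def interwiki_source(statements, qid=None):
    """Pick the item to take interwikis from; an explicit qid overrides."""
    if qid is not None:
        return qid
    chosen = best_statement(statements)
    return chosen["target"] if chosen else None


# Simplified, hypothetical P460 statements.
p460 = [
    {"rank": "deprecated", "target": "Q-OLD"},
    {"rank": "normal", "target": "Q6697451"},
    {"rank": "preferred", "target": "Q12382759"},
]

print(interwiki_source(p460))
print(interwiki_source(p460, qid="Q6697451"))
```

Because only one item is ever chosen, the result depends on statement order and rank, which is the difference from the "one interwiki for each wiki" approach discussed earlier in the thread.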

Wikimania 2016

Only this week left for comments: Wikidata:Wikimania 2016 (Thank you for translating this message). --Tobias1984 (talk) 11:49, 25 November 2015 (UTC)


Project focus for 2016

It seems that coverage and structure for given names has gotten fairly stable.

What shall we focus on in 2016?

Possibilities could be:

  • develop reference lists
  • expand items with additional statements (which properties?)
  • Roman names
  • family names
  • lists of most frequent first names

--- Jura 11:44, 22 December 2015 (UTC)

I'd like to see lists of most frequent given names in a particular language. Ham II (talk) 19:13, 21 January 2016 (UTC)
That would be a good addition indeed. Currently, we have two lists for the Netherlands.
--- Jura 07:31, 29 January 2016 (UTC)

Names as labels in as many languages as possible

Is there a tool for adding a person's name as a label in every language that uses the Latin alphabet? Ham II (talk) 19:17, 21 January 2016 (UTC)

@Ham II: Yes, NameGuzzler. --Harmonia Amanda (talk) 04:27, 22 January 2016 (UTC)
@Harmonia Amanda: Thank you Ham II (talk) 16:13, 23 January 2016 (UTC)
You could also use Add Names as labels (Q21640602) to copy all first name labels from "en" to another language.
--- Jura 07:31, 29 January 2016 (UTC)

Default label for given names

As we can now (or soon) use the language code "mul" (for multiple languages), we could use the native label property to store the default label for given names.
--- Jura 07:31, 29 January 2016 (UTC)

Revise "How to clean up a given name item"

As the initial cleanup is done, I think we should avoid converting disambiguations to given name items. I'd remove that step from the "How to".
--- Jura 07:31, 29 January 2016 (UTC)

If you mean "don't mix a disambiguation page with a page on a name or surname", I agree. --ValterVB (talk) 19:45, 31 January 2016 (UTC)
It's about Wikidata:WikiProject_Names#How_to_clean_up_a_given_name_item.
When it was written we had many items mixing first names and disambiguations (see the stats: 7000+, now: only <100).
Given that this cleanup is now mostly done, we should avoid converting existing disambiguations to given name items.
This is to provide stable QIDs.
So if one or several links on disambiguation items are about first names, these should be moved to other or new items.
Empty disambiguations shouldn't be redirected to first name items.
--- Jura 10:12, 1 February 2016 (UTC)

Roman names (bis)

Notified participants of WikiProject Names,

Wikidata:WikiProject Names#Roman names is still empty; is there still no stable structure for Roman names? Can someone at least add the properties Roman praenomen (P2358), Roman nomen gentilicium (P2359), Roman cognomen (P2365), Roman agnomen (P2366) (if that's ok) and maybe some examples?

(FYI, I need it for Titus Flavius Postuminus (Q3529655) and Lucius Campanius Priscus (Q22813570)).

Regards, VIGNERON (talk) 10:12, 18 February 2016 (UTC)

Just do it :) -Ash Crow (talk) 11:48, 18 February 2016 (UTC)
Hi VIGNERON,
is [1] what you wanted? Or do you want more... if so, don't hesitate, go ahead!! --Hsarrazin (talk) 15:59, 18 February 2016 (UTC)
Thanks Hsarrazin. It's a good start but I'd like some examples too. I'm just beginning with my two items so I'm unsure how best to do it; for instance, is it mandatory to duplicate the Roman praenomen? Like Lucius (Q6697451) and Lucius (Q12382759) (by the way, there is no relation between these two; shouldn't we add said to be the same as (P460) and/or different from (P1889)?). Regards, VIGNERON (talk) 09:26, 19 February 2016 (UTC)
Personally, I would swear they are the same, even if one is Roman and the other is the form used in northern Europe, but it seems there are 2 enwiki links.
The final structure for first names is still under discussion, because of some of the absurdities caused by transliteration; at the least, a said to be the same as (P460) is the norm. I guess some are still missing… I added it on both items. If you discover other cases, don't hesitate to boldly add it ;) --Hsarrazin (talk) 08:31, 20 February 2016 (UTC)
for examples on how to use the properties for Roman name parts, I'll work on it later. Mom's visiting ;) --Hsarrazin (talk) 08:31, 20 February 2016 (UTC)
VIGNERON I have just completed the 2 items you gave as examples with Roman praenomen (P2358), Roman nomen gentilicium (P2359) and Roman cognomen (P2365). When the cognomen does not exist yet, an item has to be created: I created no label (Q22917052). On the other hand, Postuminus (Q16036395) already existed with a "disambiguation page" description, but its only link (frwiki) did describe it as a cognomen, so I changed the item in order to use it ;)
If you take on the Romans, good luck; the upside: there are very few given names… For the rest, there is work to do, and I think the Bulgarians in particular have created a good batch of them :) --Hsarrazin (talk) 18:55, 22 February 2016 (UTC)

different from (P1889)

Is "different from (P1889)" used anywhere with regard to names, e.g. on identically named "disambiguation pages", or to distinguish "given name" from "family name", or "female given name" from "male given name"? --Harry Canyon (talk) 20:28, 12 March 2016 (UTC)

If the question is where different from (P1889) is being used: it's being used for disambiguation pages, sample: Q16281827#P1889.
--- Jura 07:50, 13 March 2016 (UTC)

Parenthetical suffix under "Also known as"

The parenthetical suffix "Cato (given name)" is often found in the "Also known as" column. What should be done with it: leave it or remove it? --Harry Canyon (talk) 20:39, 12 March 2016 (UTC)

You shouldn't find them in German, please remove them there. It's only used in English. Search makes it sometimes hard to select a name.
--- Jura 07:50, 13 March 2016 (UTC)
@Jura1: You write that the suffix should remain only in English. But here you reverted my edit when I removed the French suffix. Why should the German suffix not stay as well? --Harry Canyon (talk) 07:35, 15 April 2016 (UTC)
" (given name)" is English, not German or French. So you shouldn't find as an alias in languages other than English.
--- Jura 13:58, 15 April 2016 (UTC)
@Jura1: In German the alias is "Hans (Vorname)". So the aliases in other languages should not be removed either. Here I have added an example in German. --Harry Canyon (talk) 23:02, 15 April 2016 (UTC)
If you think such aliases are useful for German, you might want to add them. Here is a list of items without such an alias in German. I think for English I added them to the 1000 most frequent given names. It's fairly easy to do with QuickStatements as it won't add the same alias twice.
--- Jura 11:50, 16 April 2016 (UTC)
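As a rough illustration of the QuickStatements approach mentioned above, the Python sketch below generates one tab-separated alias command per item, assuming the version 1 command form `Qid <TAB> A<language> <TAB> "alias"`. The helper name is made up, and the alias lists in the sample are invented for the example; they do not reflect the actual state of those items.

```python
def alias_commands(items, lang, suffix):
    """Build one QuickStatements (v1) alias line per item.

    items: iterable of (qid, label, existing_aliases) tuples.
    Items that already carry the alias are skipped.
    """
    lines = []
    for qid, label, existing_aliases in items:
        alias = f"{label} ({suffix})"
        if alias in existing_aliases:
            continue
        lines.append(f'{qid}\tA{lang}\t"{alias}"')
    return lines


# QIDs as cited on this page; the alias lists are hypothetical.
german_items = [
    ("Q666578", "Anna", []),
    ("Q15729076", "Hanna", ["Hanna (Vorname)"]),
]

for line in alias_commands(german_items, "de", "Vorname"):
    print(line)
```

The skip step is only a convenience for keeping the batch short, since the comment above notes that the tool itself does not add the same alias twice.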


Barbora (Q13537192)

Is it correct that the two items (Barbora and Borbora) were merged? Regards, Harry Canyon (talk) 16:03, 30 March 2016 (UTC)

The same question for Q6207220 and Q18121213. --Harry Canyon (talk) 16:25, 30 March 2016 (UTC)
Hello User:ValterVB, did you see my question before you made this change? I think User:Katzenfrucht has already made such changes frequently, see for example here. --Harry Canyon (talk) 20:18, 1 April 2016 (UTC)
@Harry Canyon: I don't see your question, where is it? Anyway, the problem is that we can't mix disambiguations with surnames, given names or lists of persons. I have split Jochen (Q6207220) because de and ja are disambiguations, whereas en is a page on given names, not a disambiguation. Barbora (Q13537192) isn't correct: only en and ja are disambiguations, the others are pages about the given name. --ValterVB (talk) 20:33, 1 April 2016 (UTC)
@ValterVB: That was the beginning of my question, see the one above it: whether this change by User:Katzenfrucht was right. Best regards, Harry Canyon (talk) 20:43, 1 April 2016 (UTC)
No, it's a wrong merge. I will fix it. --ValterVB (talk) 21:01, 1 April 2016 (UTC)
✓ Done Barbora (Q15938720), Barbora (Q13537192) and Borbora (Q893357)--ValterVB (talk) 21:11, 1 April 2016 (UTC)

Hanna

Are these two items different: Hanna (Q15729076) and Hanna (Q19664967)? --Harry Canyon (talk) 11:35, 15 April 2016 (UTC)

@Чаховіч Уладзіслаў: What is the difference? --Harry Canyon (talk) 11:48, 15 April 2016 (UTC)
Q15729076 (Ханна) and Q19664967 (Ганна) --Чаховіч Уладзіслаў (talk) 11:52, 15 April 2016 (UTC)
@Harry Canyon, Чаховіч Уладзіслаў, Jura1, Infovarius:, I cleaned up the descriptions of: Anna (Q22713652) (Анна), Anna (Q666578) (Anna), Hanna (Q15729076) (Hanna), Hanna (Q19664967) (Ганна), Hannah (Q1554377) (Hannah), Hanne (Q1575808) (Hanne), Ana (Q482671) (Ana). There are probably given names still missing. --Harmonia Amanda (talk) 09:29, 18 October 2016 (UTC)
So you mean "Ганна" as the Russian label for the Belarusian name. That would be wrong as a Russian spelling. So we have to take 2 different values as a name for many persons: one for the Russian spelling, a second for the Belarusian. So I suppose we end up taking different values as a name for every language, which will be a very large mess... --Infovarius (talk) 11:19, 18 October 2016 (UTC)
Infovarius, Belarusian, Russian and Ukrainian personal names are (with rare exceptions) translated between one another. The Russian Анна is Ганна in Belarusian, and the Belarusian (and Ukrainian?) Ганна is Анна in Russian. That is, 2 items will have the same label in the East Slavic languages (and in the case of Николай, Мікалай and Микола, 3), but different ones in other languages; here it is important to write the description correctly to avoid confusion. A much bigger problem is names that are not separated by language, which mislead users. --Чаховіч Уладзіслаў (talk) 11:54, 18 October 2016 (UTC)
Yes, they are translated. Just like all other names: they get translated somehow. And what follows from that? What Russian name should a Belarusian woman with the Belarusian name Ганна have? (Answer: it varies.) Specifically for these two names, we already have more than one item, and be:Ганна and ru:Анна are not connected. Besides, these names are transliterated into Latin script differently. So a conflict is inevitable. --Infovarius (talk) 14:44, 19 October 2016 (UTC)
Yes, the languages using Cyrillic will have to separate names not using the same exact string, exactly as we did for Latin languages. And yes, that will mean many people will have several given names, with the qualifier language of work or name (P407) when they had several official languages/given names (I think all Belarusian people had official Russian spelling of their names during USSR?). It's already how we do things for all Latin given names, and I see no reason why it wouldn't work for other writing systems. We already do this for Korean people later becoming American; their Korean name isn't the same item than the transliterated name, both are true, so both are listed (with start time (P580)/end time (P582) when we can). It's a mess when several organisational systems coexist. I'm not overly enthusiastic with the approach "a string = an item" but it's the one which was chosen and the one mostly used now. And no better idea was ever proposed, so we can at least try to implement this one properly. --Harmonia Amanda (talk) 12:51, 18 October 2016 (UTC)
I think "Hanna" as edited by Harmonia looks good. I assume the Belorussian name is transliterated correctly otherwise we would potentially have different strings in languages with Latin script. I'm a bit reluctant to include Russian names though. We keep getting problem reports about them and it seems that the current system works for names in Latin script, Japanese and Korean names, even Belorussian, but maybe we just need to see a clearly outlined solution for Russian ones before. Possibly, it's merely a problem about agreeing how to link ruwiki from such items.
--- Jura 11:45, 19 October 2016 (UTC)
The treatment of Belarusian is no better than that of Russian. User:Чаховіч Уладзіслаў now has a tactic of separating Belarusian names, and this creates problems for Soviet people (and for all Belarusians, at the least). Hieroglyphic names may have the same problems but I am not deep into them. Can anyone say whether there are situations where 1 Latin name translates into several Japanese/Korean strings? And the opposite? --Infovarius (talk) 14:44, 19 October 2016 (UTC)
@Infovarius: There is no entity that automatically translates names with magic unified rules. Usually a person immigrating into a country may take the name people call them, so I would say there could be one translation per person: how the person wants to be called, how people call them, which language is used in the country ... There are probably many, many examples. Take for example Beijing (Q956). Traditionally in France the city is called "Pékin", in English "Beijing" ... And this is for a major name. Imagine what it is like for the zillions of family names ... author  TomT0m / talk page 14:55, 19 October 2016 (UTC)
@Infovarius:, Li (Q686223), Li (Q17008106), Li (Q15283218), Li (Q11983876), Li (Q2233716), Li (Q3447118), Li (Q10910874), Li (Q13588410) are all transliterated "Li" (Li (Q770891)). Is that what you are looking for? Most Latin names will have several Japanese transliterations depending on the pronunciation, etc. --Harmonia Amanda (talk) 16:17, 19 October 2016 (UTC)
Ok, Latin-speakers, how do you feel about surname property in Yao Lee (Q1189731)? --Infovarius (talk) 06:34, 21 October 2016 (UTC)
Well thank you because the family name wasn't 莉 but 姚, and I wouldn't have corrected it without you linking it. And I don't see a problem (except this error). --Harmonia Amanda (talk) 07:38, 21 October 2016 (UTC)
Still I see that "Yao Lee has surname Li". --Infovarius (talk) 14:31, 24 October 2016 (UTC)
Yes. And? --Harmonia Amanda (talk) 16:24, 24 October 2016 (UTC)

Linking names item to their string

Hi people, for a while now I've been using Name in original language search to identify for sure how to write the name in a language, in a "monolingual text" datatype. Arguably the labels used in languages might be ambiguous. This edit: https://www.wikidata.org/w/index.php?title=Q22713652&diff=321920566&oldid=315798228 prompted me to start a discussion about that, because I can't explain the rationale of the author of the merge, and the resulting merged item seems like a big ball of mess to me. How do we solve this kind of problem in a clear and unambiguous way? author  TomT0m / talk page 10:42, 16 April 2016 (UTC)

  • It's a problem that users had in other fields as well (even biologists merging similar items in genetics). It can't be completely avoided.
Not too long ago, I added different from (P1889) with a qualifier to cut down on problematic merges (sample: Q666578#P1889). The property said to be the same as (P460) should also limit it. Obviously, users can still delete them.
native label (P1705) seems like a good solution. I think most of the time we should set the language code of monolingual string to the recently created "mul" (="Multiple languages").
--- Jura 11:50, 16 April 2016 (UTC)

Bernado

In this case Bernado is not a Russian/Belarusian name: no label (Q15731576) and no label (Q20823800). According to enWP and deWP the name is of Portuguese/Spanish/Italian origin, yet my merge was reverted. Can someone explain to me why the two names are different and should be treated separately? --Harry Canyon (talk) 16:46, 18 April 2016 (UTC)

(Sorry for English) @Harry Canyon: Because they are spelled differently in Cyrillic languages, and keeping them in 1 item makes a mess of the naming of specific persons. Please feel uncomfortable here, as all users of Cyrillic-script languages feel all the time with Jura's reforms. --Infovarius (talk) 10:32, 7 October 2016 (UTC)

Add labels from names item

Hi, this request needs input and is related to names : Wikidata:Bot_requests#labels_from_name_properties. author  TomT0m / talk page 16:54, 12 July 2016 (UTC)

Use of key event

Hi @Jura1:, I don't understand why you use significant event (P793) for the last two examples of the table, like "most frequent name in" and "authorizisation". My opinion is that this is better modelled as instance of (P31) statements, especially the first one, with a "begin date" qualifier. The rationale is that this becomes a special kind of name, one that comes to belong to the class of frequent names. But this status can come to an end. It will be easier to query the names that are frequent at the time of the query if the status has a "preferred" rank than to query the class of all names that have a key event of "entering the most frequent names" with no "no longer a very frequent name". The rank can be changed to "normal" when the name is no longer a member of the class. Did I miss something? author  TomT0m / talk page 10:12, 21 September 2016 (UTC)

I don't think P31 is suitable for statistics, nor is the use of preferred rank for selection among dozens of statements in P31. I don't think we actually use it that way anywhere. Anything needing a reference is probably not of much use in there either.
--- Jura 17:56, 24 September 2016 (UTC)
Using ranks is no less relevant here than in any other case, as it's one of their main uses; it's actually one of the purposes of ranks to highlight current data as preferred and put outdated data in normal. Statistics very often use or define classes, so I think instance of (P31) could be especially useful in that area. "Anything needing a reference is probably not of much use in there either."?? author  TomT0m / talk page 18:12, 24 September 2016 (UTC)

Aganippides (Q21548091)

Aganippides (Q21548091) is another name for Muses (Q66016). Which instance of (P31) should it get and how can it be connected to Muses (Q66016)? Thanks, --Marsupium (talk) 11:34, 29 September 2016 (UTC)

Name-gender: can we reach a consensus?

Sorry if (as an outsider) I bring up a topic that has already been resolved or is a non-issue, but while sorting out name objects imported from :slwiki, I inadvertently created some confusion yesterday, and I'd like to have a clear guideline for the future. The issue is labeling names as either male given name (Q12308941) or female given name (Q11879590) where the use depends on the culture. For example, Andrea (Q493293) is a male given name in Italy, but predominantly female in English-speaking countries, so the current solution of labelling it unisex given name (Q3409032) may not be accurate, because it isn't really gender-neutral, at least not everywhere. One solution could be to have two instance of (P31) statements (male and female) if the goal is to have a single object for equivalently-spelled names. Then, using perhaps language of work or name (P407), we could specify which case applies in which culture.

The last point brings me to the second problem. Judging by the discussion two years ago, people seem to prefer having as few objects as possible. But if we look at the name "Andrea", there's as many as three objects - Andrea (Q493293) as the base, which has part (P527) (1) Andrea (Q18177306) (male) and (2) Andrea (Q18177321) (female). I find this confusing, so I'd like to contribute to a better solution. Thoughts? — Yerpo Eh? 07:45, 18 October 2016 (UTC)

This needs classes of names specialized for languages, like "French (mainly) male name". This would occur naturally, however, if we consider that linguistic variants of a name are different (related) names, for example if they are spelled or pronounced differently and are derived from each other or from a common linguistic root. Then each of the name items could have its own property (@Infovarius: because of the discussion above related to Cyrillic names/spelling, this is one argument for having different items).
Naturally this solution requires checking whether we have sources for the derivation of names over time ... author  TomT0m / talk page 11:38, 19 October 2016 (UTC)
  • I don't see why we couldn't have several items for different names that are spelled the same way in Latin script, if there is a way to differentiate them (compare Santos (pt) and Santos (es) and Santos (mul); or Special:Search/Yuriko given name). We would have specialised properties for languages and native label, so we should make use of that.
    --- Jura 11:45, 19 October 2016 (UTC)
    • There's no need to have specialized properties, except maybe in a very few cases. We can use the same properties, as we would already have specialized classes; we can also definitely use generic properties with specialized values. author  TomT0m / talk page 12:24, 19 October 2016 (UTC)
So, if I understand correctly, we should make three objects (generic, male, female) for any name that is used for both genders and put interwikis under the generic one (analogous to the "Andrea" example)? — Yerpo Eh? 09:39, 20 October 2016 (UTC)
Not in general, it depends on the name. Santos has several as they have different origins and pronunciations.
--- Jura 09:57, 20 October 2016 (UTC)
Ok, but three for the same origin and pronunciation (if they are used for both sexes)? — Yerpo Eh? 10:59, 20 October 2016 (UTC)
No, just one. Andrea isn't used the same way in, e.g., Italian as in other languages.
--- Jura 11:03, 20 October 2016 (UTC)

One to many

For example, both Greenaway and Greenway are transliterated into Russian as Гринуэй. Where should we place ru:Гринуэй: in Greenway (Q16870299) or Greenaway (Q1261077)? --Infovarius (talk) 11:22, 18 October 2016 (UTC)
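To find other cases like this one, a reverse index from the target-language label back to the items can help. The Python sketch below is only an illustration: the function name and the flat `{qid: {language: label}}` input shape are assumptions, and the two entries are the items cited in the question above.

```python
from collections import defaultdict


def ambiguous_transliterations(labels_by_item, lang):
    """Map each label in `lang` that is shared by several items to those items."""
    index = defaultdict(list)
    for qid, labels in labels_by_item.items():
        text = labels.get(lang)
        if text:
            index[text].append(qid)
    # Keep only labels that point at more than one item.
    return {text: sorted(qids) for text, qids in index.items() if len(qids) > 1}


surnames = {
    "Q16870299": {"en": "Greenway", "ru": "Гринуэй"},
    "Q1261077": {"en": "Greenaway", "ru": "Гринуэй"},
}

print(ambiguous_transliterations(surnames, "ru"))
print(ambiguous_transliterations(surnames, "en"))
```

The index only detects the collision; deciding which item should carry the ruwiki sitelink remains the open question of this section.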

A how-to?

Could this project produce a how-to for adding more detailed information for names of people? For example:

  • Let's say I have a person in Finland. Often the name in official documents is written in its Swedish form. How should I mark that?
  • A person is known generally by the second first name, nickname, or this varies across sources. How could I keep the variants in a good way?
  • The last name is changed for various reasons. How are the reasons and dates attached?

and many more. I am less interested in etymology than having a solid lookup reference for that specific person. Apologies that I am bluntly asking this before reading all the documentation, but I would like to access the best practices in a "for dummies" format. Cheers, Susanna Ånäs (Susannaanas) (talk) 13:44, 5 November 2016 (UTC)

  • Maybe Q1124 can help you formulate one.
    --- Jura 13:54, 5 November 2016 (UTC)
  • I am confused. Do you mean the Q-item or the person? ;-) I will need to learn what you have worked on, but I cannot possibly start creating the how-to myself yet. Susanna Ånäs (Susannaanas) (talk) 13:59, 5 November 2016 (UTC)
  • Well, to be more precise: the properties and statements used on the item to describe the various names of the subject of the item. ;) As this project frequently focuses on compiling (first) name items, you won't find much information about it here.
    --- Jura 14:28, 5 November 2016 (UTC)

What do you think of this:

  • Anyone could write a (tricky) name or names of a person, explaining the whole history if needed.
  • Others could help modeling by adding all the needed properties and qualifiers for that name/person.
  • Finally that example could be placed under a proper heading, such as
  • Uses of maiden name
  • Uses of Spanish second family name
  • Uses of a patronym/matronym
  • Changes of names etc.

The topics could be briefly explained, at least the intended use of the existing name properties and how they are currently tackling all the tricky issues.

Cheers, Susanna Ånäs (Susannaanas) (talk) 14:44, 5 November 2016 (UTC)

  • Sure. Our current system isn't necessarily that consistent (see the open questions on Wikidata:Bistro). As naming conventions change by language and country, it might be easier to focus on specific combinations. Properties with item-datatype are somewhat different from the ones with string/text-datatype. For your application, the latter ones seem to be the more relevant. With the sample Q1124, you might be able to solve most of the questions. Ideally every person's item would have "name in native language" and, if different, "birth name".
    --- Jura 15:13, 5 November 2016 (UTC)
  • Yes I am being constantly a little confused by the name items versus names as text. I will face a similar modeling issue with historical place names for places. Is modeling names of people in general in the scope of this project? If, so, I could start a name clinic as a subpage, hoping that there will be people helping out! Do you think names for people and names for places should have a common project? --17:26, 5 November 2016 (UTC)
  • For places, I think you already started WikiProject Historical Place. Not sure if yet another one would help. For places, the question is mostly on which item a historic name would go, for people, it's generally just one item. I suppose we could split between item- and text-properties, but this might not make it easier for users if they have to check language/country specific sections in each, so I'd attempt to add both here.
    --- Jura 10:20, 6 November 2016 (UTC)

Icelandic people

Please see Property_talk:P734#Include_Icelandic_.22last.22_name.3F.
--- Jura 06:26, 17 January 2017 (UTC)

Salih

Notified participants of WikiProject Names,

Salih (Q1509192) was the page for the given name 'Salih' — which it still is in most languages. However, @Aboulouei1: has changed the English and French labels several times to 'Salah Pacha', on the basis (if I understand the French correctly) that Salih is a mistranslation of the Arabic name. My understanding is that even if the name is "incorrect" like this, if it's actually in use in this form, it should stay as 'Salih', with a said to be the same as (P460) to other versions, such as Salah (Q19882606). Is this correct? --Oravrattas (talk) 21:58, 27 January 2017 (UTC)

A very big problem

Notified participants of WikiProject Names

There is a very big problem with the project. Let's take for example José (Q2190619). "José" is used in Spanish and in Portuguese, but it is pronounced differently. So if you use Latin script, which is the most common, it's all good. But if you use Hebrew/Arabic/Bengali/Russian/Chinese then it's a very big problem, because in Hebrew the Spanish pronunciation is "חוזה" (Hoze) and the Portuguese one is "ז'וז'ה" (joje). Any solutions?--Mikey641 (talk) 14:42, 26 March 2017 (UTC)

Duplicating given names according to pronunciation looks acceptable to me: a generic José, given name, then specialized José, Spanish given name; José, Portuguese given name; José, French given name, etc. --Dereckson (talk) 18:20, 26 March 2017 (UTC)
Let's take into account that even the Spanish pronunciation differs between regions. Just another case that shows how bad the idea of name-related properties with item datatype is :( --Shlomo (talk) 06:55, 27 March 2017 (UTC)
For now, we handle these cases like all transliterations: all possibilities are listed as aliases in a language. See Alexey (Q29014670); the name is in Cyrillic, the English aliases are all possible transliterations. I would add IPA transcription (P898) as qualifiers for language of work or name (P407) actually, before starting to separate items. --Harmonia Amanda (talk) 07:35, 27 March 2017 (UTC)
@Harmonia Amanda: but it's very different: for all the English aliases of Alexey the pronunciation is the same, whereas for the name José the pronunciation is different in each language. Alexey won't need to be separated, because it's the same in each language, but José will have to be separated into Spanish and Portuguese.--Mikey641 (talk) 19:22, 27 March 2017 (UTC)
@Mikey641:, sorry, I wasn't clear. I didn't think it was the same problem, I thought it was a similar one: the same original-string name has several values associated with it (in one case, several transliterations; in the other, several pronunciations). As I said, I would suggest one item "Jose" (with one native label (P1705) with a "multilingual" code) which would have several values for language of work or name (P407), each with the qualifier IPA transcription (P898). That way we know that "Jose" is written the same in Portuguese and in Spanish and isn't pronounced the same. --Harmonia Amanda (talk) 13:22, 28 March 2017 (UTC)
Actually, I think that this is the same problem. Having the name Alexey (Q29014670) for some person, how do you know if it is Alexey or Alexei or Aleksey or some other variant? --Infovarius (talk) 14:45, 29 March 2017 (UTC)
I don't think it's exactly the same, because all the Alexey aliases are pronounced the same but for José that isn't the case. It is a different problem though.--Mikey641 (talk) 16:32, 29 March 2017 (UTC)
  • This is actually a frequent question. I think Santoz and Santosh, both spelled "Santos" are similar. We have three items for these. Yuriko is similar, but the other way round.
    --- Jura 05:45, 30 March 2017 (UTC)

Male given name[edit]

Notified participants of WikiProject Names
Harmonia Amanda, I'm opening a new section as this is not about pronunciation as above. I think that the solution of adding the name in the description field is wrong. Look at this example: József (Q17498051). The Hebrew name added is יוסף, but that is not the only possibility; it might also be ז'וזף. In Hebrew (and not only in Hebrew) every Latin name might have several suitable renderings. This is no solution, but I do have a solution for all cases: we should open a given name item for every name per alphabet, and should not allow adding translations into other languages.

For example, no label (Q29017247) is שלמה in Hebrew. It may be rendered as Shlomo (Q21069189) --> Shlomo in the Latin alphabet, or Solomon (Q18607853) --> Solomon in the Latin alphabet, or Shelomoh (Q29017272) --> Shelomoh, again in the Latin alphabet, and many more that can be seen in "said to be the same as" of no label (Q29017247). Adding שלמה as a Hebrew label shouldn't be allowed, and there is no need for it, because Solomon (Q18607853) in Hebrew might be שלמה or סולומון. Adding the label שלמה to the Latin name is wrong. This way every person might have a given name in the Hebrew alphabet and also in the Latin alphabet, and it might differ from person to person. When I open the property "given name" in Salomon Buber (Q982067) and write שלמה, which is his given name in Hebrew, I should get only one possibility that says שלמה with the description שם פרטי גברי, and if I write salomon (his name in the Latin alphabet) I should get only one option with the description "male given name".

On that matter, I don't know if it's possible, but to prevent anyone from adding the Hebrew label we should disable that option. Items for Latin-alphabet given names should have labels only in Latin-script languages; items for Hebrew-alphabet names should have labels only in Hebrew. If that is technically possible, of course, it would be best. Geagea (talk) 07:05, 29 March 2017 (UTC)

@Geagea: I'm not exactly sure what you mean, but I cleaned up your examples so we can better understand how the situation is handled right now (it's misleading to have items with incorrect descriptions as examples). When the language of the description doesn't use the same script as the name, the original name should be part of the description, so that we can differentiate. So no label (Q29017247) should have "male given name (שלמה)" as English description, and Shlomo (Q21069189) "שם פרטי גברי (Shlomo)" as Hebrew description. That way, when we are searching for a name, we can differentiate between all items that share the same transliterations based on the description. --Harmonia Amanda (talk) 07:21, 29 March 2017 (UTC)
And the correct name in the original language as alias in all languages with a different script, of course. --Harmonia Amanda (talk) 07:23, 29 March 2017 (UTC)
And all transliterations of a name in a language should be listed as aliases for that language. So for József (Q17498051), if יוסף is the chosen label (more frequent transliteration?) then ז'וזף should be an alias. But yes @Geagea:, we should never have two items with the same label and description with no means to differentiate between them, and the great thing is that it's technically not possible on Wikidata. So we just have to complete missing descriptions and aliases. --Harmonia Amanda (talk) 07:33, 29 March 2017 (UTC)
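The description convention described above can be sketched as a small helper, together with a check for items that would still be indistinguishable in a given language. Both function names are hypothetical, and the data is a plain in-memory list rather than anything read from Wikidata; it only illustrates why complete descriptions keep same-label items apart.

```python
from collections import defaultdict


def describe(base: str, native_name: str, same_script: bool) -> str:
    """Append the original-script name when the description language uses another script."""
    return base if same_script else f"{base} ({native_name})"


def find_clashes(entries):
    """Group item ids by (label, description) and return only the groups with more than one item.

    entries: iterable of (item_id, label, description) tuples for ONE language.
    """
    seen = defaultdict(list)
    for item_id, label, description in entries:
        seen[(label, description)].append(item_id)
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# English description of the Hebrew-script item, as suggested in the comment above.
english_description = describe("male given name", "שלמה", same_script=False)

# Hebrew view of two items that both carry the Hebrew label שלמה.
hebrew_entries = [
    ("Q29017247", "שלמה", describe("שם פרטי גברי", "שלמה", same_script=True)),
    ("Q21069189", "שלמה", describe("שם פרטי גברי", "Shlomo", same_script=False)),
]
clashes = find_clashes(hebrew_entries)  # empty: the descriptions differ
```

If the second entry were given the bare description instead, `find_clashes` would report both ids under the same key, which is the situation the descriptions are meant to prevent.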
I'm OK with your additions. My concern is about the list I get when I'm adding the name שלמה, not just about search. When I'm adding שלמה as a male given name in Hebrew, I expect to see only one option with the description "שם פרטי גברי". If you now add the Hebrew label שלמה to Shlomo (Q21069189) (which is misleading, as it might also be שלומו), I will see two items. And if you add שלמה to everything in the "said to be the same as" list, I'll get a list of 8 items with the name שלמה and the description "שם פרטי גברי". In my opinion, adding Hebrew labels to names in the Latin alphabet should be prevented by the software. Geagea (talk) 07:52, 29 March 2017 (UTC)
Uh, no @Geagea: you would not see 8 items with שלמה / שם פרטי גברי; you would see one of those and seven others such as שלמה / שם פרטי גברי (Schlomo), שלמה / שם פרטי גברי (Salomon), שלמה / שם פרטי גברי (Shlomo), etc. If the items are clean, the descriptions allow you to differentiate. --Harmonia Amanda (talk) 07:57, 3 April 2017 (UTC)
Harmonia Amanda, thank you for getting to the point.
And which one of them should I choose? Wouldn't it be easier to choose שלמה (Q29017247), in the Hebrew alphabet, for a person from Israel, without dealing with the Latin name?
I'm suggesting a simple way to solve it. If we add to a Jewish person their name in the Hebrew alphabet, that will necessarily be correct. We can also add their name in the Cyrillic alphabet (a different name item, in the Cyrillic alphabet) and in the Latin alphabet (a different name item, in the Latin alphabet). All of them are correct.
First, it is the correct way to add a name. שלמה is a Hebrew name in origin; why should the Latin alphabet have priority? Anyway, I think that none of the alphabets should have priority and all of them should be respected.
There is another problem with that kind of solution. The transcription from English to Hebrew might be correct, and maybe you can also add a correct transcription from English to Russian. But is the Hebrew-to-Russian transcription necessarily correct? What about Greek, Japanese, etc.?
Second, we should think about how to keep things simple. It is very simple to tell people to add a name based on its original alphabet and, if no such item exists, to create it. The database can grow faster if it is simple for people, and more of them can add given names. Geagea (talk) 10:33, 3 April 2017 (UTC)
Most people don't have multiple given names. If a person is Japanese, their given name is in Japanese. The romanized version of the same name which exists as an American given name isn't a correct value for P735. Yes, there are cases where a person would have the "same" name in different alphabets: when the person changed countries, for example. A Japanese person who later became an American citizen will have both their native given name and their Americanized one with start time (P580). But transliterations of the real given name are not expected values for given name (P735). We keep transliterations as labels/aliases because when the French Wikipedia has an article about a Japanese person, the name will be in the romanized form. Someone who searches for the Japanese given name needs to find it while typing the romanized version (and needs clear descriptions to know which one of the proposed items is the correct one). If a Russian person has as given name Alexey (Q29014670) (Алексей), then it's the only expected value, and not the dozens of possible transliterations which also exist in their own right (Alexei (Q19820298) (Alexei) can be a transliteration of Алексей but it's also a German given name). People named Алексей are not people named Alexei, even if the former can sometimes have their names written Alexei when transliterated. --Harmonia Amanda (talk) 10:51, 3 April 2017 (UTC)
Saying most people don't have multiple given names is not a solution. Most (maybe all) Jewish people not born in Israel have a given name in Hebrew together with another name.
See Yosef Haim Brenner (Q939732). He has a given name in Hebrew, Yosef (Q29051199). He has one name in Cyrillic: Йосеф. In the Latin alphabet (or romanized, as you mentioned) we have several options: Yosef, Josef, Yossef and Joseph. He was born in Ukraine, so he has a given name in Ukrainian, in Russian (it was part of the Russian Empire) and a Hebrew name. Wouldn't it be better to have one Hebrew-alphabet name and one Cyrillic-alphabet name? It can have several romanized names as mentioned above, and also the Arabic name يوسف, which can be יוסף or יוסוף in Hebrew. Hebrew-speaking users can add the Hebrew name, Russian speakers can add the Cyrillic name and Arabic speakers can add the Arabic name. All the different names can be connected to each other through "said to be the same as".
I believe I'm suggesting simple solutions which includes all possibilities. Geagea (talk) 11:58, 3 April 2017 (UTC)
As I don't even understand what your solution is @Geagea:, I'll have to disagree. From what I understand, you want to make it impossible to find names when searching for their transliterations, and I don't see any case where making correct items more difficult to find is a solution. We already have people using wrong items (cities, musical albums, etc.) as given names because they don't even think of searching beyond the first five proposed items; how is making it practically impossible for a French Wikidatian to find a Russian given name going to help us? You are not even clear about what the confusion is; all your examples are based on incomplete items: the obvious solution is to complete them. --Harmonia Amanda (talk) 18:57, 3 April 2017 (UTC)
  • #New_datatype:_monostring_item might have achieved what you are looking for. Eventually maybe new features for Wiktionary will solve it.
    --- Jura 05:45, 30 March 2017 (UTC)

Merge birth name (P1477) into name in native language (P1559)?[edit]

I have difficulties with properties birth name (P1477) and name in native language (P1559). Both are for persons only and of monolingual-text type. Do we actually need both? It feels strange that I am supposed to move (?) data from name in native language (P1559) to birth name (P1477) after a person has married, and add new data to name in native language (P1559) at the same time. Typically our data does not have to be treated like this and can stay in its property even if it “outdates”. An idea would be to merge birth name (P1477) (27k uses) into the more general name in native language (P1559) (228k uses) and use qualifiers such as object has role (P3831) name at birth (Q2507958) to qualify birth names.

  • Has this already been discussed somewhere?
  • What does WikiProject Names think?
  • I would be willing to officially propose a merge at Wikidata:Properties for deletion, if the idea has sufficient support here.

MisterSynergy (talk) 08:02, 14 June 2017 (UTC)

  • Where did you get the idea from that you are meant to move any statements?
    --- Jura 17:51, 14 June 2017 (UTC)
Huh, what else should I do?!
A person has a name which is stored in name in native language (P1559). This name becomes the “birth name” after a name change (e.g. after a wedding), and is superseded by the new name (e.g. “married name”). What to do now with the old name, now “birth name”? It needs to be moved to birth name (P1477) (or duplicated, even worse).
Or do I misunderstand P1477 completely? —MisterSynergy (talk) 17:59, 14 June 2017 (UTC)
You just add a new statement with a start date (and preferred rank).
--- Jura 18:04, 14 June 2017 (UTC)
Sample: If an item has P1559="Michelle Robinson", a new statement with P1559="Michelle Obama" could be added.
--- Jura 18:16, 14 June 2017 (UTC)
This is what I do right now. But what’s P1477 good for then? —MisterSynergy (talk) 18:27, 14 June 2017 (UTC)
For "Michelle LaVaughn Robinson".
--- Jura 18:38, 14 June 2017 (UTC)
@MisterSynergy:
birth name (P1477) is also good for people known under a name totally different from their birth name (pseudonyms, actors, fighters, etc.), and also for people who officially changed their name, whatever the reason for the change (migration, trans, religious, etc.) :) --Hsarrazin (talk) 09:07, 21 June 2017 (UTC)
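The workflow described in this thread (nothing is moved; a new name in native language (P1559) statement is added with a start date and preferred rank, while birth name (P1477) stays as it is) can be sketched with plain Python objects. The `Statement` class and the `add_current_name` helper are hypothetical illustrations, not part of any Wikidata tool, and the start date is shown only as an example value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    """A simplified monolingual-text statement (hypothetical structure)."""
    prop: str                        # "P1477" or "P1559"
    text: str
    language: str
    rank: str = "normal"             # "normal" or "preferred"
    start_time: Optional[str] = None


def add_current_name(statements: List[Statement], new_text: str,
                     language: str, start_time: Optional[str] = None) -> List[Statement]:
    """Add a new preferred P1559 statement without moving or deleting anything.

    Any P1559 statement that was preferred before is reset to normal rank,
    so only the newest name is preferred. P1477 statements are left alone.
    """
    for statement in statements:
        if statement.prop == "P1559" and statement.rank == "preferred":
            statement.rank = "normal"
    statements.append(
        Statement("P1559", new_text, language, rank="preferred", start_time=start_time)
    )
    return statements


statements = [
    Statement("P1477", "Michelle LaVaughn Robinson", "en"),
    Statement("P1559", "Michelle Robinson", "en"),
]
# Start date given for illustration only.
add_current_name(statements, "Michelle Obama", "en", start_time="1992-10-03")
```

After the call, the item holds three statements: the untouched birth name, the earlier P1559 value at normal rank, and the new P1559 value at preferred rank.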

Reorganizing the project page[edit]

Apparently some want to move parts of the content to subpages and/or remove it. Oddly undiscussed deletions are deemed consensual, while restores of such deletions are considered "non consensual". In any case, let's restore the version we had before and then check what should be moved.

Are we ok with the "to do list" to be moved to Wikidata:WikiProject Names/to do? Obviously we could mark some as done. Not quite sure if it's worth translating, but apparently this had been done.
--- Jura 10:31, 19 June 2017 (UTC)

I'm ok with the list of properties to be moved to a separate tab (subpage).
--- Jura 10:35, 19 June 2017 (UTC)
The obvious aim of this reorganization was to have an actually usable page, not one that is mostly three years out of date, incredibly heavy and unreadable on small devices. No consensual point was deleted; some complicated cases were moved to specific help pages, with a link to the generic help page in the new page. That comes from experience with new contributors (I regularly run workshops): overly specific information is intimidating when they are still trying to understand generic use. The only content which was summarily deleted is links to non-functional tools (such as Autolist) and very old subpages which haven't been updated since 2014. All the rest was only moved, not deleted. The "to do list" has no place on the home page; all its content was rewritten in Wikidata:WikiProject Names/Help, except for misleading entries (like saying that the label should be the same in all Roman languages, which is wrong when the item is about a Russian name, for example; it is true only for Latin-script names and should be explained in more depth). We should create more specific help pages, explaining clearly all the complex cases for which we have reached a consensus.
I'm really curious what consensual thing has actually been lost in the change. For example, you added that the priority of the project was given names. That may be your priority, but that's not mine and probably not the priority of the project as a whole. You added that we should separate Spanish and Portuguese names, when @Hsarrazin: and I have always stated our strong opposition to this practice (we can still debate, but right now it's certainly not consensual).
So, except for things which were added to the old home page without consensus, what did we lose? I know what we gained: an easy-to-read helpful page with links to go further as wanted. --Harmonia Amanda (talk) 11:14, 19 June 2017 (UTC)
It's not entirely clear which changes you proposed. I do think it's a good thing to have a concise summary on one page, but it's a bad idea to spread the explanation over too many pages. If you create separate subpages, you should link them as tabs. No, I don't think we should remove the samples from the explanation. I don't think it should be a problem to state what is or has been the project focus. The last discussion about this is at Wikidata_talk:WikiProject_Names#Project_focus_for_2016.
--- Jura 11:29, 19 June 2017 (UTC)
Not all subpages are meant to be tabs. For detailed explanation, it's better to have a link in the relevant page section(s). For the project focus, you link to a discussion where you ask for a project focus and get no answer on that (and someone asks for a stats table, which is not a project focus), and for 2016. We are in 2017. Last, you reverted the page to the June 1st, 21:42 CEST version, stating that it is the last consensual version. Please point to the discussion that defines a consensus about this version. -Ash Crow (talk) 11:49, 19 June 2017 (UTC) Edit: @Jura1:, sorry I forgot to ping you. -Ash Crow (talk) 12:01, 19 June 2017 (UTC) - And with the correct name, it will work better >_< -Ash Crow (talk) 12:02, 19 June 2017 (UTC)
The project focus was stated that way for quite some time, but we haven't revised it (and no one requested it). I fail to see the discussion about the consensual changes. Was this an offline discussion you participated in? The only discussion I was involved in myself was on Hsarrazin's talk page. Maybe another ping you forgot, or one you inserted in a way that it can't work?
--- Jura 17:50, 19 June 2017 (UTC)
I propose a page with only three tabs (home, properties and reports/queries), a clear presentation of the project stating that it's complex and that many complicated cases need their own help page, listing the few basic principles (only those which are without a doubt consensual, nothing controversial at all in this section), listing the most used tools and gadgets so people know how we operate, and then listing the numerous specific subpages the project has (or should have; we should create more help pages). The result would be a page light enough to be readable, not too large so it can be viewed on small devices such as phones (I would prefer only two tabs actually, but both properties and reports should be easily accessible, so…), with clear sections so everyone knows where to find information.
I would structure /Help as a list of specific help pages, and I maybe would create a page about all "non consensual practices" which some contributors want and some strongly oppose, which would link to all relevant discussions here. But the actual home page of the project would only list practices we all agree with, which is totally not the case in the old version. --Harmonia Amanda (talk) 12:14, 19 June 2017 (UTC)
I think it's going in the right direction, but there is just a risk that you are splitting things up too much. Removing samples from the explanation isn't ideal either (I don't mind if you change or rotate them).
Personally, I like the general structure of WikiProject Movies:
  • A home page
  • a page that explains how it's done (with properties and more). Currently you seem to split this between "home", "properties" and "help" for WikiProject Names.
  • a page with tools and reports (to build things). Part of the procedural "help" on "help" for WikiProject Names might fit to this.
  • some statistics (most of those on WikiProject Movies is automated).
  • and a page with actual reference lists generated through the project (called "lists", not "reports" as for WikiProject Names).
It also has "new" tabs, but these aren't essential.
Wikidata:WikiProject Czech Republic has a similar structure. Note the difference between maintenance queries and showcase queries. I don't think it matters whether these are lists or queries ..
--- Jura 17:50, 19 June 2017 (UTC)
What would be the advantages of combining reference lists with pure maintenance reports?
--- Jura 09:57, 24 June 2017 (UTC)

Tabs[edit]

Ok, vote time: who wants which version?

  1. [2] clean tabs with three links: home, properties, reports and queries
  2. [3] clean tabs with four tabs: home, maintenance reports, reference lists, queries
  3. [4], bad code, three tabs: home, maintenance reports, reference lists

I didn't think that correcting bad code and updating the links would be controversial but if it is, then vote! --Harmonia Amanda (talk) 21:18, 19 June 2017 (UTC)
Notified participants of WikiProject Names

  • 1 --Harmonia Amanda (talk) 21:18, 19 June 2017 (UTC)
  • 1 -Ash Crow (talk) 21:20, 19 June 2017 (UTC)
  • 1 reports and queries are the same type of info - can be grouped on same subpage, with 2 sections... ref lists are queries too... --Hsarrazin (talk) 21:21, 19 June 2017 (UTC)

Flow[edit]

Notified participants of WikiProject Names

Ok, today I had trouble with pings AND an edit conflict, both of which would have been easily averted if this page used Flow, so I propose that we vote on it.

To sum up the advantages:

  • no edit conflicts;
  • automatic sorting of discussions, the most recent appear first;
  • easy archive management;
  • it is possible to follow particular topics, and get notifications for them;
  • no risk of forgetting to sign;
  • date/time is not set in plain text, so everyone can adjust it to their local time;
  • automatic indentation.

Flow has been successfully in use on the French village pump for six months now, without any downside. Ping @Trizek (WMF): for WMF follow-up on this ;) -Ash Crow (talk) 21:31, 19 June 2017 (UTC)

  1. Support -Ash Crow (talk) 21:31, 19 June 2017 (UTC)
  2. Support (even if I don't like Flow very much, but it's efficient for notifications :) --Hsarrazin (talk) 21:34, 19 June 2017 (UTC)
  3. Comment Is it correct that you can't search comments made on such pages?
    --- Jura 21:35, 19 June 2017 (UTC)
    First, we need to define the search operation. With Flow, you can use Ctrl+F on the page, and results appear in external Google search too. What is correct is that results for the Topic: namespace aren't currently available to the internal MediaWiki search engine. --Dereckson (talk) 10:45, 20 June 2017 (UTC)
    Comment So practically, there is no way to search archived discussions? I can understand that some users who tend to ignore online discussions may want to achieve that, but I don't think this is desirable.
    --- Jura 09:55, 21 June 2017 (UTC)
    Personally, I noticed this when trying to search for past discussions on Wikidata:Bistro. It probably went mostly unnoticed so far as Flow is mostly used on userpages. It suggests that it's not fit for general use.
    --- Jura 09:41, 24 June 2017 (UTC)
  4. Support. Sure. --Dereckson (talk) 10:45, 20 June 2017 (UTC)
  5. Comment Another little problem: hiding or deleting the title of a topic requires an oversighter; an admin doesn't have the right to do it. (Phab:T163061) --ValterVB (talk) 11:35, 21 June 2017 (UTC)
  6. Comment Another issue is the history of edits. It is not trivial (and not useful) in Flow. --Infovarius (talk) 12:39, 23 June 2017 (UTC)
  7. Comment There are a few advantages listed in the initial comment. Have any of these ever been a concern on this page? If there is no advantage, why would we want to lose all past discussions?
    --- Jura 09:59, 24 June 2017 (UTC)
    Yes. As stated in the very first sentence of the initial comment. -Ash Crow (talk) 13:17, 2 July 2017 (UTC)
    The only advantage I am aware of is that it is easy to "thank" for an edit in Flow. Otherwise Flow only makes it more complicated! I am not enrolled in this WikiProject, I therefore do not vote, but you can count me out on participating in discussions if/when they are turned into Flow. -- Innocent bystander (talk) 13:37, 2 July 2017 (UTC)
    We are still in the phase of evaluating the need for it given the loss of the project's history on one side and a single edit conflict by Ash Crow on the other side.
    --- Jura 15:51, 2 July 2017 (UTC)
    There would be no loss of the project's history. Old discussions are automatically archived when Flow is activated. -Ash Crow (talk) 09:39, 16 July 2017 (UTC)

Combining Santos and Santosh[edit]

Notified participants of WikiProject Names

The other day I tried to add a summary to the project page about the different Santos items people made. There are:

  1. Santos (Q18352926) (no language specified).
  2. Santos (Q20813204) (Spanish)
  3. Santos (Q20813183) (Portuguese)
  4. Santos (Q20813195) (Brazilian Portuguese)

One would use (1) unless it's known that one of the more specific ones applies. All are spelled "Santos", but pronunciation can be quite different.

Personally, I don't think that this is a priority, but, if people want to use them, I don't see why not. I'd restore the corresponding summary in the page.

It's not fundamentally different than items we always had for Jean (fr) and Jean (en).
--- Jura 10:09, 24 June 2017 (UTC)

I don't see any fundamental difference between those 4 items, which are all written in the same alphabet, other than the pronunciation... the origin of the name is the same, the writing is the same, the meaning is the same...
the Name project will become absolutely impossible to manage if the only difference is phonetic... since in every language, and even in the same language, in different regions, the pronunciation of a word can be different...
I cannot see any reason why those should not be merged
it is very possible to complete a different IPA transcription (P898) for different pronunciation in different languages on the same item, corresponding to the same writing :/ --Hsarrazin (talk) 12:18, 25 June 2017 (UTC)
How will interwiki links be managed for these items? How can we link related Wikipedia articles together in the structure that you suggest? Remember, hosting interwiki links is also the primary purpose of Wikidata, and this purpose is not served if we split quasi-equivalent concepts on 5 different Wikidata items. Place Clichy (talk) 08:28, 27 June 2017 (UTC)
    • Infovarius: I think it's good to re-visit questions once in a while, nothing to be angry about it. A potential maintenance issue Hsarrazin mentions could be solved for Santos and Jean if we use a different P31 value for 2./3./4. above. Do you agree?
      --- Jura 15:57, 2 July 2017 (UTC)
User:Place Clichy if I understand your question correctly, the answer to your question is the use of said to be the same as (P460) :)
sorry, Infovarius, I don't understand if you're angry with my proposition or Jura1's ... :)
we are now trying to solve the non-latin name problem (Konstantin or other) by using name in native language (P1559) to distinguish if the writing is the "original" writing of the name, or a transcription in another alphabet... this way, we can recognize the names, whatever the original writing is, including cyrillic russian, ukrainian, belarus, or other... don't you agree with it ? --Hsarrazin (talk) 17:13, 2 July 2017 (UTC)
@Hsarrazin: As far as I know, said to be the same as (P460) does not provide interwiki links between related items. (Maybe it should, but at the moment it does not.) Therefore, in my opinion items considered to be the same should be merged if that is the technical way to provide these interwiki links. The fact that the same first name is spelled Wladimir in German, Vladimir in English and Vladimír in Czech does not change the fact that it is the same name, and actually pronounced the same, and interwiki links between the corresponding Wikipedia articles are legitimate. Splitting these items results in removing legitimate and useful navigation links. My concern is for the casual reader, who actually reads Wikipedia, not Wikidata. I have made this point at Project:Names before. I also agree with the arguments put forward by Infovarius on the Latin-centricity of this issue. I am not in favour of merging all items of similar-looking or related names though (such as all versions of John/Ivan/Giovanni/Hans, that would be absurd): in my opinion, a good place to stop is any time you find two concurrent articles in the same Wikipedia language (in some cases these Wikipedia articles can also benefit from being merged, but that is another issue). Place Clichy (talk) 09:23, 4 July 2017 (UTC)
I think we answered your question before and provided a solution. Is there a Wikipedia where you contribute and need help implementing it?
--- Jura 09:29, 4 July 2017 (UTC)
it is possible to build a template that uses said to be the same as (P460) to bring interwiki links on a specific wiki. It has been done on some wikis... Need to search more to find where... --Hsarrazin (talk) 09:38, 4 July 2017 (UTC)
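As a rough illustration of the idea in the comment above, the following sketch collects sitelinks for an item and then fills in missing wikis from the items it links to through said to be the same as (P460). It works on a plain dictionary, not on Wikidata itself; the function name, the data layout, the item ids and the article titles are all hypothetical, and a real on-wiki template or module would look different.

```python
from typing import Dict


def interwiki_via_p460(item_id: str, items: Dict[str, dict]) -> Dict[str, str]:
    """Return the item's own sitelinks, completed with sitelinks of its P460 targets.

    items maps an item id to {"sitelinks": {wiki: title}, "P460": [item ids]}.
    The item's own sitelinks always win; P460 targets only fill the gaps.
    """
    own = items[item_id]
    links = dict(own.get("sitelinks", {}))
    for other_id in own.get("P460", []):
        other = items.get(other_id, {})
        for wiki, title in other.get("sitelinks", {}).items():
            links.setdefault(wiki, title)
    return links


# Hypothetical ids and titles, for illustration only.
items = {
    "Q_VLADIMIR": {
        "sitelinks": {"enwiki": "Vladimir (name)"},
        "P460": ["Q_WLADIMIR", "Q_VLADIMIR_CS"],
    },
    "Q_WLADIMIR": {"sitelinks": {"dewiki": "Wladimir"}, "P460": ["Q_VLADIMIR"]},
    "Q_VLADIMIR_CS": {"sitelinks": {"cswiki": "Vladimír", "enwiki": "Vladimír"}},
}
links = interwiki_via_p460("Q_VLADIMIR", items)
```

The sketch follows only direct P460 links; whether a wiki would also want to follow links of links is a separate decision.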
  • Hsarrazin: Do you think the dedicated P31 values could solve it, or are there other issues to take care of?
    --- Jura 09:29, 4 July 2017 (UTC)
  • No, Jura, this is absolutely contrary to the option that has been taken to use a specific item for each string... you can do it for Santos (because you know these specific languages), but you (and I) would not be able to recognize thousands of cases where other people would think it would be better... it opens the door to nonsense and a totally unmanageable project. Please, don't! And this is a Strong oppose --Hsarrazin (talk) 09:38, 4 July 2017 (UTC)
  • Can you give a sample? Is there a problem with some of the items using Jean or Santos? --- Jura 09:41, 4 July 2017 (UTC)

Royals and family names[edit]

[Image: File:Autograf, Gustaf I, Nordisk familjebok.png]

Gustaf V of Sweden (Q52890) has birth name (P1477):"sv:Oscar Gustav Adolf Bernadotte" and family name (P734):Bernadotte (Q21502260). I am not 100% sure, but I think that is wrong. It is unusual for Swedish royals to have family names at all as long as they stay inside the royal house. Princess Madeleine, Duchess of Hälsingland and Gästrikland (Q212035) also has such claims, and there I am pretty sure it's wrong. Count Sigvard Bernadotte of Wisborg (Q447209) used this family name, but I doubt he was born with it. How do you handle persons without family names?

Another problem comes with older names, from times when the orthography had not become stable yet. Gustav I of Sweden (Q52947) has Gustav (Q746076) as "given name". How do you select which spelling you should use? -- Innocent bystander (talk) 14:28, 28 June 2017 (UTC)

That is the problem, Swedish spelling wasn't invented yet in the 16th century. -- Innocent bystander (talk) 07:12, 29 June 2017 (UTC)
We could try to transcribe the spelling on the image I added to the right of this section (File:Autograf, Gustaf I, Nordisk familjebok.png).
--- Jura 07:21, 29 June 2017 (UTC)
In this case it looks like "ff" at the end. But this was not the only spelling he used. As I told you, he couldn't spell. :) Nobody could, not even those who translated the Bible. -- Innocent bystander (talk) 13:46, 2 July 2017 (UTC)
What would you use?
--- Jura 15:24, 2 July 2017 (UTC)

Wikidata:WikiProject Names/Request properties for a specific item ?[edit]

Once in a while, I come across a name that seems hard to de-compose into properties. I think it would be good if we had a subpage just for asking about such cases. Maybe there is a better subpage name.
--- Jura 09:51, 4 July 2017 (UTC)

Cleaning up the category tree[edit]

Today I was looking at the subcategory tree for "anthroponym". It looks like there are many opportunities for merging here, but I think a basic hierarchy should be established first. Has anyone done any work on that, or does anyone know of an external ontology for names that we could use as a model? - PKM (talk) 23:18, 6 August 2017 (UTC)

This project mainly uses the ones listed on the project page: given name (Q202444), female given name (Q11879590), male given name (Q12308941), unisex given name (Q3409032), family name (Q101352), and name.
There are a few sub-classes with misspelled English labels that are mostly unused and could be merged into these or deleted.
--- Jura 04:42, 7 August 2017 (UTC)

Items needing splitting

There're

They need splitting.--GZWDer (talk) 18:11, 16 August 2017 (UTC)

New report: missing P1705 for given names

The above report lists given names that currently lack native label (P1705). It's ordered by the number of P735 statements using them. Ideally, every given name item has such a statement. When creating new items, please add one.

Currently the list has mainly Latin script versions of Japanese names. Sample: "Koji" (Q1250765). Ideally, these items wouldn't be used that frequently, specifically not for people born in Japan. The applicable item at Q1250765#P460 would be better.
--- Jura 08:26, 5 September 2017 (UTC)
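
The logic of such a report can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the query behind the actual report: the item records, the placeholder item ids and the usage counts are made-up in-memory data, and `missing_native_label` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def missing_native_label(
    items: Dict[str, dict], usage: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Return (item id, P735 usage count) for items lacking P1705, most used first."""
    rows = [
        (qid, usage.get(qid, 0))
        for qid, claims in items.items()
        if not claims.get("P1705")  # missing key or empty list both count as lacking
    ]
    # Highest usage first; ties broken by item id for a stable listing.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical sample data (placeholder ids, not real items).
items = {
    "Q_A": {"P31": ["Q202444"], "P1705": ["Anna"]},
    "Q_B": {"P31": ["Q202444"]},
    "Q_C": {"P31": ["Q202444"], "P1705": []},
}
usage = {"Q_A": 120, "Q_B": 40, "Q_C": 75}

report = missing_native_label(items, usage)
```

Sorting by usage puts the items that affect the most person items at the top, which matches the ordering described for the report.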

Distribution maps: add or remove Property:P1846 ?

c:Category:Name maps has maps for given names, possibly suitable for distribution map (P1846).

Sample: File:Popularity of name Irene.svg (also on the right side). It's described as "World map showing the countries where the names Irene, Irina, Irena, Ena, Arina, Ira, Irëne, Erina, Irini, Irén, Iryna, Iren, Irène, Arja, Iria or Irja are popular. Any color means the name is among the 100 most popular names either in the population or among newborns. Map created using ZiMapMaker and data from multiple sources. ".
It's now at Q389528#P1846 and the items for Irina, Irena, Ena, Arina, Ira, Irëne, Erina, Irini, Irén, Iryna, Iren, Irène, Arja, Iria or Irja.
The user who first created them doesn't seem to be active any more.

At some point, I added the maps to a few Wikidata items, mostly items for the name in the filename.

The question is if we should add these maps to more items (maybe 3000) or remove them (from maybe 300 items).
--- Jura 08:13, 6 September 2017 (UTC)

English descriptions for first names

As we have "Olga" (for Ольга) and "Olga" (for Olga), I wonder if we should start including the native label value in the description even if it's the same spelling as in English.
--- Jura 16:29, 9 September 2017 (UTC)

Milestone .. kind of

There are now 50,000 people (fictional or not) with P735="John" !
--- Jura 16:29, 9 September 2017 (UTC)

MacKenzie and Mackenzie

There are family names MacKenzie (with uppercase “K”) and Mackenzie (with lowercase “k”). Shall items about persons with that surname use Mackenzie (Q13553907) for family name (P734) regardless of the spelling, or do I need another family name item for the uppercase variant? At enwiki, both are listed on en:Mackenzie (surname). —MisterSynergy (talk) 20:22, 22 September 2017 (UTC)

  • User:Mistersynergy doesn't seem to be you either.
    --- Jura 08:10, 1 October 2017 (UTC)
    • true for MediaWiki, but generally this depends on the actual software implementation
      I agree with you, there should be two items and Wikipedias shouldn’t mix those names up. Which is not in my control, however… Thanks for input, —MisterSynergy (talk) 08:35, 1 October 2017 (UTC)
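
To find spellings that need this kind of attention, one could group labels that differ only in capitalisation. The following is a minimal sketch, assuming a plain list of labels as input; `case_variants` is a hypothetical helper, not an existing tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_variants(labels: Iterable[str]) -> Dict[str, List[str]]:
    """Group labels that are identical except for upper/lower case."""
    groups = defaultdict(set)
    for label in labels:
        groups[label.casefold()].add(label)
    # Keep only groups with at least two distinct spellings.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


variants = case_variants(["Mackenzie", "MacKenzie", "Crawford"])
```

Each group returned is a candidate for separate family name items, since an exact-match lookup treats "MacKenzie" and "Mackenzie" as different strings.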

compound given names

I did not find information about compound given names, particularly how to distinguish them from multiple given names. I’d like to make some claims and ask you whether they are correct:

  1. Given names such as "Hanspeter" (no hyphen or space) are always compound given names, so use Hanspeter (Q18012619) instead of Hans (Q632842) and Peter (Q2793400)
  2. Given names such as "Karl-Heinz" (with a hyphen) are always compound given names, so use Karl-Heinz (Q1729801) instead of Karl (Q15731830) and Heinz (Q11682369)
  3. Now the more difficult part: there are compound given names such as "George Washington" (George Washington (Q16275947)), "Mary Ann" (Mary Ann (Q18083402)), or "José Luis" (José Luis (Q20856658)) with spaces between the parts. How do I tell these cases apart from multiple given names?

MisterSynergy (talk) 09:44, 13 October 2017 (UTC)
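
The three claims above can be restated as a small decision function. This is only a sketch of the rules as proposed in this thread, not an agreed project policy; `classify_given_name` and its return labels are hypothetical.

```python
def classify_given_name(text: str) -> str:
    """Classify a given-name string according to claims 1 to 3 above.

    Returns one of:
      "single"    - one token without hyphen: use the item for the whole token
      "compound"  - one hyphenated token: use the item for the compound name
      "ambiguous" - several space-separated tokens: needs a human decision
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty given name")
    if len(tokens) > 1:
        # Claim 3: spaces alone cannot distinguish "Mary Ann" (compound)
        # from two independent given names.
        return "ambiguous"
    if "-" in tokens[0]:
        return "compound"  # claim 2
    return "single"  # claim 1


results = [classify_given_name(n) for n in ("Hanspeter", "Karl-Heinz", "Mary Ann")]
```

The "ambiguous" branch is deliberate: the string alone does not carry enough information for case 3, so the sketch refuses to guess.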

Little help

Hi! I knew nothing about name items until a few weeks ago, but I am refining a cluster of items for this event. As I will also say at the village pump, the page will be visited by thousands of people who will discover Wikidata.

I am doing my best to improve the item in my spare time; as a result I have also started to create new items such as Q41480368. I try to limit myself to what is necessary (commons missing surnames for example) and I learn by looking at your edits on my pages, but these days I am very busy, so I am asking you for some advice, and also whether you can review everything in that area. I can see from your next edits what else I can do. I am creating two questions here as separate threads. I try to do my best; sorry if this is written somewhere and I did not find it on my own.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

A missing item of middle name, similar to a more common form

My first question is about a guy with three names. I have learned how to write the numeral, but the second name is a rare form. See also Talk:Q42297641. What should I do now? Create it? Another guy uses the form in his middle name: Q35285685. If you make an item, can you show it to me? Thank you.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

Brazilian double surnames

My second question is about a very rare Brazilian surname (a wrong transliteration from Italian) of Q42177679. I wanted to add it, but it is very rare: I cannot find anyone with decent IDs who has it. I removed the Scopus alias that "mistakes" it as a middle name (see here), but I don't like that. The alias is given by Scopus and should be kept, but how do we indicate that Guarnetti is a surname? Same problem with the Brazilian Q42149058. Other examples that do not address the problem are Q13632547, Q10361254 and so on. Any advice here? Thank you.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

Two occurrences are enough for an item of a surname

When I create an item for a person without a surname item here on Wikidata, I create the surname item if at least one other person item with that surname exists. Is this OK as a strategy?--Alexmar983 (talk) 11:48, 28 October 2017 (UTC)

One is sufficient if you are sure it's the family name. When doing large numbers of first names, I used 2 or even more, just to make sure I didn't create too many for misspellings.
--- Jura 08:57, 5 November 2017 (UTC)

Person, Jr. vs. Person Jr.

The English Wikipedia is harmonizing on "Person Jr." instead of "Person, Jr.", should we too? --Richard Arthur Norton (1958- ) (talk) 13:34, 3 November 2017 (UTC)

This feels like something where harmonisation (rather than reflecting local practice) is perhaps going to be a bit confusing - I'm surprised WP are doing it. Andrew Gray (talk) 20:49, 3 November 2017 (UTC)
One reason I was against it at the vote in Wikipedia was that the other projects are not going to do it, so Commons, Quote, Data and the other projects are not following it. As people add names in the body of articles, they still add them under the old convention, which leads to inconsistency. --Richard Arthur Norton (1958- ) (talk) 16:35, 4 November 2017 (UTC)
w:Wikipedia:Manual_of_Style/Biographies#Generational_and_regnal_suffixes spearheaded by w:User:Dicklyon. "Omission of the comma before Jr., Jr, or Jnr, and Sr., Sr, or Snr, is preferred. The comma can be used in cases where it is clearly and consistently preferred for a particular subject in current, reliable sources (most likely a living subject whose own preference is clear and consistent). Articles should be internally consistent in either omission or use of the comma for any given person's name." The vote was against allowing a bot to change all, so they are doing one at a time manually. The key to changing all is the word "current"; it isn't based on a simple Google count of "Person, Jr." vs. "Person Jr." --Richard Arthur Norton (1958- ) (talk) 14:06, 6 November 2017 (UTC)

Using opposite of (P461)

Sometimes two given names in Latin script just happen to be spelled the same and we can differentiate them by gender, e.g. Jean (Q7521081) [m] and Jean (Q7521081) [f]. These are linked together with opposite of (P461). When querying for Latin script given names, this makes it possible to exclude them directly. It occurred to me that the same works for various Santos items. This way only Santos (Q18352926) shows up when one tries to find items that can be matched against the name of a person. Also, when one tries to merge duplicates, these items won't get in the way.
--- Jura 18:47, 19 November 2017 (UTC)
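
As a rough illustration of that filter: given the candidate items for one spelling, keep only those that carry no opposite of (P461) statement. The data below is made up (placeholder ids, simplified claim lists) and `matchable_items` is a hypothetical helper, so treat it as a sketch of the idea rather than the query actually used.

```python
from typing import Dict, List


def matchable_items(candidates: Dict[str, dict]) -> List[str]:
    """Return ids of candidates that have no 'opposite of' (P461) statement."""
    return [qid for qid, claims in candidates.items() if not claims.get("P461")]


# Hypothetical candidates sharing one Latin-script spelling.
candidates = {
    "Q_generic": {"P31": ["Q202444"]},
    "Q_male": {"P31": ["Q12308941"], "P461": ["Q_female"]},
    "Q_female": {"P31": ["Q11879590"], "P461": ["Q_male"]},
}

matches = matchable_items(candidates)
```

Only the item without P461 remains, so automatic matching never has to pick between the two gender-specific items.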

Overview at WikiProject

Rowing has a nice overview of birth names and given names for various names with non-Latin script: Wikidata:WikiProject Rowing/reports/P1559 for rowers (non-latin script).
--- Jura 13:15, 22 November 2017 (UTC)

Preparing dataset - some questions

Hello! I have almost finished matching Latvian names with OpenRefine, will create something like 500 new items. I have some doubts about two cases:

  • Agne is a Latvian female name, but Agne (Q4926003) is currently an instance of male name.
  • Ardis is a Latvian male name, but Ardis (Q25113822) is currently an instance of female name.

Should I change them to unisex name or create new items for these? --Papuass (talk) 14:36, 4 December 2017 (UTC)

  • I'd do new items, with language of name = Latvian. If possible, please include "native label" and "writing system" statements.
    --- Jura 16:22, 4 December 2017 (UTC)

Qualifier for name format

There are several properties that provide the name as a string.

To identify different namings, I'd like to add a qualifier to some of the name properties. Items might still need to be made. These could describe:

  • name format: first name, family name
  • name format: first name, patronymic, family name
  • name format: first name, another given name, family name
  • etc.

These items could provide more information in a structured way.

Not sure about which qualifier to use. Maybe criterion used (P1013), has quality (P1552), instance of (P31).
--- Jura 09:43, 14 December 2017 (UTC)

Etc. = 8 (personal name system (Q16655449) for example)? --Fractaler (talk) 10:32, 14 December 2017 (UTC)
We have 2 formats/formulas: full nominal formula/format, full name (second column) and short name, modern short formula/format (third column). --Fractaler (talk) 11:01, 14 December 2017 (UTC)
The idea is to use the one value that describes best the string found in P1477/P1559/etc. All these values could be instances of Q16655449. So yes, maybe 8, 16, etc values, at least if they are different in one way or the other.
I think @Matěj_Suchánek: tried to parse some of these to generate P735/P734. With some surprises obviously.
--- Jura 11:09, 14 December 2017 (UTC)
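
To make the idea concrete, here is a minimal sketch of how a name-format value could be used to label the parts of a name string. The format keys, the `FORMATS` table and `label_parts` are all hypothetical; which qualifier property would carry the format on the statement is exactly the open question above.

```python
from typing import Dict, List, Tuple

# Hypothetical name formats: each maps to the role of every token, in order.
FORMATS: Dict[str, Tuple[str, ...]] = {
    "given-family": ("given name", "family name"),
    "given-patronymic-family": ("given name", "patronymic", "family name"),
    "given-given-family": ("given name", "given name", "family name"),
}


def label_parts(name: str, name_format: str) -> List[Tuple[str, str]]:
    """Pair each whitespace-separated token of `name` with its role."""
    roles = FORMATS[name_format]
    tokens = name.split()
    if len(tokens) != len(roles):
        # A mismatch is one of the "surprises" a parser has to expect.
        raise ValueError(f"{name!r} does not fit format {name_format!r}")
    return list(zip(roles, tokens))


parts = label_parts("Lev Nikolayevich Tolstoy", "given-patronymic-family")
```

Without the format value, the middle token could just as well be read as a second given name; with it, the same string parses unambiguously.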

Tibetan names

Hi, I'm wondering how to deal with Tibetan names. For example, in the case of the Dalai Lama Tenzin Gyatso (Q17293), he has a birth name and a Dharma name (Q1543672). Both are technically given names (most Tibetans don't have family names), and each one contains two names (parts). The birth name is ལྷ་མོ་ (Lha-mo) + དོན་འགྲུབ་ (Don-'grub), and the dharma name is བསྟན་འཛིན་ (Bstan-'dzin) + རྒྱ་མཚོ་ (Rgya-mtsho) [actually the full dharma name is much longer, but it's often shortened to those two parts]. So should we put all 4 names into given name (P735)? Or maybe create a separate property for the dharma name? If we just use given name (P735), which qualifier should be used to distinguish the two names? Any thoughts? --Stevenliuyi (talk) 16:42, 23 December 2017 (UTC)

I went ahead and added the names Q17293#P735. Please let me know if my approach is appropriate. --Stevenliuyi (talk) 21:35, 9 January 2018 (UTC)

Duplicate Japanese name?

I just found Hamano (family name, instance of (P31)  family name (Q101352), language of work or name (P407)  Japanese (Q5287)) and Hamano (Japanese family name (浜野), instance of (P31)  family name (Q101352), language of work or name (P407)  Japanese (Q5287), also has native label (P1705), name in kana (P1814), writing system (P282)). Is this a duplicate item that should be merged, or is this correct (perhaps multiple forms of a name, e. g. original Japanese vs. English transcription)? —Galaktos (talk) 21:52, 27 January 2018 (UTC)

✓ Done merged after no one replied in two weeks… in case it was wrong and the items need to be split after all, the only page that linked to Q26203631 (now a redirect) was Hamaya (Q26203632). --Galaktos (talk) 17:56, 11 February 2018 (UTC)

Wiktionary

As we are stuck with spellings, why not add sitelinks to Wiktionary articles about these specific spellings of names? It would also help to prevent unnecessary merges. --Infovarius (talk) 13:53, 2 February 2018 (UTC)

Property Surname

For all persons not having a surname, i.e. royals, Icelanders, Norse and so on: shall those have family name (P734) with no value? Breg Pmt (talk) 19:08, 16 February 2018 (UTC)
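
For reference, a "no value" statement is a statement whose main snak has the type `novalue` instead of carrying a value. The sketch below builds and detects such a statement in a simplified version of the Wikibase JSON layout as I understand it; the helper names are hypothetical and a real statement has more fields (id, references, qualifiers).

```python
from typing import Dict, List


def no_value_claim(property_id: str) -> dict:
    """Build a minimal statement saying the property has no value."""
    return {
        "mainsnak": {"snaktype": "novalue", "property": property_id},
        "type": "statement",
        "rank": "normal",
    }


def has_no_value(claims: Dict[str, List[dict]], property_id: str) -> bool:
    """True if any statement for the property is an explicit 'no value'."""
    return any(
        statement["mainsnak"]["snaktype"] == "novalue"
        for statement in claims.get(property_id, [])
    )


# Simplified claims of a person without a family name.
claims = {"P734": [no_value_claim("P734")]}
flagged = has_no_value(claims, "P734")
```

An explicit "no value" is different from a missing statement: the former says the person has no family name, the latter only says nobody has added one yet.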

Initials like « T. » as name classes ?

Just seen the T. (Q19803520) item. It’s noted as « instance of : first name ». I’d prefer at least noting this as « subclass of : first name » because an initial stands where many first names can stand. I guess that the creator @Jura1: did this to be able to use this in the « first name » property without constraint violation, or in a batch mixed with actual names. Can’t we do better ? Like allowing the constraint on « first name » to accept a class of names as values ? Doing things this way seems to me like shoehorning our model incorrectly to make the constraint system happy. It was (I think) to avoid this kind of thing that Wikidata did not have a constraint system in the first place :/ author  TomT0m / talk page 14:31, 22 February 2018 (UTC)

Different transliteration of the same firstname : different items ?

Hi everyone, I've created an item for a woman from Russia who lived in France ; I only know the transliteration of her first name, which is Glafira ; I've seen that there are other women with this transliteration, whose first name is Глафира; however, in the item of the first name Glafyra (Q4139559), the transliteration used in French is Glafyra ; should I add Glafira as an alias or create a new item ? Thanks ! Léna (talk) 13:23, 17 March 2018 (UTC)

  • I noticed you already did: Q50671449. Yes, I'd use that on the item for that person, assuming it's the spelling she is using in Latin script. You might have noticed the Latin script statement I added on Q50671449. Also, I'd add a second statement on the item for the person with the (also) new Q50675025 for the Cyrillic script version of the name of the person.
    --- Jura 14:38, 17 March 2018 (UTC)

Plowman

Hi everyone, Plowman (Q1513274) is currently both a family name and a Wikimedia disambiguation page; I've sorted a few of these out recently, but in this case it looks like the linked English Wikipedia page is a page for the family name and the linked German Wikipedia page is a disambiguation page. There were originally two separate items but they got merged in February 2015.

Am I correct in thinking these should be two separate items? I'm not sure how to split them out and retain the correct wiki links... Thanks in advance ! WhiteHartLane (talk) 05:18, 9 April 2018 (UTC)

Multiple issues from Wikidata:Interwiki conflicts

Hello, there are multiple issues with names listed in Interwiki conflicts. Could somebody take a look at these and resolve them using your WikiProject's technique? I'm new to Wikidata, so I'd rather not try it myself, but I know you have defined rules and I hope you'll help with those:

  1. Kamilla×Camilla×Kamila
  2. Jonáš×Jonah
  3. Sarah×Sara
  4. Irene×Irene×Irina
  5. Luigi×Louis
  6. Jenovéfa×Genoveva×Geneviève

They should be mostly fine, but some of them have been listed in interwiki conflicts for years. --Dvorapa (talk) 20:46, 24 May 2018 (UTC)

  • I had a look at them. Most seem fine now as far as this WikiProject goes. However, one is about a person and another about disambiguation pages.
    --- Jura 07:21, 23 June 2018 (UTC)

Shouldn't names be lexemes instead of items?

Hi,

I am not sure if this has already been suggested somewhere: apologies if so.

Now that we have lexemes, I wonder whether they would not be more appropriate than items to model names. Name items are basically determined by their native label (P1705) and their labels tend to be identical to that string: that seems to indicate that names do not really belong to the conceptual domain, but rather to the lexical one. Names can have variations by gender or case: that seems to match pretty well the structure of forms of a lexeme. Furthermore, people already create lexemes for names (such as Lydia (L362)) - surely we do not want to maintain the same database in two different namespaces?

Lexeme support is still experimental and no mass import should be done, so in any case we should wait before migrating anything.

Pinging Tpt who proposed something similar on IRC (I think). − Pintoch (talk) 14:41, 16 June 2018 (UTC)

Comment We have lexemes, yes, but unfortunately only for single languages. Lydia (L362) = Lydia as a German name.
We need to implement Script first:
Lexeme: Lydia; Script: Latin alphabet; Lexical category: proper noun (Q147276), given name (Q202444), feminine (Q1775415)
--Kolja21 (talk) 00:49, 17 June 2018 (UTC)
It also seems to me that first and last names are much more lexemes than usual entities. About the script problem, we could maybe just create some macro-language items like "languages written in Latin script" and use language codes like "mul-Latn". Tpt (talk) 17:44, 18 June 2018 (UTC)


todo (family name items)

Items with P31=family name

  1. mostly ✓ Done delete P31=disambiguation (or P31=family name). A check is at User:Jura1/family names/items (P31: nok - conflicts with disambiguation)
  2. delete P31=given name. Some left, check at User:Jura1/first names/items (P31: not ok)
  3. check P279 on such items
  4. check/fix subclasses. ✓ Done. All items have P31=family name
  5. mostly ✓ Done fix English descriptions of items with P31=family name
  6. mostly ✓ Done add missing English descriptions of items with P31=family name
  7. mostly ✓ Done fix English descriptions of items with P31=disambiguation and description = "family name" or "surname"
  8. mostly ✓ Done fix English descriptions of items with P31=disambiguation and description containing "name" or "family"
  9. mostly ✓ Done check links from P734, create new items when needed
  10. delete enwiki sitelinks to disambiguation pages, add to dab items
  11. mostly ✓ Done move enwiki sitelinks of surname pages on dab items
  12. add writing system property. Some done
  13. add native label property. Some done
  14. add Commons sitelinks. Added some, more being done by Laddo
  15. merge possible duplicates
  16. mostly ✓ Done fix Wikidata:Database_reports/Constraint_violations/P734
  17. add P734=novalue to items for people who don't have one
  18. link available dictionary entries
  19. add missing Soundex
  20. add more P734 statements
  21. TBD
  22. tell @Mike Peel: that he might need to reload the lists for the P734 bot

Some cleanup I'm currently doing. Might take some time to complete. I started out with one and found that I had to do the others.
--- Jura 08:41, 14 July 2018 (UTC)
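
For step 19, a Soundex code can be computed directly from the Latin-script label. The following is a sketch of the common American Soundex variant (H and W do not separate letters with the same code, vowels do); whether this is the exact variant expected on the items is an assumption, and `soundex` is just an illustrative function name.

```python
# Letter groups of American Soundex; vowels, H, W and Y have no code.
_GROUPS = {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or '' if there are no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W are skipped and do not break a run of equal codes
        code = _CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so a repeated code is kept
    return (result + "000")[:4]


codes = {name: soundex(name) for name in ("Crawford", "Robert", "Rupert", "Ashcraft")}
```

Note that letters outside A-Z are simply dropped here, so labels with diacritics would need to be normalised first; how to do that is left open.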

  • mostly ✓ Done. I added more steps. Wikidata:Database_reports/Constraint_violations/P734 is currently in good shape. I think the problematic additions from earlier this year have mostly been fixed. Please avoid adding random statements to family name items in the future. If sitelinks aren't correct, they should be moved to new properties. --- Jura 04:51, 14 September 2018 (UTC)


Roman numerals in names of kings, popes, etc.

Should we (do we?) have items for the numbered part of kings' and popes' names? We have initials instead of given names (Q19803443) for initials, maybe we could have e.g. V (alias "the fifth", 5, etc.) to use with the given name (or another?) property? --Azertus (talk) 16:54, 12 September 2018 (UTC)

  • Yeah, maybe, but I'd rather see it as a qualifier of some statement. series ordinal (P1545) wouldn't work on given names as we use it to order multiple names of one person (sample: 1 at Q9682#P735). --- Jura 04:51, 14 September 2018 (UTC)
  • Maybe we could come up with something that would also cover things like "Jr.", "Sr.", "hijo", "viejo", etc. as well? Moebeus (talk) 17:57, 14 September 2018 (UTC)
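
To show why series ordinal (P1545) is already taken on given names: it is what puts the multiple given names of one person in order. A minimal sketch, using made-up statements in a simplified dict form and a hypothetical helper `ordered_given_names`:

```python
from typing import List


def ordered_given_names(statements: List[dict]) -> List[str]:
    """Sort given-name statements by their series ordinal (P1545) qualifier.

    Statements without a numeric ordinal keep their relative order at the end.
    """

    def sort_key(statement: dict):
        ordinal = statement.get("P1545")
        if ordinal is not None and ordinal.isdigit():
            return (0, int(ordinal))
        return (1, 0)

    # sorted() is stable, so unnumbered statements are not reshuffled.
    return [statement["name"] for statement in sorted(statements, key=sort_key)]


# Hypothetical given-name statements of one person.
statements = [
    {"name": "Maria", "P1545": "2"},
    {"name": "Anna", "P1545": "1"},
    {"name": "Luise"},
]
names = ordered_given_names(statements)
```

If the same qualifier also had to carry a regnal number such as "V", a consumer like this could no longer tell position from numbering, which is the concern raised above.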


significant event (P793)

Currently, some items include "(frequent )first names in .."-statements with that property. Sample: Q4925477#P793. While we agreed that this isn't exactly optimal, we didn't really have a good replacement either. In the meantime there is attested in (P5323). I think that could work fine. --- Jura 04:51, 14 September 2018 (UTC)

That sounds good, but I think attested in (P5323) needs both some property examples and a slightly more "family friendly" description. Maybe it's just me, but I'm not that clear about what constitutes a "lemme" in this context. Moebeus (talk)

  • I updated the description of P5323. Maybe @Sjoerddebruin: wants to comment too. Otherwise, I'd go ahead and convert. --- Jura 12:27, 23 September 2018 (UTC)

Featured item: Crawford (Q20731004)

As we didn't have a family name one before, I added the above. It's a somewhat random choice except that all items about people that should use it as value in P734, do have it. Also related items are available and used correctly as well. --- Jura 04:51, 14 September 2018 (UTC)


New report: WikiProject Names/reports/new/family names

The above lists recently created items and highlights some elements. Ideally each item would have at least the "writing system" property set. The remainder can be completed by various tools. --- Jura 04:51, 14 September 2018 (UTC)

Statistics

Wikidata:Lexicographical data/Ideas of queries has plenty that could also work for names. I will try to add some to Wikidata:WikiProject Names/numbers as weekly Listeria lists. Can we add the stats page to the navbar? --- Jura 12:27, 23 September 2018 (UTC)