Wikidata talk:WikiProject Names/Archive/2

One to many

For example, both Greenaway and Greenway are transliterated into Russian as Гринуэй. Where should we place ru:Гринуэй: in Greenway (Q16870299) or Greenaway (Q1261077)? --Infovarius (talk) 11:22, 18 October 2016 (UTC)

A how-to?

Could this project produce a how-to for adding more detailed information for names of people? For example:

  • Let's say I have a person in Finland. Often the name in official documents is written in its Swedish form. How should I mark that?
  • A person is known generally by the second first name, nickname, or this varies across sources. How could I keep the variants in a good way?
  • The last name is changed for various reasons. How are the reasons and dates attached?

and many more. I am less interested in etymology than having a solid lookup reference for that specific person. Apologies that I am bluntly asking this before reading all the documentation, but I would like to access the best practices in a "for dummies" format. Cheers, Susanna Ånäs (Susannaanas) (talk) 13:44, 5 November 2016 (UTC)

  • Well, to be more precise: the properties and statements used on the item to describe the various names of the subject of the item. ;) As this project frequently focuses on compiling (first) name items, you won't find much information about it here.
    --- Jura 14:28, 5 November 2016 (UTC)

What do you think of this:

  • Anyone could write a (tricky) name or names of a person, explaining the whole history if needed.
  • Others could help modeling by adding all the needed properties and qualifiers for that name/person.
  • Finally that example could be placed under a proper heading, such as
  • Uses of maiden name
  • Uses of Spanish second family name
  • Uses of a patronym/matronym
  • Changes of names etc.

The topics could be briefly explained, at least the intended use of the existing name properties and how they are currently tackling all the tricky issues.

Cheers, Susanna Ånäs (Susannaanas) (talk) 14:44, 5 November 2016 (UTC)

  • Sure. Our current system isn't necessarily that consistent (see the open questions on Wikidata:Bistro). As naming conventions change by language and country, it might be easier to focus on specific combinations. Properties with item-datatype are somewhat different from the ones with string/text-datatype. For your application, the latter ones seem to be the more relevant ones. With the sample Q1124, you might be able to solve most of the questions. Ideally, every person's item would have "name in native language" and, if different, "birth name".
    --- Jura 15:13, 5 November 2016 (UTC)
  • Yes, I am constantly a little confused by the name items versus names as text. I will face a similar modeling issue with historical place names for places. Is modeling names of people in general in the scope of this project? If so, I could start a name clinic as a subpage, hoping that there will be people helping out! Do you think names for people and names for places should have a common project? --17:26, 5 November 2016 (UTC)
  • For places, I think you already started WikiProject Historical Place. Not sure if yet another one would help. For places, the question is mostly on which item a historic name would go, for people, it's generally just one item. I suppose we could split between item- and text-properties, but this might not make it easier for users if they have to check language/country specific sections in each, so I'd attempt to add both here.
    --- Jura 10:20, 6 November 2016 (UTC)

Icelandic people

Please see Property_talk:P734#Include_Icelandic_.22last.22_name.3F.
--- Jura 06:26, 17 January 2017 (UTC)

Salih

Ash Crow
Dereckson
Harmonia Amanda
Hsarrazin
Jura
Чаховіч Уладзіслаў
Joxemai
Place Clichy
Branthecan
Azertus
Jon Harald Søby
PKM
Pmt
Sight Contamination
MaksOttoVonStirlitz
BeatrixBelibaste
Moebeus
Dcflyer
Looniverse
Aya Reyad
Infovarius
Tris T7
Klaas 'Z4us' van B. V
Deborahjay
Bruno Biondi
ZI Jony
Laddo
Da Dapper Don
Data Gamer
Luca favorido
The Sir of Data Analytics
Skim
E4024
JhowieNitnek
Envlh
Susanna Giaccai
Epìdosis
Aluxosm
Dnshitobu
Ruky Wunpini
Balû
★Trekker

Notified participants of WikiProject Names,

Salih (Q1509192) was the page for the given name 'Salih' — which it still is in most languages. However, @Aboulouei1: has changed the English and French labels several times to 'Salah Pacha', on the basis (if I understand the French correctly) that Salih is a mistranslation of the Arabic name. My understanding is that even if the name is "incorrect" like this, if it's actually in use in this form, it should stay as 'Salih', with a said to be the same as (P460) to other versions, such as Salah (Q19882606). Is this correct? --Oravrattas (talk) 21:58, 27 January 2017 (UTC)

In Turkish we have both Salih and Salah, and I never thought they were the same name. See Salih Güney (Q7404493) and Salah Birsel (Q635144) please. Sorry for responding after four years ... :) --E4024 (talk) 01:02, 13 July 2021 (UTC)

A very big problem

Notified participants of WikiProject Names

There is a very big problem with the project. Let's take for example José (Q2190619). "José" is used in Spanish and in Portuguese, but it is pronounced differently. So if you use Latin script, which is the most common, it's all good. But if you use Hebrew/Arabic/Bengali/Russian/Chinese then it's a very big problem, because in Hebrew the Spanish pronunciation is "חוזה" (Hoze) and the Portuguese one is "ז'וז'ה" (joje). Any solutions?--Mikey641 (talk) 14:42, 26 March 2017 (UTC)

Duplicating given names according to pronunciation looks acceptable to me: a generic José, given name, then specialized José, Spanish given name; José, Portuguese given name; José, French given name, etc. --Dereckson (talk) 18:20, 26 March 2017 (UTC)
Let's take into account that even Spanish pronunciation is different in particular regions. Just another case that shows how bad the idea of name-related properties with item datatype is :( --Shlomo (talk) 06:55, 27 March 2017 (UTC)
For now, we handle these cases like all transliterations: all possibilities are listed as aliases in a language. See Alexey (Q29014670); the name is in Cyrillic, the English aliases are all possible transliterations. I would add IPA transcription (P898) as qualifiers for language of work or name (P407) actually, before starting to separate items. --Harmonia Amanda (talk) 07:35, 27 March 2017 (UTC)
@Harmonia Amanda: but it's very different. In all the English aliases of Alexey, the pronunciation is the same; however, if we take the name José, the pronunciation is different in each language. Alexey won't need to be separated, because it's the same in each language, but José will have to be separated into Spanish and Portuguese.--Mikey641 (talk) 19:22, 27 March 2017 (UTC)
@Mikey641:, sorry, I wasn't clear. I didn't think it was the same problem, I thought it was a similar one: the same original-string name has several values associated (in one case, several transliterations; in the other, several pronunciations). As I said, I would suggest one item "Jose" (with one native label (P1705) with a "multilingual" code) which would have several values for language of work or name (P407) with, for each value, the qualifier IPA transcription (P898). So that we know that "Jose" is written similarly in Portuguese and in Spanish and isn't pronounced the same. --Harmonia Amanda (talk) 13:22, 28 March 2017 (UTC)
Actually, I think that this is the same problem. Having the name Alexey (Q29014670) for some person, how do you know if it is Alexey or Alexei or Aleksey or some other variant? --Infovarius (talk) 14:45, 29 March 2017 (UTC)
I don't think it's exactly the same, because all the Alexey aliases are pronounced the same but for José they aren't. It is a different problem though--Mikey641 (talk) 16:32, 29 March 2017 (UTC)
  • This is actually a frequent question. I think Santoz and Santosh, both spelled "Santos" are similar. We have three items for these. Yuriko is similar, but the other way round.
    --- Jura 05:45, 30 March 2017 (UTC)
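
As a rough illustration of the single-item approach Harmonia Amanda describes above (one item, several language of work or name (P407) values, each qualified with IPA transcription (P898)), the structure could be sketched in Python as follows. The class names and the `ipa_for` helper are hypothetical and are not part of any Wikidata tool, and the IPA strings are illustrative placeholders, not values taken from the discussion or from Wikidata.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageStatement:
    """One P407 (language of work or name) value with its P898 qualifier."""
    language_qid: str  # value of P407
    ipa: str           # qualifier P898 (IPA transcription)


@dataclass
class NameItem:
    """Hypothetical, simplified view of a single given-name item."""
    qid: str
    native_label: str  # P1705, assumed to carry a "multilingual" code
    languages: List[LanguageStatement] = field(default_factory=list)

    def ipa_for(self, language_qid: str) -> Optional[str]:
        """Return the IPA transcription recorded for one language, if any."""
        for statement in self.languages:
            if statement.language_qid == language_qid:
                return statement.ipa
        return None


# Illustrative data only: the IPA strings are placeholders, not checked values.
jose = NameItem(
    qid="Q2190619",
    native_label="José",
    languages=[
        LanguageStatement(language_qid="Q1321", ipa="xoˈse"),  # Spanish
        LanguageStatement(language_qid="Q5146", ipa="ʒuˈzɛ"),  # Portuguese
    ],
)
```

With this shape, the spelling stays on one item while each pronunciation is looked up per language, which is the trade-off debated in this thread against splitting the item by language.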

Male given name

Notified participants of WikiProject Names

Harmonia Amanda, I'm opening this in a new section as it is not about pronunciation as above. I think that the solution of adding the name in the description field is wrong. Look at this example: József (Q17498051). The Hebrew name added is יוסף, but that is not the only possibility; it might also be ז'וזף. In Hebrew (and not only in Hebrew) every Latin name might have a few suitable translations. This is no solution, but I do have a solution for all cases. We have to open a given name ticket for every name per alphabet and should not allow adding translations to other languages.

For example Shlomo (Q29017247): it's שלמה in Hebrew. It may be translated as Shlomo (Q21069189) --> Shlomo in the Latin alphabet, or Solomon (Q18607853) ---> Solomon in the Latin alphabet, or Shelomoh (Q29017272) --> Shelomoh, again in the Latin alphabet, and many more that can be seen in "said to be the same as" of Shlomo (Q29017247). Adding שלמה in Hebrew shouldn't be allowed, and there is no need for it, because Solomon (Q18607853) in Hebrew might be שלמה or סולומון. Adding the label שלמה to the Latin name is wrong. In that way every person might have a given name in the Hebrew alphabet and also in the English alphabet, and it might be different for every person. When I open the property "given name" in Salomon Buber (Q982067) and write שלמה, which is his given name in Hebrew, I should get only one possibility that says שלמה with the description שם פרטי גברי, and if I write salomon (his name in the Latin alphabet) I should get only one option with the description "male given name".

On that matter, I don't know if it's possible, but to prevent adding the Hebrew label we should close that option. Items with a Latin given name should have labels only in Latin-script languages. Items in the Hebrew alphabet should have labels only in Hebrew. If it's technically possible, of course, that would be best. Geagea (talk) 07:05, 29 March 2017 (UTC)

@Geagea: I'm not exactly sure what you mean, but I cleaned up your examples so we can better understand how the situation is handled right now (it's misleading to have items with incorrect descriptions as examples). When the language of the description doesn't use the same script as the name, the original name should be part of the description, so that we can differentiate. So Shlomo (Q29017247) should have "male given name (שלמה)" as English description, and Shlomo (Q21069189) "שם פרטי גברי (Shlomo)" as Hebrew description. That way, when we are searching for a name, we can differentiate between all items which share the same transliterations based on the description. --Harmonia Amanda (talk) 07:21, 29 March 2017 (UTC)
And the correct name in the original language as alias in all languages with a different script, of course. --Harmonia Amanda (talk) 07:23, 29 March 2017 (UTC)
And all transliterations of a name in a language should be listed as aliases for that language. So for József (Q17498051), if יוסף is the chosen label (more frequent transliteration?) then ז'וזף should be an alias. But yes @Geagea:, we should never have two items with the same label and description with no means to differentiate between them, and the great thing is that it's technically not possible on Wikidata. So we just have to complete missing descriptions and aliases. --Harmonia Amanda (talk) 07:33, 29 March 2017 (UTC)
I'm ok with your additions. My concern is about the list I get when I'm adding the name שלמה, not about search in general. When I'm adding שלמה as a male given name in Hebrew, I expect to see only one option with the description "שם פרטי גברי". If you now add the Hebrew label שלמה to Shlomo (Q21069189) (which is misleading, as it might also be שלומו), I will see two tickets. And if you add שלמה to the whole list of "said to be the same as", I'll get a list of 8 tickets with the name שלמה and the description "שם פרטי גברי". In my opinion, adding a Hebrew label to names in the Latin alphabet should be prevented by the software. Geagea (talk) 07:52, 29 March 2017 (UTC)
Uh, no @Geagea: you would not see 8 items with שלמה / שם פרטי גברי, you would see one of those and seven שלמה / שם פרטי גברי (Schlomo), שלמה / שם פרטי גברי (Salomon), שלמה / שם פרטי גברי (Shlomo), etc. If the items are clean, the descriptions allow you to differentiate. --Harmonia Amanda (talk) 07:57, 3 April 2017 (UTC)
Harmonia Amanda, thank you for getting to the point.
And which one of them should I choose? Wouldn't it be easier to choose שלמה (Q29017247) in the Hebrew alphabet for a person from Israel, without dealing with the Latin name?
I'm suggesting a simple way to solve it. If we add to a Jewish person his name in the Hebrew alphabet, that will necessarily be correct. We can also add his name in the Cyrillic alphabet (a different name item in the Cyrillic alphabet) and in the Latin alphabet (a different name item in the Latin alphabet). All of them are correct.
First, it is the correct way to add a name. שלמה is a Hebrew name by origin, so why should the Latin alphabet have priority? Anyway, I think that none of the alphabets should have priority and all of them should be respected.
See another problem with that kind of solution. The transcription from English to Hebrew might be correct, and maybe you can add a correct transcription from English to Russian as well. But is the transcription from Hebrew to Russian necessarily correct? What about Greek, Japanese, etc.?
Second, we should think about how to make things simple. It is very simple to tell people to add a name based on the person's original alphabet and, if no such item exists, to create it. The database can be built faster if it is simple for people, and more of them can add given names. Geagea (talk) 10:33, 3 April 2017 (UTC)
Most people don't have multiple given names. If a person is Japanese, their given name is in Japanese. The romanized version of the same name which exists as an American given name isn't a correct value for P735. Yes, there are cases where a person would have the "same" name in different alphabets: when the person changed countries, for example. A Japanese person who later became an American citizen will have both their native given name and their Americanized one with start time (P580). But transliterations of the real given name are not expected values for given name (P735). We keep transliterations as labels/aliases because when the French Wikipedia has an article about a Japanese person, the name will be in the romanized form. Someone who searches for the Japanese given name needs to find it while writing the romanized version (and needs to have clear descriptions to know which one of the proposed items is the correct one). If a Russian person has as given name Alexey (Q29014670) (Алексей), then it's the only expected value and not the dozens of possible transliterations which also exist in their own right (Alexei (Q19820298) (Alexei) can be a transliteration of Алексей but it's also a German given name). People named Алексей are not people named Alexei, even if the first ones can sometimes have their names written Alexei when transliterated. --Harmonia Amanda (talk) 10:51, 3 April 2017 (UTC)
Saying that most people don't have multiple given names is not a solution. Most (maybe all) Jewish people (-Israeli born) have a given name in Hebrew together with another name.
See Yosef Haim Brenner (Q939732). He has a given name in Hebrew, Yosef (Q29051199). He has one name in Cyrillic: Йосеф. Regarding the Latin alphabet (or romanized to Latin, as you have mentioned), we have a few options: Yosef, Josef, Yossef and Joseph. He was born in Ukraine, so he has a given name in Ukrainian, in Russian (it was part of the Russian Empire) and a Hebrew name. Wouldn't it be better to have one Hebrew-alphabet name and one Cyrillic-alphabet name? He can have a few romanized names as mentioned above and also the Arabic name يوسف, which can be יוסף or יוסוף in Hebrew. Hebrew-speaking users can add the Hebrew name, Russian speakers can add the Cyrillic name, and Arabic speakers can add the Arabic name. All the different names can be connected to each other in the section "said to be the same as".
I believe I'm suggesting simple solutions which includes all possibilities. Geagea (talk) 11:58, 3 April 2017 (UTC)
As I don't even understand what your solution is @Geagea:, I'll have to disagree. From what I understand, you want to make it impossible to find names when searching their transliterations, and I don't see any case where making correct items more difficult to find is a solution. We already have people using wrong items (cities, musical albums, etc.) as given names because they don't even think of searching past the first five proposed items; how is making it practically impossible for a French Wikidatian to find a Russian given name going to help us? You are not even clear about what the confusion is; all your examples are based on incomplete items: the obvious solution is to complete them. --Harmonia Amanda (talk) 18:57, 3 April 2017 (UTC)
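
The description convention Harmonia Amanda states in this thread (include the original name in the description whenever the description's language uses a different script from the name) can be sketched as below. This is only an illustration: `script_of` and `build_description` are hypothetical helper names, and guessing the script from the first word of a character's Unicode name is a crude heuristic assumed here for brevity, not something Wikidata does.

```python
import unicodedata


def script_of(text: str) -> str:
    """Crude script guess: first word of the Unicode name of the first letter."""
    for char in text:
        if char.isalpha():
            # e.g. "HEBREW LETTER SHIN" -> "HEBREW", "LATIN SMALL LETTER M" -> "LATIN"
            return unicodedata.name(char).split()[0]
    return "UNKNOWN"


def build_description(base: str, original_name: str) -> str:
    """Append the original name when its script differs from the description's."""
    if script_of(base) != script_of(original_name):
        return f"{base} ({original_name})"
    return base


# English description for the Hebrew-script item Shlomo (Q29017247)
english_description = build_description("male given name", "שלמה")

# Hebrew description for the Latin-script item Shlomo (Q21069189)
hebrew_description = build_description("שם פרטי גברי", "Shlomo")
```

Items whose name and description share a script keep the plain description, so the parenthetical only appears where it is needed to tell otherwise identical search results apart.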

I have difficulties with properties birth name (P1477) and name in native language (P1559). Both are for persons only and of monolingual-text type. Do we actually need both? It feels strange that I am supposed to move (?) data from name in native language (P1559) to birth name (P1477) after a person has married, and add new data to name in native language (P1559) at the same time. Typically our data does not have to be treated like this and can stay in its property even if it “outdates”. An idea would be to merge birth name (P1477) (27k uses) into the more general name in native language (P1559) (228k uses) and use qualifiers such as object has role (P3831) name at birth (Q2507958) to qualify birth names.

  • Has this already been discussed somewhere?
  • What does WikiProject Names think?
  • I would be willing to officially propose a merge at Wikidata:Properties for deletion, if the idea has sufficient support here.

MisterSynergy (talk) 08:02, 14 June 2017 (UTC)

Huh, what else should I do?!
A person has a name which is stored in name in native language (P1559). This name becomes the “birth name” after a name change (e.g. after a wedding), and is superseded by the new name (e.g. “married name”). What to do now with the old name, now “birth name”? It needs to be moved to birth name (P1477) (or duplicated, even worse).
Or do I misunderstand P1477 completely? —MisterSynergy (talk) 17:59, 14 June 2017 (UTC)
You just add a new statement with a start date (and preferred rank).
--- Jura 18:04, 14 June 2017 (UTC)
Sample: If an item has P1559="Michelle Robinson", a new statement with P1559="Michelle Obama" could be added.
--- Jura 18:16, 14 June 2017 (UTC)
This is what I do right now. But what’s P1477 good for then? —MisterSynergy (talk) 18:27, 14 June 2017 (UTC)
For "Michelle LaVaughn Robinson".
--- Jura 18:38, 14 June 2017 (UTC)
@MisterSynergy: birth name (P1477) is also good for people known under a name totally different from their birth name (pseudonyms, actors, fighters, etc.), and also for people who officially changed their name, whatever the reason for the change (migration, trans, religious, etc.) :) --Hsarrazin (talk) 09:07, 21 June 2017 (UTC)
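
To make Jura's suggestion above concrete, here is a minimal sketch of keeping both P1559 values and marking the newer one as preferred, with P1477 holding the full birth name. The `NameStatement` class and `best_values` helper are hypothetical; the selection rule (preferred-rank values win, otherwise normal-rank ones) is an assumption modelled on how ranks are commonly consumed, and no start date is filled in because the discussion does not give one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameStatement:
    """Hypothetical, simplified monolingual-text statement."""
    property_id: str                  # e.g. "P1559" or "P1477"
    text: str
    rank: str = "normal"              # "preferred", "normal" or "deprecated"
    start_time: Optional[str] = None  # qualifier P580, left unset in this sketch


def best_values(statements: List[NameStatement], property_id: str) -> List[str]:
    """Return preferred-rank values if any exist, otherwise normal-rank ones."""
    candidates = [s for s in statements if s.property_id == property_id]
    preferred = [s.text for s in candidates if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s.text for s in candidates if s.rank == "normal"]


statements = [
    # The earlier name stays in place; nothing is moved to another property.
    NameStatement("P1559", "Michelle Robinson"),
    # The newer name is added with preferred rank; a real statement would
    # also carry a start time (P580), omitted here as no date is given above.
    NameStatement("P1559", "Michelle Obama", rank="preferred"),
    # Birth name, as in Jura's example.
    NameStatement("P1477", "Michelle LaVaughn Robinson"),
]
```

Under these assumptions the older P1559 value remains queryable, which addresses MisterSynergy's concern about having to move data between properties after a name change.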

Reorganizing the project page

Apparently some want to move parts of the content to subpages and/or remove it. Oddly undiscussed deletions are deemed consensual, while restores of such deletions are considered "non consensual". In any case, let's restore the version we had before and then check what should be moved.

Are we ok with the "to do list" to be moved to Wikidata:WikiProject Names/to do? Obviously we could mark some as done. Not quite sure if it's worth translating, but apparently this had been done.
--- Jura 10:31, 19 June 2017 (UTC)

I'm ok with the list of properties to be moved to a separate tab (subpage).
--- Jura 10:35, 19 June 2017 (UTC)
The obvious aim of this reorganization was to have an actually usable page, not one that is mostly three years out of date, incredibly heavy, and unreadable on small devices. No consensual point was deleted; some complicated cases were moved to specific help pages, with a link to the generic help page in the new page. That comes from experience with new contributors (I regularly run workshops): overly specific information is intimidating when they are still trying to understand generic use. The only content which was summarily deleted is links to non-functional tools (such as Autolist) and very old subpages which haven't been updated since 2014. All the rest was only moved, not deleted. The "to do list" does not belong on the home page; all its content was rewritten in Wikidata:WikiProject Names/Help, except for misleading entries (like saying that the label should be the same in all Roman languages, which is wrong when the item is about a Russian name, for example; that's true only for Latin-script names, and it should be explained in more depth). We should create more specific help pages, explaining clearly all the complex cases for which we have reached a consensus.
I'm really curious what consensual thing has actually been lost in the change. For example, you added that the priority of the project was given names. That may be your priority, but that's not mine and probably not the priority of the project as a whole. You added that we should separate Spanish and Portuguese names, when @Hsarrazin: and myself have always stated our strong opposition to this practice (we can still debate, but right now it's certainly not consensual).
So, except for things which were added to the old home page without consensus, what did we lose? I know what we gained: an easy-to-read helpful page with links to go further as wanted. --Harmonia Amanda (talk) 11:14, 19 June 2017 (UTC)
It's not entirely clear which changes you proposed. I do think it's a good thing to have a concise summary on one page, but it's a bad idea to spread the explanation over too many pages. If you create separate subpages, you should link them as tabs. No, I don't think we should remove the samples from the explanation. I don't think it should be a problem stating what is or had been the project focus. The last discussion about this is at Wikidata_talk:WikiProject_Names#Project_focus_for_2016.
--- Jura 11:29, 19 June 2017 (UTC)
Not all subpages are deemed to be tabs. For detailed explanation, it's better to have a link in the relevant page section(s). For the project focus, you link to a discussion where you ask for a project focus and get no answer on that (and someone asks for a stats table, which is not a project focus), and for 2016. We are in 2017. Last, you reverted the page to the June 1st, 21:42 CEST stating that it is the last consensual version. Please point to the discussion that defines a consensus about this version. -Ash Crow (talk) 11:49, 19 June 2017 (UTC) Edit: @Jura1:, sorry I forgot to ping you. -Ash Crow (talk) 12:01, 19 June 2017 (UTC) - And with the correct name, it will work better >_< -Ash Crow (talk) 12:02, 19 June 2017 (UTC)
The project focus was stated that way for quite some time, but we haven't revised it (and no one requested it). I fail to see the discussion about the consensual changes... Is this an offline discussion you participated in? The only discussion I was involved in myself was on Hsarrazin's talk page. Maybe another ping you forgot, or one you inserted in a way that it can't work?
--- Jura 17:50, 19 June 2017 (UTC)
I propose a page with only three tabs (home, properties and reports/queries), a clear presentation of the project stating that it's complex and many complicated cases need to have their own help page, listing the few basic principles (only those which are without a doubt consensual, nothing controversial at all in this section), listing the most used tools and gadgets so people know how we operate and then listing the numerous specific subpages the project has (or should have; we should create more help pages). The end result would be a page light enough to be readable and not too large, so it can be viewed on small devices such as phones (I would prefer only two tabs actually, but both properties and reports should be easily accessible, so…), with clear sections so everyone knows where to find information.
I would structure /Help as a list of specific help pages, and I maybe would create a page about all "non consensual practices" which some contributors want and some strongly oppose, which would link to all relevant discussions here. But the actual home page of the project would only list practices we all agree with, which is totally not the case in the old version. --Harmonia Amanda (talk) 12:14, 19 June 2017 (UTC)
I think it's going in the right direction, but there is just a risk that you are splitting things up too much. Removing samples from the explanation isn't ideal either (I don't mind if you change or rotate them).
Personally, I like the general structure of WikiProject Movies:
  • A home page
  • a page that explains how it's done (with properties and more). Currently you seem to split this between "home", "properties" and "help" for WikiProject Names.
  • a page with tools and reports (to build things). Part of the procedural "help" on "help" for WikiProject Names might fit to this.
  • some statistics (most of those on WikiProject Movies are automated).
  • and a page with actual reference lists generated through the project (called "lists", not "reports" as for WikiProject Names).
It also has "new" tabs, but these aren't essential.
Wikidata:WikiProject Czech Republic has a similar structure. Note the difference between maintenance queries and showcase queries. I don't think it matters whether these are lists or queries ..
--- Jura 17:50, 19 June 2017 (UTC)
What would be the advantages of combining reference lists with pure maintenance reports?
--- Jura 09:57, 24 June 2017 (UTC)

Tabs

Ok, vote time: who wants which version?

  1. [1] clean tabs with three links: home, properties, reports and queries
  2. [2] clean tabs with four tabs: home, maintenance reports, reference lists, queries
  3. [3], bad code, three tabs: home, maintenance reports, reference lists

I didn't think that correcting bad code and updating the links would be controversial but if it is, then vote! --Harmonia Amanda (talk) 21:18, 19 June 2017 (UTC)

Notified participants of WikiProject Names

Flow

Notified participants of WikiProject Names

Ok, today I had trouble with pings AND an edit conflict, both of which would have been easily averted if this page used Flow, so I propose that we vote on it.

To sum up the advantages:

  • no edit conflicts;
  • automatic sorting of discussions, the most recent appear first;
  • easy archive management;
  • it is possible to follow particular topics, and get notifications for them;
  • no risk of forgetting to sign;
  • date/time is not set in plain text, so everyone can adjust it to their local time;
  • automatic indentation.

Flow has been successfully in use on the French village pump for six months now, without any downside. Ping @Trizek (WMF): for WMF follow-up on this ;) -Ash Crow (talk) 21:31, 19 June 2017 (UTC)

  1.  Support -Ash Crow (talk) 21:31, 19 June 2017 (UTC)
  2.  Support (even if I don't like flow very much, but it's efficient for notification :) --Hsarrazin (talk) 21:34, 19 June 2017 (UTC)
  3.  Comment Is it correct that you can't search comments made on such pages?
    --- Jura 21:35, 19 June 2017 (UTC)
    First, we need to define the search operation. With Flow, you can use ctrl + f on the page, and results also appear in external Google searches. What is correct is that results for the Topic: namespace aren't currently available to the internal MediaWiki search engine. --Dereckson (talk) 10:45, 20 June 2017 (UTC)
     Comment So practically, there is no way to search archived discussions? I can understand that some users who tend to ignore online discussions may want to achieve that, but I don't think this is desirable.
    --- Jura 09:55, 21 June 2017 (UTC)
    Personally, I noticed this when trying to search for past discussions on Wikidata:Bistro. It probably went mostly unnoticed so far as Flow is mostly used on userpages. It suggests that it's not fit for general use.
    --- Jura 09:41, 24 June 2017 (UTC)
  4. Support. Sure. --Dereckson (talk) 10:45, 20 June 2017 (UTC)
  5.  Comment Another little problem: hiding/deleting the title of a topic requires an oversighter; an admin doesn't have the right to do it. (Phab:T163061) --ValterVB (talk) 11:35, 21 June 2017 (UTC)
  6.  Comment Another issue is history of edits. It is not trivial (and not useful) at Flow. --Infovarius (talk) 12:39, 23 June 2017 (UTC)
  7.  Comment There are a few advantages listed in the initial comment. Have any of these ever been a concern on this page? If there is no advantage, why would we want to lose all past discussions?
    --- Jura 09:59, 24 June 2017 (UTC)
    Yes. As stated in the very first sentence of the initial comment. -Ash Crow (talk) 13:17, 2 July 2017 (UTC)
    The only advantage I am aware of is that it is easy to "thank" for an edit in Flow. Otherwise Flow only makes it more complicated! I am not enrolled in this WikiProject, so I do not vote, but you can count me out of participating in discussions if/when they are turned into Flow. -- Innocent bystander (talk) 13:37, 2 July 2017 (UTC)
    We are still in the phase of evaluating the need for it given the loss of the project's history on one side and a single edit conflict by Ash Crow on the other side.
    --- Jura 15:51, 2 July 2017 (UTC)
    There would be no loss of the project's history. Old discussions are automatically archived when Flow is activated. -Ash Crow (talk) 09:39, 16 July 2017 (UTC)

Combining Santos and Santosh

Notified participants of WikiProject Names

The other day I tried to add a summary to the project page about the different Santos items people made. There are:

  1. Santos (Q18352926) (no language specified).
  2. Santos (Q20813204) (Spanish)
  3. Santos (Q20813183) (Portuguese)
  4. Santos (Q20813195) (Brazilian Portuguese)

One would use (1) unless it's known that one of the more specific ones applies. All are spelled "Santos", but pronunciation can be quite different.

Personally, I don't think that this is a priority, but, if people want to use them, I don't see why not. I'd restore the corresponding summary in the page.

It's not fundamentally different than items we always had for Jean (fr) and Jean (en).
--- Jura 10:09, 24 June 2017 (UTC)

I don't see any fundamental difference between those 4 items, which are all written in the same alphabet, other than the pronunciation... the origin of the name is the same, the writing is the same, the meaning is the same...
the Name project will become absolutely impossible to manage if the only difference is phonetic... since in every language, and even in the same language, in different regions, the pronunciation of a word can be different...
I cannot see any reason why those should not be merged
it is very possible to complete a different IPA transcription (P898) for different pronunciation in different languages on the same item, corresponding to the same writing  :/ --Hsarrazin (talk) 12:18, 25 June 2017 (UTC)
How will interwiki links be managed for these items? How can we link related Wikipedia articles together in the structure that you suggest? Remember, hosting interwiki links is also the primary purpose of Wikidata, and this purpose is not served if we split quasi-equivalent concepts on 5 different Wikidata items. Place Clichy (talk) 08:28, 27 June 2017 (UTC)
    • Infovarius: I think it's good to revisit questions once in a while, nothing to be angry about. A potential maintenance issue Hsarrazin mentions could be solved for Santos and Jean if we use a different P31 value for 2./3./4. above. Do you agree?
      --- Jura 15:57, 2 July 2017 (UTC)
User:Place Clichy if I understand your question correctly, the answer to your question is the use of said to be the same as (P460) :)
sorry, Infovarius, I don't understand if you're angry with my proposition or Jura1's ... :)
we are now trying to solve the non-Latin name problem (Konstantin or other) by using name in native language (P1559) to distinguish if the writing is the "original" writing of the name, or a transcription in another alphabet... this way, we can recognize the names, whatever the original writing is, including Cyrillic Russian, Ukrainian, Belarusian, or other... don't you agree ? --Hsarrazin (talk) 17:13, 2 July 2017 (UTC)
@Hsarrazin: As far as I know, said to be the same as (P460) does not provide interwiki links between related items. (Maybe it should, but at the moment it does not.) Therefore, in my opinion items considered to be the same should be merged if it is the technical way to provide these interwiki links. The fact that the same first name is spelled Wladimir in German, Vladimir in English and Vladimír in Czech does not change the fact that it is the same name, and actually pronounced the same, and interwiki links between the corresponding Wikipedia articles are legitimate. Splitting these items results in removing legitimate and useful navigation links. My concern is for the casual reader, who actually reads Wikipedia, not Wikidata. I have made this point at Project:Names before. I also agree with arguments put forward by Infovarius on the Latin-centricity of this issue. I am not in favour of merging all items of similar-looking or related names though (such as all versions of John/Ivan/Giovanni/Hans, that would be absurd) : in my opinion, a good place to stop is any time you find two concurrent articles in the same Wikipedia language (in some cases these Wikipedia articles can also benefit from being merged, but that is another issue). Place Clichy (talk) 09:23, 4 July 2017 (UTC)
I think we answered your question before and provided a solution. Is there a Wikipedia where you contribute and need help implementing it?
--- Jura 09:29, 4 July 2017 (UTC)
it is possible to build a template that uses said to be the same as (P460) to bring interwiki links on a specific wiki. It has been done on some wikis... Need to search more to find where... --Hsarrazin (talk) 09:38, 4 July 2017 (UTC)
  • no, Jura, this is absolutely contrary to the option that has been taken to use a specific item for each string... you can make it (because you know these specific languages) for Santos, but you (and I) would not be able to recognize thousands of cases where other people would think it would be better... it is an open door for nonsense and totally unmanageable project. Please, don't ! and this is a  Strong oppose --Hsarrazin (talk) 09:38, 4 July 2017 (UTC)

Royalties and family names

Gustaf V of Sweden (Q52890) has birth name (P1477):"sv:Oscar Gustav Adolf Bernadotte" and family name (P734):Bernadotte (Q21502260). I am not 100% sure, but I think that is wrong. It is unusual for Swedish royals to have family names at all as long as they stay inside the royal house. Princess Madeleine, Duchess of Hälsingland and Gästrikland (Q212035) also has such claims, and there I am pretty sure it's wrong. Sigvard Bernadotte (Q447209) used this family name, but I doubt he was born with it. How do you handle persons without family names?

Another problem comes with older names, from times when the orthography had not become stable yet. Gustav I of Sweden (Q52947) has Gustav (Q746076) as "given name". How do you select which spelling you should use? -- Innocent bystander (talk) 14:28, 28 June 2017 (UTC)

That is the problem, Swedish spelling wasn't invented yet in the 16th century. -- Innocent bystander (talk) 07:12, 29 June 2017 (UTC)
We could try to transcribe the spelling on the image I added to the right of this section (File:Autograf, Gustaf I, Nordisk familjebok.png).
--- Jura 07:21, 29 June 2017 (UTC)
In this case it looks like "ff" in the end. But this was not the only spelling he used. As I told you, he couldn't spell. Nobody could, not even those who translated the bible. -- Innocent bystander (talk) 13:46, 2 July 2017 (UTC)
What would you use?
--- Jura 15:24, 2 July 2017 (UTC)

Once in a while, I come across a name that seems hard to de-compose into properties. I think it would be good if we had a subpage just for asking about such cases. Maybe there is a better subpage name.
--- Jura 09:51, 4 July 2017 (UTC)

Cleaning up the category tree

Today I was looking at the subcategory tree for "anthroponym". It looks like there are many opportunities for merging here, but I think a basic hierarchy should be established first. Has anyone done any work on that, or does anyone know of an external ontology for names that we could use as a model? - PKM (talk) 23:18, 6 August 2017 (UTC)

This project mainly uses the ones listed on the project page: given name (Q202444), female given name (Q11879590), male given name (Q12308941), unisex given name (Q3409032), family name (Q101352), and name.
There are a few sub-classes with misspelled English labels that are mostly unused and could be merged into these or deleted.
--- Jura 04:42, 7 August 2017 (UTC)

Items needing splitting

There're

They need splitting.--GZWDer (talk) 18:11, 16 August 2017 (UTC)

The above report lists given names that currently lack native label (P1705). It's ordered by the number of P735 statements using them. Ideally, every given name item has such a statement. When creating new items, please add one.

Currently the list has mainly Latin script versions of Japanese names. Sample: "Koji" (Q1250765). Ideally, these items wouldn't be used that frequently, specifically not for people born in Japan. The applicable item at Q1250765#P460 would be better.
--- Jura 08:26, 5 September 2017 (UTC)

Distribution maps: add or remove Property:P1846 ?

c:Category:Name maps has maps for given names, possibly suitable for distribution map (P1846).

Sample: File:Popularity of name Irene.svg (also on the right side). It's described as "World map showing the countries where the names Irene, Irina, Irena, Ena, Arina, Ira, Irëne, Erina, Irini, Irén, Iryna, Iren, Irène, Arja, Iria or Irja are popular. Any color means the name is among the 100 most popular names either in the population or among newborns. Map created using ZiMapMaker and data from multiple sources."
It's now at Q389528#P1846 and the items for Irina, Irena, Ena, Arina, Ira, Irëne, Erina, Irini, Irén, Iryna, Iren, Irène, Arja, Iria or Irja.
The user who first created them doesn't seem to be active any more.

At some point, I added the maps to a few Wikidata items, mostly items for the name in the filename.

The question is if we should add these maps to more items (maybe 3000) or remove them (from maybe 300 items).
--- Jura 08:13, 6 September 2017 (UTC)

English descriptions for first names

As we have "Olga" (for Ольга) and "Olga" (for Olga), I wonder if we should start including the native label value in the description even if it's the same spelling as English.
--- Jura 16:29, 9 September 2017 (UTC)

Milestone .. kind of

There are now 50,000 people (fictional or not) with P735="John" !
--- Jura 16:29, 9 September 2017 (UTC)

MacKenzie and Mackenzie

There are family names MacKenzie (with uppercase “K”) and Mackenzie (with lowercase “k”). Shall items about persons with that surname use Mackenzie (Q13553907) for family name (P734) regardless of the spelling, or do I need another family name item for the uppercase variant? At enwiki, both are listed on en:Mackenzie (surname). —MisterSynergy (talk) 20:22, 22 September 2017 (UTC)

compound given names

I did not find information about compound given names, particularly how to distinguish them from multiple given names. I’d like to make some claims and ask you whether they are correct:

  1. Given names such as "Hanspeter" (no hyphen or space) are always compound given names, so use Hanspeter (Q18012619) instead of Hans (Q632842) and Peter (Q2793400)
  2. Given names such as "Karl-Heinz" (with a hyphen) are always compound given names, so use Karl-Heinz (Q1729801) instead of Karl (Q15731830) and Heinz (Q11682369)
  3. Now the more difficult part: there are compound given names such as "George Washington" (George Washington (Q16275947)), "Mary Ann" (Mary Ann (Q18083402)), or "José Luis" (José Luis (Q20856658)) with spaces between the parts. How do I tell these cases apart from multiple given names?

MisterSynergy (talk) 09:44, 13 October 2017 (UTC)

Little help

Hi! I knew nothing about name items until a few weeks ago, but I am refining a cluster of items for this event. As I will also say at the village pump, the page will be visited by thousands of people who will discover Wikidata.

I am doing my best to improve the item in my spare time; as a result I also started to create new items such as Q41480368. I try to limit myself to what is necessary (commons missing surnames for example) and I learn by looking at your edits on my pages, but these days I am very busy, so I am asking you for some advice and also whether you can review everything in that area. I can see from your next edits what else I can do. I am creating two questions here as separate threads. I try to do my best; sorry if it is written somewhere and I did not find it on my own.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

A missing item of middle name, similar to a more common form

My first question is about a guy with three names. I have learned how to write the numeral, but the second name is a rare form. See also Talk:Q42297641. What should I do now? Create it? Another guy uses the form in his middle name: Q35285685. If you make an item, can you show it to me? Thank you.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

Brazilian double surnames

My second question is about a very rare Brazilian surname (a wrong transliteration from Italian) of Q42177679. I wanted to add it, but it is very rare; I cannot find anyone with decent IDs with it. I removed the SCOPUS alias that "mistakes" it as a middle name (see here), but I don't like that. The alias is given by Scopus and should be kept, but how to indicate that Guarnetti is a surname? Same problem with Brazilian Q42149058. Other examples that are not addressing the problem are Q13632547, Q10361254 and so on. Any advice here? Thank you.--Alexmar983 (talk) 12:08, 21 October 2017 (UTC)

Two occurrences are enough for an item of a surname

When I create an item for a person whose surname has no item here on Wikidata, I create the surname item if at least one other person item with that surname exists. Is this OK as a strategy?--Alexmar983 (talk) 11:48, 28 October 2017 (UTC)

One is sufficient if you are sure it's the family name. When doing large numbers of first names, I used 2 or even more, just to make sure I didn't create too many for misspellings.
--- Jura 08:57, 5 November 2017 (UTC)
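
The occurrence threshold discussed in this thread could be sketched roughly as below. This is only an illustration, not an existing tool: the function name surname_candidates, the min_occurrences parameter, and the sample list (including the deliberate misspelling "Crawfrod") are all assumptions made for the example.

```python
from collections import Counter


def surname_candidates(surnames, min_occurrences=2):
    """Return surname strings seen at least `min_occurrences` times.

    Hypothetical helper: `surnames` is assumed to be a plain list of
    surname strings already extracted from person items.
    """
    # Normalise surrounding whitespace and skip empty values.
    counts = Counter(s.strip() for s in surnames if s and s.strip())
    return sorted(name for name, n in counts.items() if n >= min_occurrences)


# Made-up sample; "Crawfrod" stands in for a one-off misspelling.
sample = ["Guarnetti", "Crawford", "Crawford", "Crawfrod", "Mackenzie", "Mackenzie "]
print(surname_candidates(sample))
print(surname_candidates(sample, min_occurrences=1))
```

With a threshold of 2 the one-off misspelling is held back; with a threshold of 1 every distinct string becomes a candidate, which matches the trade-off described in the reply above.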

Person, Jr. vs. Person Jr.

The English Wikipedia is harmonizing on "Person Jr." instead of "Person, Jr.", should we too? --Richard Arthur Norton (1958- ) (talk) 13:34, 3 November 2017 (UTC)

This feels like something where harmonisation (rather than reflecting local practice) is perhaps going to be a bit confusing - I'm surprised WP are doing it. Andrew Gray (talk) 20:49, 3 November 2017 (UTC)
One reason I was against it at the vote in Wikipedia was because the other projects are not going to do it, so Commons, Quote, Data and the other projects are not following it. When people add names in the body of articles, they still add them under the old convention, which leads to inconsistency. --Richard Arthur Norton (1958- ) (talk) 16:35, 4 November 2017 (UTC)
w:Wikipedia:Manual_of_Style/Biographies#Generational_and_regnal_suffixes spearheaded by w:User:Dicklyon. "Omission of the comma before Jr., Jr, or Jnr, and Sr., Sr, or Snr, is preferred. The comma can be used in cases where it is clearly and consistently preferred for a particular subject in current, reliable sources (most likely a living subject whose own preference is clear and consistent). Articles should be internally consistent in either omission or use of the comma for any given person's name." The vote was against allowing a bot to change all, so they are doing one at a time manually. The key to changing all is the word "current", it isn't based on a simple Google count of "Person, Jr." vs. "Person Jr." --Richard Arthur Norton (1958- ) (talk) 14:06, 6 November 2017 (UTC)

Sometimes two given names in Latin script just happen to be spelled the same and we can differentiate them by gender, e.g. Jean (Q7521081) [m] and Jean (Q7521081) [f]. These are linked together with opposite of (P461). When querying for Latin script given names, this allows excluding them directly. It occurred to me that the same works for the various Santos items. This way only Santos (Q18352926) shows up when one tries to find items that can be matched against the name of a person. Also, when one tries to merge duplicates, these items won't get in the way.
--- Jura 18:47, 19 November 2017 (UTC)
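
As a rough illustration of the filtering idea in this comment, the sketch below keeps only name items that carry no opposite of (P461) statement. It works on in-memory records rather than the live query service; the record layout, the function name matchable_name_items, and the specific P461 links between the Santos items are assumptions for the example, not a description of the actual data.

```python
def matchable_name_items(items):
    """Return IDs of items without any 'opposite of' (P461) statement.

    Hypothetical helper: each item is assumed to be a dict with an "id"
    and an optional "P461" list of linked item IDs.
    """
    return [item["id"] for item in items if not item.get("P461")]


# Item IDs are the Santos items listed earlier on this page; the P461
# values are invented purely to show the mechanism.
santos_items = [
    {"id": "Q18352926", "label": "Santos", "P461": []},
    {"id": "Q20813204", "label": "Santos", "P461": ["Q20813183", "Q20813195"]},
    {"id": "Q20813183", "label": "Santos", "P461": ["Q20813204", "Q20813195"]},
    {"id": "Q20813195", "label": "Santos", "P461": ["Q20813204", "Q20813183"]},
]
print(matchable_name_items(santos_items))
```

Under these assumptions only the generic item remains as a match candidate, which is the behaviour described above.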

Overview at WikiProject

Rowing has a nice overview of birth names and given names for various names with non-Latin script: Wikidata:WikiProject Rowing/reports/P1559 for rowers (non-latin script).
--- Jura 13:15, 22 November 2017 (UTC)

Preparing dataset - some questions

Hello! I have almost finished matching Latvian names with OpenRefine, will create something like 500 new items. I have some doubts about two cases:

  • Agne is Latvian female name, but Agne (Q4926003) currently is instance of male name.
  • Ardis is Latvian male name, but Ardis (Q25113822) currently is instance of female name.

Should I change them to unisex name or create new items for these? --Papuass (talk) 14:36, 4 December 2017 (UTC)

  • I'd do new items, with language of name = Latvian. If possible, please include "native label" and "writing system" statements.
    --- Jura 16:22, 4 December 2017 (UTC)

Qualifier for name format

There are several properties that provide the name as a string.

To identify different namings, I'd like to add a qualifier to some of the name properties. Items might still need to be made. These could describe:

  • name format: first name, family name
  • name format: first name, patronymic, family name
  • name format: first name, another given name, family name
  • etc.

These items could provide more information in a structured way.

Not sure about which qualifier to use. Maybe criterion used (P1013), has characteristic (P1552), instance of (P31).
--- Jura 09:43, 14 December 2017 (UTC)
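
To make the proposal a little more concrete, here is a minimal sketch of how a name string, once split into labelled parts, could be mapped to one of the format values listed above. Everything in it is an assumption for illustration: the role labels, the NAME_FORMATS table, and the describe_format helper do not correspond to existing items or tools.

```python
# Hypothetical mapping from a sequence of part roles to a format label.
NAME_FORMATS = {
    ("given", "family"): "first name, family name",
    ("given", "patronymic", "family"): "first name, patronymic, family name",
    ("given", "given", "family"): "first name, another given name, family name",
}


def describe_format(parts):
    """Return the format label for a list of (text, role) pairs, or None."""
    roles = tuple(role for _text, role in parts)
    return NAME_FORMATS.get(roles)


example = [("Lev", "given"), ("Nikolayevich", "patronymic"), ("Tolstoy", "family")]
print(describe_format(example))
print(describe_format([("Tenzin", "given"), ("Gyatso", "given")]))
```

A combination that is not in the table simply returns None, which would be the signal that a new format item might still need to be made.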

Etc. = 8 (personal name system (Q16655449) for example)? --Fractaler (talk) 10:32, 14 December 2017 (UTC)
We have 2 formats/formulas: full nominal formula/format, full name (second column) and short name, modern short formula/format (third column). --Fractaler (talk) 11:01, 14 December 2017 (UTC)
The idea is to use the one value that describes best the string found in P1477/P1559/etc. All these values could be instances of Q16655449. So yes, maybe 8, 16, etc values, at least if they are different in one way or the other.
I think @Matěj_Suchánek: tried to parse some of these to generate P735/P734. With some surprises obviously.
--- Jura 11:09, 14 December 2017 (UTC)

Tibetan names

Hi, I'm wondering how to deal with Tibetan names. For example, in the case of the Dalai Lama Tenzin Gyatso (Q17293), he has a birth name and a Dharma name (Q1543672). Both are technically given names (most Tibetans don't have family names), and each one contains two names (parts). The birth name is ལྷ་མོ་ (Lha-mo) + དོན་འགྲུབ་ (Don-'grub), and the dharma name is བསྟན་འཛིན་ (Bstan-'dzin) + རྒྱ་མཚོ་ (Rgya-mtsho) [actually the full dharma name is much longer, but it's often shortened to those two parts]. So should we put all 4 names into given name (P735)? Or maybe create a separate property for the dharma name? If we just use given name (P735), which qualifier should be used to distinguish the two names? Any thoughts? --Stevenliuyi (talk) 16:42, 23 December 2017 (UTC)

I went ahead and added the names Q17293#P735. Please let me know if my approach is appropriate. --Stevenliuyi (talk) 21:35, 9 January 2018 (UTC)

Duplicate Japanese name?

I just found Hamano (family name, instance of (P31)family name (Q101352), language of work or name (P407)Japanese (Q5287)) and Hamano (Japanese family name (浜野), instance of (P31)family name (Q101352), language of work or name (P407)Japanese (Q5287), also has native label (P1705), name in kana (P1814), writing system (P282)). Is this a duplicate item that should be merged, or is this correct (perhaps multiple forms of a name, e. g. original Japanese vs. English transcription)? —Galaktos (talk) 21:52, 27 January 2018 (UTC)

✓ Done merged after no one replied in two weeks… in case it was wrong and the items need to be split after all, the only page that linked to Q26203631 (now a redirect) was Hamaya (Q26203632). --Galaktos (talk) 17:56, 11 February 2018 (UTC)

Wiktionary

As we are stuck to spellings, why not add sitelinks to Wiktionary articles about these specific spellings of names? It would also help to prevent unnecessary merges. --Infovarius (talk) 13:53, 2 February 2018 (UTC)

Property Surname

For all persons not having a surname, i.e. royals, Icelanders, Norse and so on: shall those have family name (P734) with "no value"? Breg Pmt (talk) 19:08, 16 February 2018 (UTC)

Initials like « T. » as name classes ?

Just seen the T. (Q19803520) item. It’s noted as « instance of : first name ». I’d prefer at least noting this as « subclass of : first name » because an initial stands where many first names can stand. I guess that the creator @Jura1: did this to be able to use this in the « first name » property without constraint violation, or in a batch mixed with actual names. Can’t we do better ? Like allowing the constraint on « first name » to have a class of names as values ? Doing things this way seems to me like shoehorning our model incorrectly to make the constraint system happy. It is (I think) to avoid such things that Wikidata did not have a constraint system in the first place :/ author  TomT0m / talk page 14:31, 22 February 2018 (UTC)

Different transliteration of the same firstname : different items ?

Hi everyone, I've created an item for a woman from Russia who lived in France ; I only know the transliteration of her first name, which is Glafira ; I've seen that there are other women with this transliteration, whose first name is Глафира; however, in the item of the first name Glafyra (Q4139559), the transliteration used in French is Glafyra ; should I add Glafira as an alias or create a new item ? Thanks ! Léna (talk) 13:23, 17 March 2018 (UTC)

  • I noticed you already did: Q50671449. Yes, I'd use that on the item for that person, assuming it's the spelling she is using in Latin script. You might have noticed the Latin script statement I added on Q50671449. Also, I'd add a second statement on the item for the person with the (also) new Q50675025 for the Cyrillic script version of the name of the person.
    --- Jura 14:38, 17 March 2018 (UTC)

Plowman

Hi everyone, Plowman (Q1513274) is currently both a family name and a Wikimedia disambiguation page; I've sorted a few of these out recently but in this case it looks like the linked English wikipedia page is a page for the family name and the linked German wikipedia page is a disambiguation page. There were originally two separate items but they got merged in February 2015.

Am I correct in thinking these should be two separate items? I'm not sure how to split them out and retain the correct wiki links... Thanks in advance ! WhiteHartLane (talk) 05:18, 9 April 2018 (UTC)

Multiple issues from Wikidata:Interwiki conflicts

Hello, there are multiple issues with names listed in Interwiki conflicts. Could somebody take a look at these and resolve them using your WikiProject's technique? I'm new to Wikidata, so I'd rather not try it by myself, but I know you have defined rules and I hope you'll help with those:

  1. Kamilla×Camilla×Kamila
  2. Jonáš×Jonah
  3. Sarah×Sara
  4. Irene×Irene×Irina
  5. Luigi×Louis
  6. Jenovéfa×Genoveva×Geneviève

They should be mostly fine, but some of them have been listed in interwiki conflicts for years. --Dvorapa (talk) 20:46, 24 May 2018 (UTC)

  • I had a look at them. Most seem fine now as far as this WikiProject goes. However, one is about a person and another about disambiguation pages.
    --- Jura 07:21, 23 June 2018 (UTC)

Shouldn't names be lexemes instead of items?

Hi,

I am not sure if this has already been suggested somewhere: apologies if so.

Now that we have lexemes, I wonder whether they would not be more appropriate than items to model names. Name items are basically determined by their native label (P1705) and their labels tend to be identical to that string: that seems to indicate that names do not really belong to the conceptual domain, but rather to the lexical one. Names can have variations by gender or case: that seems to match pretty well the structure of forms of a lexeme. Furthermore, people already create lexemes for names (such as Lydia (L362)) - surely we do not want to maintain the same database in two different namespaces?

Lexeme support is still experimental and no mass import should be done, so in any case we should wait before migrating anything.

Pinging Tpt who proposed something similar on IRC (I think). − Pintoch (talk) 14:41, 16 June 2018 (UTC)

 Comment We have lexemes, yes, but unfortunately only for single languages. Lydia (L362) = Lydia as a German name.
We need to implement Script first:
Lexeme: Lydia; Script: Latin alphabet; Lexical category: proper noun (Q147276), given name (Q202444), feminine (Q1775415)
--Kolja21 (talk) 00:49, 17 June 2018 (UTC)
It also seems to me that first and last names are much more lexemes than usual entities. About the script problem, we could maybe just create some macro-language items like "languages written in Latin script" and use language codes like "mul-Latn". Tpt (talk) 17:44, 18 June 2018 (UTC)


todo (family name items)

Items with P31=family name

  1. mostly ✓ Done delete P31=disambiguation (or P31=family name). A check is at User:Jura1/family names/items (P31: nok - conflicts with disambiguation)
  2. delete P31=given name. Some left, check at User:Jura1/first names/items (P31: not ok)
  3. check P279 on such items
  4. check/fix subclasses. ✓ Done. All items have P31=family name
  5. mostly ✓ Done fix English descriptions of items with P31=family name
  6. mostly ✓ Done add missing English descriptions of items with P31=family name
  7. mostly ✓ Done fix English descriptions of items with P31=disambiguation and description = "family name" or "surname"
  8. mostly ✓ Done fix English descriptions of items with P31=disambiguation and description containing "name" or "family"
  9. mostly ✓ Done check links from P734, create new items when needed
  10. delete enwiki sitelinks to disambiguation pages, add to dab items
  11. mostly ✓ Done move enwiki sitelinks of surname pages on dab items
  12. add writing system property. Some done
  13. add native label property. Some done
  14. add Commons sitelinks. Added some, more being done by Laddo
  15. merge possible duplicates
  16. mostly ✓ Done fix Wikidata:Database_reports/Constraint_violations/P734
  17. add P734=novalue to items for people who don't have one
  18. link available dictionary entries
  19. add missing Soundex
  20. add more P734 statements
  21. TBD
  22. notify @Mike Peel: he might need to reload the lists for the P734 bot

Some cleanup I'm currently doing. Might take some time to complete. I started out with one and found that I had to do the others.
--- Jura 08:41, 14 July 2018 (UTC)


Roman numerals in names of kings, popes, etc.

Should we (do we?) have items for the numbered part of kings' and popes' names? We have initials instead of given names (Q19803443) for initials, maybe we could have e.g. V (alias "the fifth", 5, etc.) to use with the given name (or another?) property? --Azertus (talk) 16:54, 12 September 2018 (UTC)


Currently, some items include "(frequent) first names in .." statements with that property. Sample: Q4925477#P793. While we agreed that this isn't exactly optimal, we didn't really have a good replacement either. In the meantime there is attested in (P5323). I think that could work fine. --- Jura 04:51, 14 September 2018 (UTC)

That sounds good, but I think attested in (P5323) needs both some property examples and a slightly more "family friendly" description. Maybe it's just me, but I'm not that clear about what constitutes a "lemme" in this context. Moebeus (talk)

Featured item: Crawford (Q20731004)

As we didn't have a family name one before, I added the above. It's a somewhat random choice except that all items about people that should use it as value in P734, do have it. Also related items are available and used correctly as well. --- Jura 04:51, 14 September 2018 (UTC)


The above lists recently created items and highlights some elements. Ideally each item would have at least the "writing system" property set. The remainder can be completed by various tools. --- Jura 04:51, 14 September 2018 (UTC)

Statistics

Wikidata:Lexicographical data/Ideas of queries has plenty that could also work for names. I will try to add some to Wikidata:WikiProject Names/numbers as weekly listeria lists. Can we add the stats page to the navbar? --- Jura 12:27, 23 September 2018 (UTC)

Latvian and proper nouns

@Jura1: you are welcome to read this article and familiarize yourself with the way Latvian deals with proper nouns. TL;DR version: there won't ever be a proper noun written as "Simanovich" in Latvian, let alone a surname page for it, as all proper nouns are converted into the Latvian alphabet and grammaticised. –Turaids (talk) 17:21, 20 October 2018 (UTC)

Baptismal name and native name

Notified participants of WikiProject Names

Paquiquineo (Q5293029) is an interesting case. His native name was transcribed as Paquiquineo, but we don't know what language this was (though it was almost certainly an Algonquian language). He was baptized as an adult in New Spain and took the name "Don Luís de Velasco" after the viceroy. Certainly his given name is not "Don" and I have removed that, but how can this be better modeled? - PKM (talk) 20:52, 20 October 2018 (UTC)

Lexeme as names, this is happening

Notified participants of WikiProject Names

Trying to answer a question on Request a Query, I wrote a query about names that found lexemes :

select ?nameEntity ?name (lang(?name) as ?lang) {
  ?nameEntity wdt:P31/wdt:P279* wd:Q101352 .
  optional {
    ?nameEntity (wdt:P1705|wikibase:lemma) ?name .
  }
   ?nameEntity a ontolex:LexicalEntry . # check the lexeme namespace only
}

It’s interesting to see that the « family name » item is also used for instance of (P31) statements about lexemes. I don’t really know if it’s a good practice or not, or what it implies, but this is happening. This is an occasion to raise the question : should most name items actually be migrated to lexemes ?

author  TomT0m / talk page 09:35, 31 October 2018 (UTC)

@TomT0m: It may be worth putting this question to Wikidata talk:Lexicographical data as well (I found this thread by accident). KaMan (talk) 16:48, 31 October 2018 (UTC)
@KaMan: Suit yourself, I’m not really actively involved in discussions in either place, just passing the information :) author  TomT0m / talk page 17:00, 31 October 2018 (UTC)
And even more personal names:
select ?nameEntity ?name (lang(?name) as ?lang) {
  ?nameEntity wdt:P31/wdt:P279* wd:Q202444 .
  optional {
    ?nameEntity (wdt:P1705|wikibase:lemma) ?name .
  }
   ?nameEntity a ontolex:LexicalEntry . # check the lexeme namespace only
}

--Infovarius (talk) 13:13, 2 November 2018 (UTC)

Question about non-Latin names

If we have a name written in a non-Latin script, and some language has two versions of the transliterated name, how should we handle this case? Or what if a language uses a transliteration different from other languages? --ValterVB (talk) 20:01, 3 December 2018 (UTC)

The status quo is that we create as many items for different versions of the name (transliterations, transcriptions, translations...) as we want. The principles are the same as you prefer for disambigs: every spelling is worth its own item. --Infovarius (talk) 19:51, 4 December 2018 (UTC)

The Morphology of Steve

Just came across The Morphology of Steve (Q50422077) — a paper by and about people named Steve or variants thereof (e.g. Stefan, Stephanie). --Daniel Mietchen (talk) 22:05, 8 December 2018 (UTC)

Fictional names

Hi, I encountered Cambremer (Q2387635), which seems to be about a fictional name, not about a character. What is the best way to handle this? Should I create a new item <fictional family name> and make this an instance of it? (There are also other rather famous fictional names, like "Buddenbrook", the family name of the title characters of Thomas Mann's Buddenbrooks (Q326909)). Beyond the fact that there are articles about fictional names such items could be useful to describe the origins of the fictional names (inspiration etc.). If there are fictional names, how should they be linked to real names if there happen to be people actually having this family name (even though the author made it up)? Maybe via fictional or mythical analog of (P1074)? Any ideas? - Valentina.Anitnelav (talk) 10:58, 16 December 2018 (UTC)

Seems to me that a name can't be fictional. It can be invented by some author, yes (but some "real" names for real persons were invented by some human too). It is used by fictional character, ok. But it (not its analogue) can be (potentially) used by a real person. Actually real names are creative works as much as names of characters. --Infovarius (talk) 23:19, 16 December 2018 (UTC)
Not everything human-made is fictional but I agree that this is odd as for every fictional name there could exist a person that carries this name. But then: what to do with Cambremer (Q2387635)? Just make it an instance of family name (Q101352) even though it is not a page about a name in reality (concerning its distribution, variations, etc.)? Make it an instance of fictional family (Q15331236) even though it is rather about the name? - Valentina.Anitnelav (talk) 06:31, 17 December 2018 (UTC)
I would simply use family name (Q101352). Hm, there are several families in Harry Potter and they are not marked as surnames... --Infovarius (talk) 20:52, 17 December 2018 (UTC)
An item that is a subclass of (P279) family name (Q101352) seems reasonable, to enable finding/querying/… those names. How about a label like "family name of fictional characters", or something similar but more elegant? --Marsupium (talk) 01:49, 27 December 2018 (UTC)

Carmel: constraint with P1533

Hi,

I created Carmel (Q60439268) as a male name. There is also Carmel (Q20001709) as a female name. The names are identical, but I thought that if male given name (Q12308941) and female given name (Q11879590) exist, then there should be distinct items.

I see a message about a constraint there: that they cannot both have the same value for family name identical to this given name (P1533).

Is this a good constraint?

Should there be just one item for both male and female names?

Is this perhaps an FAQ that should be documented somewhere?

Thanks! :) --Amir E. Aharoni (talk) 14:45, 5 January 2019 (UTC)

CJKV names

How do we model Chinese, Japanese, Korean, and Vietnamese names, which are usually combinations of 1-3 normal words? We don't seem to have an agreed convention on this yet and it's time that we decide. Deryck Chan (talk) 13:58, 1 February 2019 (UTC)

We model them exactly the same as others, with use of native label (P1705), and additional properties if needed (kana for Japanese, for example). Yamamoto (Q24090378) is an example. --Harmonia Amanda (talk) 21:22, 2 May 2019 (UTC)
@Harmonia Amanda: Do we treat the parts of the name as separate items, or treat the whole given name as one item (and create items for every combination of given name that is used by a notable person)? e.g. is Jacky Cheung (Q16781)'s given name "學" [series ordinal 1] "友" [series ordinal 2], or "學友"? Deryck Chan (talk) 13:59, 3 February 2020 (UTC)
@Deryck Chan: For CJV, we create the "compound" given names always. It would be "學友". For Korean, it should be the same (like Jeong-hwan (Q16256121) "정환"), but Korean Wikipedians added some given names in two parts years ago. So now there are thousands of entries using "연" (1) "준" (2) instead of "연준". Most of these have Korean given name element (Q69680123) as an additional P31. Sometimes I try to create the missing compound given names and replace all the uses, but it's not that frequent. There are so many simply missing given names in these languages that "cleaning up" this Korean specificity doesn't seem worth it (but if you are motivated, I'll help, I'm just discouraged doing it alone ^^). --Harmonia Amanda (talk) 14:11, 3 February 2020 (UTC)
@Harmonia Amanda: Understood. Yeah it does sound like a humongous task to try to automate. Deryck Chan (talk) 16:30, 3 February 2020 (UTC)

Property for Digital Dictionary of Surnames in Germany

Dear everyone, recently the property for the Digital Dictionary of Surnames in Germany (Q61889795) was approved: Digital Dictionary of Surnames in Germany ID (P6597). Maybe some of you already saw this.

As noted in the property proposal, my colleagues and I want to look into making the DFD available in mix’n’match and/or doing some reconciliation ourselves, using OpenRefine. This will (probably) unfold over the coming months.

I hope this may be helpful. Best wishes, Julian Jarosch (digicademy) (talk) 10:50, 15 March 2019 (UTC)

Thanks for the suggestion! I’ll keep that in mind. The DFD lists which language(s) a name belongs to – we’d have to compare this to the language of the monolingual text in native label (P1705).
By the way, our current plan is that we’ll work on matching DFD entries in late autumn / early winter as a concentrated project. I’ll give updates when the plans substantiate.
Julian Jarosch (digicademy) (talk) 14:59, 24 June 2019 (UTC)
@Julian Jarosch (digicademy): I don't think you need to do that. There should be just one item for a given spelling in Latin script. --- Jura 16:52, 4 July 2019 (UTC)
OK! Julian Jarosch (digicademy) (talk) 13:21, 10 September 2019 (UTC)

Update on our intended reconciliation efforts: Starting in early October until just before Christmas, a student will work on mix’n’match and OpenRefine. When they start, they’ll probably introduce themselves. I’ll be mentoring the student, and I will (also) be available if there are any queries or concerns. (As stated on my user page, what I do is paid editing (though it’s scientific work), and for the student, it will be part of their studies. I hope I’m sufficiently transparent about our plans and intentions.) So, there should be some developments in just about a month! Julian Jarosch (digicademy) (talk) 13:21, 10 September 2019 (UTC)

Hello everyone, I am a student of Digital Humanities. Since last week I am working on the project to link the articles of the German digital family name dictionary with the Wikidata entries and will work on it for another eight weeks. Ckubosch (talk) 12:06, 16 October 2019 (UTC)

#all items for names with Latin script (generally just one per spelling, a few incorrect ones mixed in, e.g. in Cyrillic)
SELECT ?item ?nl ?id
WHERE
{
    ?item wdt:P31 wd:Q101352; wdt:P282 wd:Q8229 .
    MINUS { ?item wdt:P282 ?ws . FILTER(?ws != wd:Q8229 ) }
    ?item wdt:P1705 ?nl
    OPTIONAL { ?item wdt:P6597 ?id }
}


#items with Latin script, lacking "native label" property
SELECT ?item ?l ?nl
WHERE
{
    ?item wdt:P31 wd:Q101352; wdt:P282 wd:Q8229 .
    MINUS { ?item wdt:P282 ?ws . FILTER(?ws != wd:Q8229 ) }
    MINUS { ?item wdt:P1705 ?nl }
    OPTIONAL { ?item rdfs:label ?l . FILTER(lang(?l)="en") }
}


Sample input format for QuickStatements for Digital Dictionary of Surnames in Germany ID (P6597):

Q13406268	P6597	"1"
Q4115189	P6597	"2"

This makes it possible to add a statement to many items in one batch. It won't re-add a statement if it's already present.

To add missing native label (P1705) statements:

Q67501872	P1705	mul:"Warrain"
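
A minimal sketch of how such QuickStatements commands could be generated in bulk, assuming the tab-separated text format with strings in straight double quotes and monolingual text written as language code, colon, quoted string. The helper name quickstatements_line is hypothetical and not part of any tool:

```python
from typing import Optional


def quickstatements_line(item: str, prop: str, value: str,
                         lang: Optional[str] = None) -> str:
    """Build one tab-separated QuickStatements command for a string value.

    Hypothetical helper: plain strings become "value", monolingual text
    becomes lang:"value". Values containing tabs or double quotes are
    rejected rather than escaped, to keep the sketch simple.
    """
    if '"' in value or "\t" in value:
        raise ValueError("value must not contain double quotes or tabs")
    quoted = '"' + value + '"'
    if lang is not None:
        quoted = lang + ":" + quoted
    return "\t".join([item, prop, quoted])


def quickstatements_batch(rows):
    """Join (item, property, value, lang-or-None) tuples into one batch text."""
    return "\n".join(quickstatements_line(*row) for row in rows)
```

For example, quickstatements_batch([("Q13406268", "P6597", "1", None), ("Q67501872", "P1705", "Warrain", "mul")]) produces two lines that can be pasted into the QuickStatements input box.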

Thank you very much! Ckubosch (talk) 13:50, 16 October 2019 (UTC)

Hello everybody, I have linked about 2000 to 3000 surnames from Digital Dictionary of Surnames in Germany ID (P6597) with Wikidata so far. For this I tried different tools like Mix'n'Match and OpenRefine but also compared our data with Wikidata using KNIME and linked some names using Quickstatements. Ckubosch (talk) 13:09, 6 November 2019 (UTC)

@Ckubosch, Julian Jarosch (digicademy): seems to be coming along fine. Supposedly all entries are in Latin script and one could create new items for any we are still missing. If you want some help to finish it up, I can try to do the upload of the remaining ones. If so, please save the list at User:Julian Jarosch (digicademy)/CC0 released data? --- Jura 15:00, 7 December 2019 (UTC)
Hello Jura, yes, there are currently only about 2300 clear matches between Wikidata and DFD left, which we’ll probably upload quite soon. After that, I intend to match and upload newly published DFD entries at irregular intervals. I’ll probably stick with our current approach, which is to match conservatively: only Wikidata items with only Latin script (Q8229) as value of writing system (P282) (based on your query above, thanks!), and with an exactly matching native label (P1705) value. I want to minimise the danger of making erroneous matches on my end. I hope the M’n’M catalogue will also become usable, so that there will be another way of adding statements.
If you want to go into creating new Wikidata items from DFD entries, you could use the list hosted at [4], which has all published DFD entries. It’s already being updated every two weeks; it should be up to date around the third and seventeenth each month. I might also make the workflow we use to match between Wikidata and DFD public; that could be one possible starting point for creating new items.
Julian Jarosch (digicademy) (talk) 12:54, 10 December 2019 (UTC)
As of last Tuesday, there were 9000 items with Digital Dictionary of Surnames in Germany ID (P6597), which should be most of the currently possible matches. On Tuesday I also reduced the number of constraint violations to close to none – except for the requirement of language of work or name (P407). Yesterday, I released the workflow we used to find the matches while preventing a few possible constraint violations. As I said last week, I’ll keep using this workflow (unless someone discovers a flaw in the process).
Magnus Manske has set up a periodical update of the MnM catalog, based on the full list of published DFD entries. (There’s also a list of newly published DFD entries which will be updated every two weeks.) The automatic matching by MnM has increased the number of Digital Dictionary of Surnames in Germany ID (P6597) statements to ~9830, but also introduced some new constraint violations.
From now on and for the foreseeable future, I only intend to match and upload newly published entries, possibly at irregular intervals. I (currently) don’t have plans to create new family name items. I’d love to see the DFD links in use on dewiki, though, but of course that’s a community decision.
Thanks for your (virtual) hospitality towards our project! Julian Jarosch (digicademy) (talk) 16:30, 19 December 2019 (UTC)
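
The conservative matching rule described in this thread (only items whose sole writing system (P282) value is Latin script (Q8229), and only an exactly matching native label (P1705)) could be sketched roughly as follows. This is not the released workflow; the input shapes, the function name and the "exactly one candidate" safeguard are assumptions made for illustration:

```python
LATIN_SCRIPT = "Q8229"  # Latin script, as used in the queries above


def conservative_matches(wikidata_items, dfd_entries):
    """Return (QID, DFD id) pairs for unambiguous exact matches.

    wikidata_items: iterable of dicts with keys 'qid', 'writing_systems'
        (set of QIDs) and 'native_labels' (set of strings); assumed shape.
    dfd_entries: dict mapping a surname spelling to its DFD id; assumed shape.
    """
    matches = []
    for item in wikidata_items:
        # Skip items that have any writing system other than Latin script.
        if set(item["writing_systems"]) != {LATIN_SCRIPT}:
            continue
        # Exact, case-sensitive comparison of native labels only.
        hits = {dfd_entries[label] for label in item["native_labels"]
                if label in dfd_entries}
        # Assumption: keep the item only when exactly one DFD entry matches.
        if len(hits) == 1:
            matches.append((item["qid"], hits.pop()))
    return matches
```

Anything skipped by this filter would be left for manual review, for instance in mix'n'match.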

Tool to help creating family names from items' label

Hello, I want to share with you my script to help creating family names from items' label :
https://gist.github.com/jonadem/434151b95308403f36a980cc30e612cb
It works from the command line and proposes linking the item "John Doe" to the item "Doe (family name)" if the former has no family name property. If "Doe (family name)" doesn't exist, it will first create it. There is room for improvement, but it helped me add 150 family name properties to a list of politicians.
Jona (talk) 22:05, 20 March 2019 (UTC)

Family names containing 'von', 'van', 'de', etc

What is our recommended practice for family names containing a prefix such as 'von', 'van', 'de', etc -- ie a nobiliary particle (Q355505) ?

Looking at actual examples, we seem to be inconsistent:

  • As far as I can see, for people with names prefixed with "de", the value of family name (P734) generally does include the "de". (This search only found 317 such names, though that will be an undercount because of missing native label (P1705) statements.)
  • On the other hand, for names prefixed with "von" and "van", it seems we give values of family name (P734) that generally don't include the "von" or the "van" -- for instance, if one looks at the backlinks for eg Kampen (Q37220938) or Sivers (Q25693108) on Reasonator, one finds links back to people with and without the van or von. (Though this search and this one do find 370 and 113 items for family names with such prefixes; and again that will be an undercount as before.)

Part of the issue may be that we want to do two things with family name (P734).

  • On the one hand, we want to be able to reconstruct the name -- and we want to reconstruct "Charles de Gaulle", not "Charles Gaulle".
  • On the other hand, we want to be able to use P734 for sorting: for example the Template:Wikidata Infobox (Q47517487) on Commons tries to use it to create a DEFAULTSORT (pinging @Mike Peel: for input re this): but most catalogues would index the name as "Gaulle, Charles de" -- eg LoC, BnF, GND, others at VIAF.

It seems to me there is a choice of viable ways forward:

  • Either include the prefix as part of the family name, so have different family name items for "Kampen" and "van Kampen", but include a property on the family-name item to indicate that the name starts with a non-indexing prefix
  • Or only have a single family name item for both "Kampen" and "van Kampen", but include a qualifier in the statement on the person item (eg "family name prefix") for people where the name is prefixed by van (Q1258618).

For myself I can see the attractions of the first approach: a clean different item, with a statement that only needs to be made once -- but I marginally prefer the second approach for practical reasons: firstly, because it is easier to have to add in text to make a full name, than to have to remove or reorganise text to make an index form, the latter requires string operations that can be costly or difficult eg in queries. Secondly, because if we are using LoC, BnF, GND to authoritatively source what part of the name should be considered the family name, users are likely to write statements based on what they see there, ie ignoring the 'von' or 'van', and if both "name" and "de name" point to the same item when people try to add it as a value, then that should result in systematic addition, whereas if there are two different items that could be added, there will always be some users adding the wrong one.

But that's just my instinct, it would be good to know what other people on the project think. Either way I think we're going to need a new property; but it would be good to get a sense of which route is preferred, before opening a formal property proposal.

One further note. For "Van" (with a capital V) in American surnames, the indexing convention is not to regard this as a prefix, but as part of the name itself -- eg David Van Nostrand (Q5240604) -> LoC, VIAF; John Hasbrouck Van Vleck (Q193655) -> LoC, GND, VIAF. So on the first model, an American capital "Van" name would need a different item (with no non-indexing prefix statement), compared to an item for a Dutch small "van" name. On the second model: David Van Nostrand (Q5240604) -> family name (P734) -> Van Nostrand (Q7913548), but Nico van Kampen (Q1284063) -> family name (P734) -> Kampen (Q37220938) with qualifier "family name prefix" -> van (Q1258618).

As I said, both models seem workable; we don't seem to consistently jump one way or the other at the moment; my own slight preference would be for the second one, as easier to work with for the indexing; but I'd like to hear wider thoughts, and from people with sharp-end implementation experience. Jheald (talk) 18:58, 27 April 2019 (UTC)
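
To illustrate the second model, here is a small sketch of how a consumer could build both the display name and the index form from a family name value plus an optional prefix qualifier. The function names and the idea of passing the prefix as a plain string are assumptions for illustration, not an existing property or API:

```python
from typing import Optional


def display_name(given: str, family: str, prefix: Optional[str] = None) -> str:
    """Reconstruct the full name, inserting the prefix before the family name."""
    parts = [given, prefix, family] if prefix else [given, family]
    return " ".join(parts)


def index_form(given: str, family: str, prefix: Optional[str] = None) -> str:
    """Build a catalogue-style sort form such as 'Gaulle, Charles de'.

    The prefix (if any) is treated as non-indexing and moved after the
    given name; a name like 'Van Nostrand' is passed whole as the family
    name and therefore sorts under V.
    """
    tail = f"{given} {prefix}" if prefix else given
    return f"{family}, {tail}"
```

With these definitions, ("Nico", "Kampen", "van") yields "Nico van Kampen" and "Kampen, Nico van", while ("David", "Van Nostrand") yields "David Van Nostrand" and "Van Nostrand, David". Only concatenation is needed, which is the practical advantage argued for above.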

@Harmonia Amanda: Andrew Gray thought you might be a good person to ask for your recommendations on this. Jheald (talk) 21:17, 2 May 2019 (UTC)
  • The common element in your sample seems to be the " ". Not sure why that should lead us to mix them, even if the US census ignores cases and aggregates them for statistical purposes. In general, if a person has two surnames, add items. If the surname consists of several parts, make an item for the entire name. You may describe the parts on the item. If sometimes one version is used, sometimes another, add several values to P734. --- Jura 13:51, 23 June 2019 (UTC)

Jean opposite of Jean?

Wikidata:WikiProject Names/Help, Jean (Q4160311) is opposite of Jean (Q7521081)? No, they are not opposite. They are just two different items. If they were opposite, then anyone not Q4160311 is Q7521081, and vice versa.--Roy17 (talk) 14:50, 29 May 2019 (UTC)

Agree. There's given name version for other gender (P1560) for this. --Infovarius (talk) 13:50, 30 May 2019 (UTC)
Hm, Jura1, and what is given name used for females or males compared to the same used for the other gender (Q21012914) for? --Infovarius (talk) 13:52, 30 May 2019 (UTC)
  • The names are a problem for automatic matching. If you don't need the statement, please don't use it. P1560 is for same names in the same language (e.g. Jean (fr) → Jeanne (fr) ) --- Jura 13:51, 23 June 2019 (UTC)

Latvian

See #Latvian and proper nouns above. --- Jura 13:51, 23 June 2019 (UTC)

LC on family names

SELECT ?item ?itemLabel ?value
{
	?item wdt:P244 ?value; wdt:P31 wd:Q101352
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en"  }    
}
LIMIT 1000


Not sure what to think of it.

  • Currently there are some 300 items for family names with Library of Congress authority ID (P244).
  • The ones I checked were labelled there "<family name> family".
  • While this may correspond to a specific family with that surname from a given place, I don't think the notices go beyond the mere name. This would be an argument for adding them to family name items as opposed to adding them to items for specific families.

@Thisismattmiller: who added some of them. --- Jura 16:52, 4 July 2019 (UTC)

In terms of what they mean in context of the Library of Congress Subject Heading system (these are all subject headings):

"A heading needed for use as a subject should be established in LCSH and tagged 100. Once established, the heading is used for works about all families with that name and is not specific to a particular family in a particular place or time" source

So they are generic family name identifiers. We do have specific family headings in our name authority file but only when the family name is involved with the creation of a work. Thisismattmiller (talk) 17:38, 11 July 2019 (UTC)

Irish names

Over the last couple of days, I tried to work on Irish names, here meaning names of people with country of citizenship (P27)=Republic of Ireland (Q27).

A few summary reports (most frequent):

Also names with prefixes (any P27):

Lists including P734:

A few todo lists at (also including P19/P20/P119, if >1400):

Cross-referencing of items for Irish and anglicized forms still needs to be done.

Some stats:

Suggestions for additions are welcome. --- Jura 13:17, 20 July 2019 (UTC)

family names: 2,000,000

count P734: approx. 2,000,000
count P734 with P735: approx. 1,400,000 (= approx. 600,000 without P735)
count P735: approx. 3,300,000
distinct P734 values: approx. 200,000
most frequent P734 values: Li (33000), Wang, Zhang, Chen, Liu, Yang, Zhao, Wu, Smith (9500), Huang (9400)
most recent family names: report
sampling by P734 value: reports at Arnulf, Crawford, Wilson
featured items: Crawford (Q20731004)
P734 values by country of citizenship: Ireland
P734 available per Wikipedia: itwiki (43%)
Infoboxes making use of P734: Commons (for categorization and sorting)

Looks like we are almost at 2 million statements with family name (P734). Interestingly, we start getting more items that have P734, but no P735 yet.

Maybe a good occasion to produce more statistics on P734. Please add to the table to the right. --- Jura 11:33, 24 August 2019 (UTC)

Some more statistics at Property_talk:P734/numbers/values. --- Jura 15:00, 7 December 2019 (UTC)

Interesting discussion about Slavic family names

See Project chat, section "Gender version of the family name (P734)". Klaas `Z4␟` V 07:51, 23 September 2019 (UTC)

Name frequency

Are there any suggestions for how the frequency of a name (in a given country) should be described? I've seen the frequency given indirectly via attested in (P5323) but that does not seem to be the primary function of that property. /Lokal_Profil 19:07, 1 October 2019 (UTC)

It's the solution we are currently using. Before we used "significant event". Please avoid adding it in P31. --- Jura 00:19, 24 November 2019 (UTC)

Help with Sybil/Sybilla

I understand that name items need to be for a specific string, so Sybil (Q4851022) and new item Sybilla (Q70630281) should be separate - but what should I do with the labels and sitelinks for "Sybilla" that are on "Sybil"? - PKM (talk) 20:24, 12 October 2019 (UTC)

Never mind, Jura1 sorted these out. I’ll know for next time. - PKM (talk) 00:49, 13 October 2019 (UTC)

For some reason, new editors tend to try to add their name there, bork the page and then disappear.

I requested semi-protection of the page to avoid that. New editors would have to do some edits (50 or 100 I think) to become "autoconfirmed" before they can sign up. I don't think we would lose many contributors by doing that.

In the discussion with @1997kB: and @KlaasZ4usV: on WD:AN, it was preferred to bring this up here first. --- Jura 00:19, 24 November 2019 (UTC)

Obviously, it would still be possible to do a protected page edit request ;) --- Jura 18:05, 1 December 2019 (UTC)

German family names

This finds a series of recently created items that still need family name (P734). I'm not really sure what to do with them.

The answer probably varies depending on whether the person is a DE/AT/CH/LU/FL/other-national and when they were born. Maybe someone wants to work on these.

Items for other non-British people may have the same problem.

BTW please bear in mind the values used for P734 are different from the ones for family (P53). --- Jura 18:05, 1 December 2019 (UTC)


"van" and "Van" prefixes

Recently a new property was created for family name prefixes (Property:P7377): such as "van", "Van", "de", "von", etc.

It's called "tussenvoegsel", but it wasn't specifically proposed or created for Dutch names. @1Veertje: who proposed it.

Some names are spelled with an uppercase "V", others with a lowercase "v". A specific name item should only use one version.

Most items just use one version and I added P7377 to some of these, but a few other items still need cleanup, e.g. native label (P1705) doesn't match the version used in the label, or some of the labels are with "V" and others with "v". This finds some. --- Jura 18:05, 1 December 2019 (UTC)

The uppercasing of Van is a bit tricky, since it depends on context. A surname is written like "Van der Waals", but in the Netherlands, with a given name it's lower case as in "Johannes van der Waals". In Belgium, it's always capitalized, as in Jean-Claude Van Damme. Ghouston (talk) 10:29, 4 January 2020 (UTC)
On an item like van der Velden (Q7913878), I think it should have a capital V, since the surname alone would always be written that way. Only when combined with first names or initials would it be lower-case V, but then I don't know how to distinguish Netherlands / Belgian conventions if you want to be able to construct a full name from parts. It can't be done on the surname itself, since the same surname may be used with either convention depending on who is using it, but we can probably assume that there's a preferred style for any given full name. Ghouston (talk) 21:25, 11 February 2020 (UTC)
Thinking about it some more, constructing a name from parts won't work in general: there are too many pitfalls if you want to get the way that a person's name is usually written. E.g., somebody may be generally known by initials like J. K. Rowling (Q34660), another may always use a nickname (like Steve when their legal name is Stephen), somebody else may prefer a middle name, others are stage names etc. In general, you'd use the label for the commonly used form, not try to construct it. I'm not even sure what's supposed to go into the forename and surname properties: the parts of the commonly used form, so that they can be used to construct a sort order? But to get back on topic, on items like van der Velden (Q7913878), use a capital V. Ghouston (talk) 21:43, 11 February 2020 (UTC)
Spelling Naamdragers 2008
Vanderlinden 2.677
Van der Linden 1.723
Vander Linden 507
Van Der Linden 329
van der Linden 308
Van Derlinden 12
Van der linden 8
VanderLinden 1
How would you accommodate nl:Van_der_Linden#België (table above)?
It should be possible to parse "name in native language" (and a few others) with P735/P734 etc. --- Jura 13:20, 12 February 2020 (UTC)
Interesting. I'd say "Van Der Linden" and "van der Linden" are the same name, but the full names are written with different conventions. The surname on its own is always written "Van Der Linden". So the capitalization is a property of the full name, not of the surname alone. The spelling variants I'd treat as different names. I'm not sure if the variant Vanderlinden is considered to have a tussenvoegsel or not: it just depends on whether you want the surname sorted under "V" or "L" when ignoring the tussenvoegsel. Ghouston (talk) 21:01, 12 February 2020 (UTC)
Hmm, that's not quite right, since "Van der linden" can be written like that, with the lower case "d", and then the case of the "d" never changes. That implies that "Van Der Linden" and "Van der linden" could be treated as different surnames if desired, but they are identical to "van Der Linden" and "van der linden" respectively. It's tricky, and probably a Falsehoods Programmers Believe About Names situation. Ghouston (talk) 21:10, 12 February 2020 (UTC)
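
As a rough illustration of the convention described in this thread (the surname on its own is capitalized; in the Netherlands the leading prefix is lower-cased after a given name, in Belgium it stays as written), here is a minimal sketch. The prefix list, the function name and the "NL"/"BE" codes are assumptions, and as noted above real names have many exceptions this does not cover:

```python
# Assumed, incomplete set of leading prefixes (tussenvoegsels); illustration only.
LEADING_PREFIXES = {"van", "de", "der", "den", "ten", "ter", "te"}


def name_with_given(given: str, surname: str, convention: str) -> str:
    """Combine a given name with a surname stored in its stand-alone form.

    convention: "NL" lower-cases the first letter of a leading prefix when
    a given name precedes it; "BE" keeps the surname exactly as stored.
    """
    if convention not in ("NL", "BE"):
        raise ValueError("convention must be 'NL' or 'BE'")
    first_word = surname.split(" ", 1)[0]
    if convention == "NL" and first_word.lower() in LEADING_PREFIXES:
        # Only the first letter changes; the rest of the surname is untouched,
        # so "Van der Waals" becomes "van der Waals".
        surname = surname[0].lower() + surname[1:]
    return f"{given} {surname}"
```

The case of later words (the "d" in "Van der linden" versus "Van Der Linden") is deliberately left alone, in line with the observation that it never changes.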

(enwiki) sitelinks for surnames

At User_talk:Tagishsimon#Wikimedia_list_of_persons_by_surname_(P734)_(Q58408484), I had a short discussion with @Moebeus: about a new type of items where some sitelinks to enwiki are placed.

Feel free to comment here or there about it. --- Jura 15:00, 7 December 2019 (UTC)

Compound names (2)

Compound names were mentioned above, but what is the justification for David John (Q65556213)? It's just a first name and a middle name, isn't it? @Moebeus:, who created it. Ghouston (talk) 10:23, 4 January 2020 (UTC)

Hi there! I think maybe there are some cultural differences here (or maybe I'm just ignorant), to me "David John" is a double given name, or a common pairing, much like José Antonio (Q6291381). If this is against some guideline it was unintentional and I'd happily adhere to the agreed upon consensus of course. Moebeus (talk) 11:04, 4 January 2020 (UTC)
I think it may be just a 2nd order effect of two very common names "David" and "John". To me, it would never be used as a single name, as in "Good morning David John, how are you today?" like you would find with a true compound name. Ghouston (talk) 11:23, 4 January 2020 (UTC)
Oh, I'm pretty sure David John's mother would address him as "David John" whenever he was naughty ;-) Some people choose to use both their names, some people just one of them, some use initials. That's really up to the name bearer him/herself, and not for us or anybody else to decide? If in doubt, and so not to be rude, I would ask David John: "What should I call you, David or John or do you use both?". At least that's what I grew up with. Moebeus (talk) 14:26, 4 January 2020 (UTC)
But that would apply to any name combination. I suppose we could ask, but that wouldn't be a decent reference for Wikidata. If you can find in documents that a person is generally addressed as "David John" in a first name context, then fine. That seems to be the case with names like Billy Ray (Q63093190). But if we add both names as given name (Q202444) with series ordinal (P1545), it's not too hard to find all of the "David John" instances with a query. Ghouston (talk) 23:08, 4 January 2020 (UTC)
I'd favor adding the ordinal qualifier over making such combined items.--- Jura 19:37, 2 February 2020 (UTC)
Maybe it's worth mentioning that the items as such can be useful, e.g. Q63093237#P1449. --- Jura 14:11, 9 February 2020 (UTC)
  • Guys, please do not create “compound names” for Portuguese-language person names, as there is no need: We almost all have two given names, yes, (António Fernando here) but that’s 99.99% of the time a free-style combo of regular individual given names. (I promise I will come here and take the day off to create the needed items for the remaining 0.01%, once the clutter is cleared.) Spanish and Italian language names may follow exactly the same pattern, but better have that said by a native. See also Topic:Vgjwoar2l4vv10co, anyway. Thanks. Tuvalkin (talk) 04:12, 11 February 2020 (UTC)

Use of mix'n'match (family names Austria)

Interesting use at https://tools.wmflabs.org/mix-n-match/#/catalog/2890 to check for family name items for Austrian politicians.--- Jura 19:37, 2 February 2020 (UTC)


Categorization at Commons

The categorization of people categories by name is somewhat sub-optimal. Although this isn't directly a problem or a concern for Wikidata, it sometimes leads to contributors doing edits here that are meant to be useful there, but break things around here.

Accordingly, I made a few suggestions at: c:Template_talk:Wikidata_Infobox#Auto-categorization_for_names.

This should end categorization by English label of P735/P734-values at Commons. --- Jura 19:37, 2 February 2020 (UTC)


Infobox at Commons

A few suggestions to improve things at: c:Template_talk:Wikidata_Infobox --- Jura 19:37, 2 February 2020 (UTC)


100,000 "John"

There are now 100,000 items for people with John (Q4925477) as value for given name (P735). A few queries at Talk:Q4925477.

I noticed a recent import had saturated a check I did as the number of items increased by 50%. Fixed and P735 added.

Please suggest queries to add to Talk:Q4925477. Obviously, they could also work for any other given name. --- Jura 19:37, 2 February 2020 (UTC)


  • I changed around the layout and added, notably, the said to be the same as (P460) values. Most queries are accessible from any name (click on the links).
What do you think of it? --- Jura 13:20, 12 February 2020 (UTC)

@Moebeus: --- Jura 13:21, 12 February 2020 (UTC)

  • I'm a big fan! An additional query/report I would love to see is "This <name> forms part of these <compound/double name>s". Like John: John-David, John-Thomas, John-John, etc. Moebeus (talk) 09:55, 19 February 2020 (UTC)
    • ✓ Done @Moebeus: Thanks for your feedback. I wasn't really sure where to add it. Ideally it would be an enumeration above "related", but LUA would need to have them listed with "has part" (not sure if we really want to do that). So currently it's a query in the count section. Jean might be a good sample. I'm less convinced by the ones shown at "John", but that's another debate. --- Jura 10:58, 19 February 2020 (UTC)
Ash Crow
Dereckson
Harmonia Amanda
Hsarrazin
Jura
Чаховіч Уладзіслаў
Joxemai
Place Clichy
Branthecan
Azertus
Jon Harald Søby
PKM
Pmt
Sight Contamination
MaksOttoVonStirlitz
BeatrixBelibaste
Moebeus
Dcflyer
Looniverse
Aya Reyad
Infovarius
Tris T7
Klaas 'Z4us' van B. V
Deborahjay
Bruno Biondi
ZI Jony
Laddo
Da Dapper Don
Data Gamer
Luca favorido
The Sir of Data Analytics
Skim
E4024
JhowieNitnek
Envlh
Susanna Giaccai
Epìdosis
Aluxosm
Dnshitobu
Ruky Wunpini
Balû
★Trekker

Notified participants of WikiProject Names

I rearranged the queries. Please test the talk pages of a few given name items (using English interface language). Samples:

What do you think of it? Can we announce it to other users? --- Jura 08:41, 19 February 2020 (UTC)

@Jura1: Thanks for this. Since I have my preferences set to French, I had never seen this template before. It is really interesting. Where is the documentation for this template and how does it work? I would be happy to translate it into French if needed. PAC2 (talk) 22:03, 13 December 2020 (UTC)

Russian descriptions for items

Please see Wikidata:Bot_requests#Fix_ru_description_for_given_names.

I fixed some of the more frequently used items. --- Jura 14:11, 9 February 2020 (UTC)


Pronunciation audio

Some time ago, I added more pronunciation audio (P443)-statements to items and made some stats at Property_talk:P443#Statistics_by_language_(items) (last column "nameitems" is about given name/family name items).

If there are more in German than other languages, that might be due to the categorization of the files at Commons. I found them easier to list than those for other languages.

Maybe you manage to identify more. English and French probably have more. --- Jura 14:11, 9 February 2020 (UTC)

More about Dutch and French audio files. --- Jura 13:20, 12 February 2020 (UTC)

Create items about undifferentiated Japanese names?

Some sources list only the kanji form or only the kana/transliterated form of somebody's name. I propose to create new items for undifferentiated forms of Japanese names. For example, for Itō (Q6500676) we would create two new items, for "伊藤" and "いとう". @Suisui:--GZWDer (talk) 01:57, 23 March 2020 (UTC)

Ahh, yes. I think we need these items too. I had mistakenly assumed that 伊藤 already had an item for いとう. I just corrected a Japanese surname incorrectly given by a bot. Sorry about that. Many Wikipedias have articles about Japanese given names/family names under their name in kana (P1814) (most of those are disambiguation pages). So creating kana given name/family name items is the way it should be done, I think. Thanks. --Suisui (talk) 14:45, 23 March 2020 (UTC)
That sounds like within the realm of lexemes, perhaps (or maybe both)? English Wiktionary already has wikt:いとう, wikt:伊藤, wikt:伊東 etc. Multilingual names might be an issue (#Shouldn't names be lexemes instead of items?), but this particular case doesn't seem to involve languages other than Japanese. whym (talk) 12:47, 24 March 2020 (UTC)
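A minimal sketch of how the form given by a source could be sorted into "kanji only" or "kana only" before choosing between the proposed undifferentiated items. The function name `script_of` is hypothetical, and relying on Unicode character names is an assumption made for illustration, not an existing project tool.

```python
import unicodedata


def script_of(name: str) -> str:
    """Classify a name string as 'kanji', 'hiragana', 'katakana', 'other' or 'mixed'."""
    scripts = set()
    for ch in name:
        # unicodedata.name() returns e.g. "CJK UNIFIED IDEOGRAPH-4F0A" or "HIRAGANA LETTER I"
        uname = unicodedata.name(ch, "")
        if uname.startswith("CJK UNIFIED IDEOGRAPH"):
            scripts.add("kanji")
        elif uname.startswith("HIRAGANA"):
            scripts.add("hiragana")
        elif uname.startswith("KATAKANA"):
            scripts.add("katakana")
        else:
            scripts.add("other")
    # Exactly one script seen: report it. Anything else (including empty input) is "mixed".
    return scripts.pop() if len(scripts) == 1 else "mixed"


print(script_of("伊藤"))    # kanji form
print(script_of("いとう"))  # kana form
```

A source giving only "伊藤" would then point to the kanji item, and a source giving only "いとう" to the kana item, without guessing the other form.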

Different items for Spanish / Portuguese

Is this normal and desirable - José Francisco (Q54210754), José Francisco (Q52240811)? I'd be inclined to consider them the same name, and merge. Ghouston (talk) 08:23, 1 April 2020 (UTC)

They have different translations (transcription? transliteration?) into Russian... --Infovarius (talk) 19:55, 21 June 2020 (UTC)
Do they? Neither has a Russian label, and Google translate suggests Хосе Франциско either way. Ghouston (talk) 03:54, 22 June 2020 (UTC)
I've added both labels. This is a subtle thing where one can't trust GT. --Infovarius (talk) 21:52, 23 June 2020 (UTC)

Hello. It seems to me that there is something wrong with the existence of those two different elements, partly due to the way the Roman tria nomina evolved over time. Only Flavius (Q32979165) (as a nomen (Q3760158)) should exist. The fact that we find it in first position in the names of a number of Romans, even famous ones such as Constantine the Great (Q8413), is not proof of its validity as a praenomen (Q1240901), but of the decay in usage of the latter, which had become largely redundant (more on it at [5], p. 130 sqq.). In fact, most if not all of the items tagged with Q26196891 are from the (very) Late Roman Empire. I propose that either Q26196891 be deleted or that the two elements be merged so as to keep only Flavius (Q32979165). --Jahl de Vautban (talk) 10:05, 11 April 2020 (UTC)

Help for Greek names

I am confused. I have some questions about given names and family names. I am interested in Greek names.

1) Does each Greek given name have to have its own item, even if the same name exists in another language, but of course in a different writing system?

2) All Greek given names must have:

3) Nikolas (Q16423029) is a name that also exists in the Greek language (Νικόλας). Nikolas (Q16423029) has native label (P1705) -> Nicholas (multiple languages). Can I use it for a Greek person with the name Νικόλας? ( Nikolas (Q16423029) has writing system (P282) -> Latin script (Q8229) )

4) If the answer to number 3 is that I have to create a new item for Νικόλας (we may already have one; I am just using the name as an example), do I also have to create new items for the short name Νίκος? For some persons the formal given name is Νικόλας, but all their life they were called Νίκος (and that is what the sources call them). And if yes, how do I connect these items (Νικόλας and Νίκος)? (Also, Νίκος can be a formal given name itself.)

5) Some short names like Άκης may derive from Χρήστος, Δημήτρης, Μιχάλης etc., and we may not know from which one. Should I use the item Άκης for all such persons?

6) Nikolas (Q16423029) has writing system (P282) -> Latin script (Q8229), and language of work or name (P407) -> Dutch (Q7411) and English (Q1860). Is that acceptable because both languages use Latin script? For names using the Greek alphabet, should language of work or name (P407) always be Modern Greek (Q36510) ONLY, because only the Greek language uses the Greek alphabet?

7) If I have to create an item Νικόλας, should I connect it with Nikolas (Q16423029)? The first one is written in the Greek alphabet, the second one in Latin script. And should I connect them with said to be the same as (P460)? Or can I use P460 only to connect given names with the same writing system?

8) What about the labels? Sometimes I see "(given name)" or "male given name (Νικόλας)". Why? Or do we have to write something different for languages that use the same writing system as the given name?

9) A person's item has name in native language (P1559) in Greek. Then the given name (P735) value must also be a given name whose writing system is the Greek alphabet. Correct?

@Jura1: I pinged you as you told me. Thanks. Data Gamer play 16:23, 10 July 2020 (UTC)

Usually we start adding P735 based on what is in name in native language (P1559) (generally identical to the label in that language). This can be a short form of the name. Additional P735 can be added based on other forms.
The answer to #1 is yes. Much might not have been done for Greek names, so initially you might need to create a few. This is fairly easy with QuickStatements. --- Jura 16:48, 10 July 2020 (UTC)

I am still confused. Data Gamer play 16:54, 10 July 2020 (UTC)

@Jura1: Please see Nikolaos (Q1496652). Are the properties correct? It has:

Data Gamer play 18:50, 10 July 2020 (UTC)

1) yes and yes.
2) yes but I doubt which is better: Modern Greek (Q36510) or Greek (Q9129)...
3) mostly not. It is intended for persons whose native name is "Nikolas" in some Latin-script language. But sometimes I add it to Greek persons too, if I can't tell exactly what their native language is, or if they have a long history in Latin-script countries (and are probably known mostly in such countries). But this is debatable.
4) yes, create items for both Greek names (full and "diminutive"). Here I have doubts about the diminutive (e.g. I don't create items for diminutive Russian names, as they are only forms of full names and can't be regarded as a separate notion). But if you say that it can be used as a formal name itself, then I suppose we should. We should link them with said to be the same as (P460) as usual. But we can also add
⟨ Νίκος ⟩ instance of (P31) ⟨ diminutive (Q108709) ⟩
of (P642) ⟨ Νικόλας ⟩
5) This example just convinces me that we have to create an item and use it as a P735 value. And we can link it to the full names as in the previous point.
6) Just one thing: Modern Greek is not the only language that uses the Greek alphabet. We also have the big and important Ancient Greek (Q35497) and some other subclasses of Hellenic (Q2042538).
7) of course with P460. But you can also add (one or several variants of) transliteration or transcription (P2440) into other writing systems to both items.
8) I advise you to add importScript('User:Harmonia_Amanda/namescript.js'); to your "common.js". It does all the work with one click.
9) yes.
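Since QuickStatements was suggested above for creating the missing Greek items, here is a rough sketch of how a batch for one name could be generated. It assumes the tab-separated V1 command format (CREATE / LAST lines, `Lxx` for labels, `xx:"text"` for monolingual text), which should be checked against the tool's own help page before running anything. The helper name and its parameters are hypothetical; the property and item IDs in the usage line are the ones mentioned in this thread.

```python
def quickstatements_for_given_name(label, lang_code, p31_qid, language_qid, script_qid=None):
    """Build QuickStatements V1-style commands for one new given name item.

    Hypothetical helper: the V1 syntax used here is an assumption to verify
    against the QuickStatements documentation.
    """
    rows = [
        ["CREATE"],
        ["LAST", f"L{lang_code}", f'"{label}"'],             # label in the name's own language
        ["LAST", "P31", p31_qid],                            # instance of
        ["LAST", "P1705", f'{lang_code}:"{label}"'],         # native label (monolingual text)
        ["LAST", "P407", language_qid],                      # language of work or name
    ]
    if script_qid:
        rows.append(["LAST", "P282", script_qid])            # writing system, if supplied
    return "\n".join("\t".join(row) for row in rows)


# Q12308941 = male given name, Q36510 = Modern Greek (both cited in this thread)
print(quickstatements_for_given_name("Νικόλας", "el", "Q12308941", "Q36510"))
```

The said to be the same as (P460) link between Νικόλας and Νίκος is left out on purpose: it needs the QIDs of both items, which only exist after creation.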

I've restored Nikolaos (Q1496652) as a Latin name. We should use Nikolaos (Q74695481) for Greek persons. This means we should change P735 for almost all https://w.wiki/WgK --Infovarius (talk) 17:01, 11 July 2020 (UTC)

Thanks! Data Gamer play 14:21, 12 July 2020 (UTC)

Safia/Safiya wrong merge

i think this merge was wrong. 2 spelling variants, and name and disambiguation are combined in one item now. can someone check and repair please? 2003:E5:3706:5600:447F:F4EB:B92:858F 01:33, 4 August 2020 (UTC)

see also Safiya (Q20001009). 2003:E5:3706:5600:447F:F4EB:B92:858F 01:39, 4 August 2020 (UTC)

✓ Done

Merge these male given names?

Are these two different data sets or should they be merged? Abram (Q323415) -> Abram (Q98146955). --HarryNº2 (talk) 19:32, 8 August 2020 (UTC)

Please no. --Infovarius (talk) 12:53, 11 August 2020 (UTC)
Why not? The Russian label in the data set for Abram (Q323415) is Абрам, what is the difference between the data set Abram (Q323415) (male given name) and Abram (Q98146955) (male given name (Абрам))? --HarryNº2 (talk) 18:01, 11 August 2020 (UTC)
native label statements are different. What the labels are in various languages is less relevant. --- Jura 07:50, 12 August 2020 (UTC)

Different items for given name, nickname and family name

Hello, I was wondering why there are different items for given name, nickname and family name. I understand why you would have separate items for names of people and disambiguation articles, but not why you would have a separate item for each of given name, nickname and family name. This could cause issues, since name articles cover all uses of that name. So having multiple items means there is a risk that the different language versions will not all be linked to the same item. The only reason I can see is if you wanted to cover the history of a surname in a separate article. Mikalagrand (talk) 12:10, 9 August 2020 (UTC)

Wikipedia seems to have plenty of separate articles, e.g., en:John (given name) and en:John (surname). If there's a Wikipedia article that discusses a name in both contexts, it would need to be linked to a similar item in Wikidata, although I can't find any examples off hand. Ghouston (talk) 12:29, 9 August 2020 (UTC)
@Mikalagrand: Hi. You appear to be considering this from the perspective of a Wikipedia user. Wikidata is a source of structured data and has to serve all Wikimedia projects, not just Wikipedia. As an example, merging given name and family name will break some category structures on Commons, which rely on the separate items. From Hill To Shore (talk) 12:40, 9 August 2020 (UTC)
Okay, thanks for the explanation. Mikalagrand (talk) 13:11, 9 August 2020 (UTC)

Austrian noble titles (pre 1919)

This recent revert by User:HarryNº2 for Tuma von Waldkampf (Q98361068): I’m fully aware of the rule about spelling variants being considered different family names. However, I consider it doubtful that this reasoning applies to most Austrian noble titles (pre 1919). Tuma von Waldkampf and Tuma-Waldkampf (and in this case Tuma v. Waldkampf) are not spelling variants of the same family name in the sense that one member of the family might be called this way and another one that way. They are the same family name, which could be used interchangeably by the same person depending on context and personal choice. The rule that each string should have a distinct item does not make any sense here. --Emu (talk) 23:44, 13 August 2020 (UTC)

What would the lemma be if an article were created on it? In addition, Tuma-Waldkampf is a double surname; v. is an abbreviation of von. Both belong in the person article, not in the data record for the surname. The linked articles Anton Tuma von Waldkampf (Q98361356) and Marianne Tuma von Waldkampf (Q98359766) don't even contain the name variants in the field "Also known as". --HarryNº2 (talk) 00:10, 14 August 2020 (UTC)

Defective items for family names taken from 2010 U.S. Census data

It appears that all of the names listed in the 2010 United States Census surname index (Q92953148) were added in 2017 by @GZWDer (flood):. There are a number of issues that were inherited from how the U.S. Census Bureau produced that particular data set. The technical documentation can be found here [6], but in short, all spaces and punctuation were removed and any surname that was longer than 15 letters was truncated (this is not mentioned in the documentation and may come from a limit on the census form). Also, no information about capitalization or diacritic marks was retained. This led to the addition of such items as Lloydjones (Q37507662) instead of "Lloyd Jones" or "Lloyd-Jones", Martinezlopez (Q37536221) instead of "Martínez López", and Venkatasubraman (Q37064759) instead of "Venkatasubramanian".

I'm not sure of the best way to go about correcting these issues here on Wikidata. A lot of these could be like van der Wal and variants (Q65557890), where there exist several different variations of spacing, hyphenation, and/or capitalization. I also don't know if the names that are obviously truncated and are unlikely to have ever been used by any individual (such as Sanchezrodrigu (Q37522941)) are worth retaining in any form or if they should be deleted outright. Any thoughts on how to handle this would be much appreciated. --Quesotiotyo (talk) 03:27, 17 September 2020 (UTC)
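To make the described data loss concrete, here is a small sketch that mimics the processing as summarised above: diacritics dropped, spaces and punctuation removed, truncation to 15 letters. It is an illustration based only on that summary, not the Census Bureau's actual procedure, and the function names are hypothetical.

```python
import unicodedata
from collections import defaultdict

CENSUS_MAX_LEN = 15  # truncation length as reported in the comment above


def census_style(surname: str, max_len: int = CENSUS_MAX_LEN) -> str:
    """Approximate the census form of a surname (assumed rules, see lead-in)."""
    # Decompose accented letters so that "í" becomes "i" plus a combining mark
    decomposed = unicodedata.normalize("NFKD", surname)
    # Keep plain ASCII letters only: drops combining marks, spaces, hyphens, apostrophes
    letters = [c for c in decomposed if c.isascii() and c.isalpha()]
    return "".join(letters)[:max_len].capitalize()


def group_by_census_form(candidates):
    """Group real spellings under the census-style string they collapse to."""
    groups = defaultdict(list)
    for name in candidates:
        groups[census_style(name)].append(name)
    return dict(groups)


print(group_by_census_form(["Lloyd Jones", "Lloyd-Jones", "Martínez López", "Venkatasubramanian"]))
```

Grouping existing family name items this way could help find which real spellings a collapsed item such as Lloydjones (Q37507662) might correspond to, though any match would still need manual checking.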

Borbone-Parma

Can you please take a look at this data set: Borbone-Parma (Q99519032). I think something got mixed up there. --HarryNº2 (talk) 17:20, 22 September 2020 (UTC)

Using P527 for Japanese names

I thought it might be a good idea to use has part(s) (P527) on Japanese surnames. See [Tanaka]. This would allow us to query which characters are in the names, or how common Japanese names with 3 characters are, for example. I was hoping to hear what the community thinks of this approach. NMaia (talk) 15:11, 13 October 2020 (UTC)
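A short sketch of the kind of analysis that per-character has part(s) (P527) values would allow. The sample list and the function names are made up for illustration; on Wikidata the parts would be items for the individual characters rather than plain strings.

```python
from collections import Counter


def parts_of(name: str) -> list[str]:
    """Split a surname into its characters (one entry per Unicode code point)."""
    return list(name)


def length_distribution(names) -> Counter:
    """Count how many names have 1, 2, 3, ... characters."""
    return Counter(len(name) for name in names)


# Hypothetical sample of surnames, not taken from any dataset
sample = ["田中", "伊藤", "佐々木", "長谷川"]

print(parts_of("田中"))
print(length_distribution(sample))
```

Splitting by code point is a simplification: rare characters written with variation selectors would need extra handling.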


Use of cli for item creation

At latinscriptfemalegivenname.js (for female given names in Latin script), I made a sample template available.

It's for use with WikibaseJS-cli. See there how to install the tool/use the template.

Clarenza (Q100967811) was made with it. --- Jura 08:22, 28 October 2020 (UTC)

It created Orenzio (Q104411155). --- Jura 13:26, 21 December 2020 (UTC)


With kunya (P8927), it seems that finally Arabic names will be expanded. Good news! Thanks for looking into this. --- Jura 11:04, 7 December 2020 (UTC)


@Abu aamir, Meno25, Michel Bakni:

Notified participants of WikiProject Names --- Jura 11:04, 7 December 2020 (UTC)

How to expand in 2021 ?

What should we try to add or expand in 2021?

Notified participants of WikiProject Names --- Jura 11:04, 7 December 2020 (UTC)

New template to explore given names

I'm often curious of a few queries about given names. A few weeks ago, I had the idea to create a sample notebook to explore a given name (User:PAC2/Portrait d'un prénom).

Then I had another idea: create a simple template with predefined queries for given names, which can be used on the talk pages of items.

The outcome is here Template:Notebook Given name and is tested on Ada (Q346047). PAC2 (talk) 22:25, 11 December 2020 (UTC) PAC2 (talk) 22:31, 11 December 2020 (UTC)

@PAC2: I think it is a very good template. I suggest adding it to the talk pages of all given name items by bot. Data Gamer play 05:53, 1 January 2021 (UTC)

@Data Gamer: thanks for the suggestion.

In the meantime, I discovered {{TP given name}} developed by Jura1 which is more complete.

{{TP given name}} is deployed automatically in the header of talk pages using MediaWiki_talk:Talkpageheader (available in some languages only).

Another alternative would be to deploy it with {{Item documentation}}. I don't know which is the best strategy.

Of course, feel free to deploy {{Notebook Given name}} if you find it useful. PAC2 (talk) 16:13, 1 January 2021 (UTC)

I would like to have {{TP given name}} available in Greek language. Data Gamer play 18:30, 1 January 2021 (UTC)

Reports for Italian given names (or another language)

Please see Wikidata:WikiProject Names/reports/given names/Italian. --- Jura 13:26, 21 December 2020 (UTC)

Very useful. I wanted to do the same with Greek names, but I don't understand how it can be done. Data Gamer play 05:54, 1 January 2021 (UTC)

I discovered a few issues (see Topic:W02rd48hv9aya84v), so I ended up creating subpages at Special:PrefixIndex/Wikidata:WikiProject Names/reports/given names/Italian.
The main problem is that it's hard to debug without knowing which one doesn't work. The second is that queries with zero results stop the entire page from updating (meaning I left a few items there).
If you want, I can try to set it up for you. --- Jura 16:31, 1 January 2021 (UTC)

@Jura1: I have tried to do it. Please check it. Some problems I found, and some questions:

a) Some lists are not updating. I get an error.

b) "Male given names not ending with ς": I use the Greek ς. Is that correct? Same with "Female given names not ending with "α"". Moreover, for Greek names it is more useful to check the last two letters of the name. For example, almost all Greek male names end with ος, ας or ης.

c) "Item should have one ονομασία στη μητρική γλώσσα (P1705) statement only": Should names in Ancient Greek and in Modern Greek have separate items? It's the same name. Sometimes the only difference is the diacritic (Q162940), because Ancient Greek uses more than one (see Aristotelis (Q27648524)). That also causes problems for the "Greek label of item should match native label statement" check.

d) "Greek label of item should match English label": Should I delete that section?

e) "Native label regex check": How does it work? Should we use Greek alphabet letters?

f) "Relevant categories": I don't get any results. Have I done something wrong?

g) "Pages in categories, without P407": Same.

h) "Pages in categories, without P31 for given name": Same.

i) "Coverage": Why should the countries be given in the query? Why not let the query show the country of citizenship for persons with a Greek name?

k) "Missing P735 (frequent)": Does that query find the persons that have P27 -> Q41 but no P735? If yes, can you also add Cyprus (Q229)?

l) "Missing P735 (rare)": Why is that different from "Missing P735 (frequent)"?

m) "Values possibly missing": Can you also add Cyprus (Q229)?

n) "Language of given names used": I am confused. But please also add Cyprus (Q229). Does it mean that, for persons with country of citizenship (P27) -> Greece (Q41) who have a given name (P735) statement, the value of that statement does not have the Greek alphabet as its writing system?

Data Gamer play 18:26, 1 January 2021 (UTC)
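The two-letter ending check suggested in (b) could look roughly like this. It is a sketch under the assumption that accents should be ignored when comparing endings; the function names are hypothetical and this is not the query used on the report pages.

```python
import unicodedata

# Typical endings of Greek male given names, as listed in (b); note the final sigma "ς"
MALE_ENDINGS = ("ος", "ας", "ης")


def strip_accents(text: str) -> str:
    """Remove combining marks (tonos etc.) so that "ής" compares equal to "ης"."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return unicodedata.normalize("NFC", stripped)


def has_typical_male_ending(name: str) -> bool:
    """True if the name ends with one of the typical male endings."""
    return strip_accents(name).lower().endswith(MALE_ENDINGS)


for name in ["Νίκος", "Νικόλας", "Δημήτρης", "Παναγής", "Μαρία"]:
    print(name, has_typical_male_ending(name))
```

Names that fail the check are only candidates for review, since some male names do not follow the pattern.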


Oh, that was quick. Seems you went through it quite in detail.

  • (a) I moved some to subpages: this way at least these update. Some of the checks give zero results (which is generally a good thing, but breaks the page). Either these need to go on individual subpages or the queries changed to always return something.
  • (b) Not all checks might make sense for Greek. Even for Italian, initially it found a couple of errors and now it also finds things that are just debatable.
  • (c) I'm not sure what to suggest. I'd probably tend to keep them separate. Roman praenomen and modern given names aren't on the same items.
  • (d) yes, done when checking the queries
  • (e) the idea is that the letters would be limited and some errors could be found
  • (f) This needs categories like Category:Italian feminine given names (Q8558374) with P971 statements. I only found Category:Greek masculine given names (Q59610594) that links to Wiktionary. For Italian, I had to merge a couple of items before it worked.
  • (g)/(h) needs (f) to work
  • (i) "Completeness (Wikidata)" checks if all people of that nationality have a P735. The idea is that we might miss Italian given names for these.
  • (k) done. The script needed updating too (done).
  • (l) it's mostly bottom up while the other is top-down
  • (m) done
  • (n) Some Italians had odd items as values, e.g. "Georg" instead of "Giorgio", due to some imports in earlier times. This query found them and allowed some replacements to be made. What's left should be names that people actually have, but that are not considered "Italian" by whatever reference Italian Wikipedia is using to determine this (no corrections are actually needed for this). I don't think I gave the writing system much thought when doing the queries, as everything should be in Latin script. One could obviously check both. --- Jura 19:23, 1 January 2021 (UTC)
    • (n) fixed the query and added the native label statement to the output. For Italian, already the Latin script labels were different, but for Greek a slightly different query is needed.
      BTW, I also added the two recent queries that check if items have labels in English and Italian/Greek. --- Jura 19:37, 1 January 2021 (UTC)
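For the regex check in (e), a Greek version could restrict native labels to Greek letters. The sketch below is only an assumption about what such a pattern might look like, not the pattern used on the report page. It covers the "Greek and Coptic" and "Greek Extended" Unicode blocks, allows single spaces or hyphens between parts, and expects precomposed (NFC) text.

```python
import re

# One or more Greek-block characters, optionally repeated after a single space or hyphen.
# \u0370-\u03FF: Greek and Coptic; \u1F00-\u1FFF: Greek Extended (polytonic forms).
GREEK_WORD = r"[\u0370-\u03FF\u1F00-\u1FFF]+"
GREEK_LABEL = re.compile(rf"{GREEK_WORD}(?:[ -]{GREEK_WORD})*")


def looks_like_greek_label(label: str) -> bool:
    """True if the whole label consists of Greek-block characters."""
    return GREEK_LABEL.fullmatch(label) is not None


for label in ["Νικόλας", "Nikolas", "Ἀριστοτέλης"]:
    print(label, looks_like_greek_label(label))
```

The ranges are deliberately broad and also include a few symbols from those blocks, so a match does not prove a label is correct. A failed match is the useful signal, for example when a Latin letter has slipped into a Greek label.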

Thanks! Data Gamer play 22:18, 1 January 2021 (UTC)

Clarification re: indigenous North American names

I recently added a large batch of indigenous North American/First Nations artists and am looking for guidance re: native names as such. It's quite common for one person to have quite a lot of names — a native-language name used in childhood (often with the actual language/dialect of the name unclear), then another native-language name received as an adult, plus an additional English-language name (related or unrelated), with a large array of spellings of each.

For instance, Apache painter Allan Houser (Q175745) has a native name "Ha-Oz-Ous" with several spellings, then the English translation of that name that some publications refer to him by "Pulling Roots," then an English name Allan Cafran Houser (where Houser is an anglicization of his native name).

Or with Lone Wolf (Q486933), aka "Gui-pah-gho," a well-meaning editor has put Wolf (Q16093204) for family name (P734), but "Lone Wolf" is his singular name and Wolf is not a surname.

So my main questions would be

  • How do you indicate a one-of-a-kind name like Rain-in-the-Face (Q138712) ("Imomagaja")? It feels different than a given name but perhaps someone could clarify for me. In particular, it gets tricky with native languages/spellings/transliterations/translations, like White Bear (Q104564780) (Hopi, "Kucha Honowah"), White Bear (Q104564793) (Hopi, "Kutca Honauu"), White Bear (Q104564768) (Crow), and White Bear (Q104564763) (Arapaho). They are sort of the same name, but different in etymology and transliteration.
  • How do you use name in native language (P1559) when it is unclear what the native language is or is not a supported Wikimedia language (and when the listed name might be a random phonetic spelling of the actual native name, anyway)?
  • When someone has no surname, how do you make clear that Lone Wolf (Q486933)'s sort name is "Lone Wolf," not "Wolf, Lone"?
  • When someone has many valid names used interchangeably, how is this treated? It starts to get strange marking a particular name "preferred" — preferred by whom?

I did peruse the Wikiproject reference lists, but still am unclear on many of these particulars. If I missed a reference guide somewhere, let me know. Sweet kate (talk) 18:39, 1 January 2021 (UTC)

given name and unisex given name items (Latin script)

For some names, we have items that are not specifically for male or female names. This can happen because:

  • (A) a given spelling is used in one language for one gender, in another language for another gender
  • (B) a given spelling is used for one gender today, but for another gender in earlier times
  • (C) a given spelling is used for both genders today, in a given culture/country

Also:

  • (D) we don't know whether it's male, female or unisex.

Some of the above currently use "unisex given name" as description and unisex given name (Q3409032) in instance of (P31). However, we don't have reference for most of these P31 statements, nor is the description "unisex given name" ideal for (A) and (B).

To simplify this, I suggest we add given name with preferred rank to all items with unisex given name and adapt the descriptions to P31=given name. This has the advantage that references could easily be added to P31=unisex given name (if they exist) and descriptions are always correct: for (A), (B), (C) and (D). Further, items could also include P31=male given name (Q12308941) or P31=female given name (Q11879590) in normal rank with references, if there are no separate items. --- Jura 09:16, 9 January 2021 (UTC)
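To illustrate what the rank proposal would mean for data consumers, here is a toy model of best-rank selection: preferred statements win if any exist, otherwise normal ones are used, and deprecated ones are ignored. The tuple representation and the function name are simplifications made up for this sketch.

```python
def best_rank_values(statements):
    """Return the values a best-rank ("truthy") lookup would see.

    statements: list of (value, rank) pairs, where rank is
    "preferred", "normal" or "deprecated".
    """
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [value for value, rank in statements if rank == "normal"]


# P31 as proposed: given name (Q202444) preferred, unisex given name (Q3409032) normal
proposed = [("Q202444", "preferred"), ("Q3409032", "normal")]
# P31 on an item where nothing has preferred rank
unranked = [("Q3409032", "normal"), ("Q12308941", "normal")]

print(best_rank_values(proposed))
print(best_rank_values(unranked))
```

Under the proposal, simple lookups would see only "given name", while the gender-specific P31 values stay available to anyone who reads all statements regardless of rank.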

Notified participants of WikiProject Names --- Jura 09:16, 9 January 2021 (UTC)

Perhaps it will be easier to maintain if we always use P31 given name and female, male, unisex, whatever as a qualifier. My €0.02 :D Klaas `Z4␟` V09:28, 9 January 2021 (UTC)
I don't understand the sense of unisex given name (Q3409032) when we know whether a human is male or female. E.g. the current version of Milan (Q1076681): it links to a Czech Wikipedia page about a "male given name", but the item has a mix of instance of (P31) values: given name (Q202444) (why?), unisex given name (Q3409032) and male given name (Q12308941). So I am for separate items: Milan (male given name), Milan (female given name) and Milan (unisex given name, if we don't know). This way we can validate the structure (unisex on items where there isn't sex or gender (P21), etc.) Skim (talk) 11:34, 9 January 2021 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

  • I don't really follow. I guess using P366 would make writing queries easier, because you don't need to deal with ranks. This only applies when you want to try and get the gender-usage information, since there would be P31 Q202444 in either case. Ghouston (talk) 23:33, 7 February 2021 (UTC)
    • The question is what to do when there are several items for the same name (see your comment at 21:02, 10 January 2021) that would all have the same P31/P366 statements. This is a problem we won't have if we stick to items with "given name", "male given name" or "female given name" only. Also, it would avoid having to re-do thousands of items. --- Jura 08:16, 9 February 2021 (UTC)
      • When they were used for different genders, they'd have different P366 statements. However, there may also be cases of names which are pronounced differently but spelled the same, and have the same gender usage. Ghouston (talk) 10:10, 9 February 2021 (UTC)
      • "Marie" is pronounced a bit different in French than English. Does that mean there are two different names "Marie"? Of course, this also happens a lot even with regional accents. But maybe you'd pronounce the name of your French friend differently to the name of your English friend, so you'd be distinguishing two names? Isn't spelling just a crude representation of speech, which is "true" language? Ghouston (talk) 10:31, 9 February 2021 (UTC)
      • In practice, we are usually working from text and don't necessarily know how people pronounce their names. We have no way to decide whether different pronunciations constitute different names, or just different ways of pronouncing the same name. We also have separate items for names that are spelled differently but sound identical. Perhaps we just admit that we are using a text-based interpretation of names, and we just have one item for one spelling, in all cases, including Jean or Joan. Ghouston (talk) 21:37, 9 February 2021 (UTC)
      • But then, we may also have cases where two names are spelled the same in one language, but transliterated differently into another, such as the Russian label at Wikidata_talk:WikiProject_Names#Different_items_for_Spanish_/_Portuguese. Ghouston (talk) 21:45, 9 February 2021 (UTC)
      • and likewise, Jean (Q4160311) and Jean (Q7521081) have different Russian transliterations and sitelinks, so presumably must have different items. Ghouston (talk) 21:50, 9 February 2021 (UTC)
      • The difference between the two names in these cases is pronunciation, not specifically that they are "male" or "female" names (which could potentially vary by location or time). Ghouston (talk) 21:52, 9 February 2021 (UTC)
        • Sorry for the delay. I think there are cases where we have information that allows us to determine more (e.g. we know the gender/language) than in others, for whatever reason (pronunciation, as you seemed to have preferred initially, or some other element). So what is your conclusion in relation to the proposal above? --- Jura 10:05, 22 February 2021 (UTC)
          • I think using ranks would be confusing to a lot of users, who probably wouldn't read discussions like this. There's something a bit simpler, which would be to use unisex given name (Q3409032) only for names that are gender neutral at a given place and time. For the names that differ in gender usage by time or place (but where there's no question of different pronunciations and separate items) then use both male given name (Q12308941) and female given name (Q11879590), maybe with qualifiers. Ghouston (talk) 00:05, 27 February 2021 (UTC)
            • Eventually people will figure out ranks. We do have a qualifier to describe why preferred rank is set. It could easily explain the why. "gender neutral at a given place and time" isn't that most of them? I think the confusing part is when that gets generalized. --- Jura 07:33, 2 March 2021 (UTC)
  • I would just merge these Mary and George items, since there's apparently only one name in each case. Ghouston (talk) 01:41, 4 March 2021 (UTC)
    • The handling of "Maria" (and few related names) is indeed suboptimal and eventually needs to be sorted out, but I think that can be done independently of the main question here. --- Jura 10:51, 16 March 2021 (UTC)