Property talk:P227


Documentation

GND ID
identifier from an international authority file of names, subjects, and organizations (please don't use type n = name, disambiguation)
Description: Authority file for names of persons, subject headings and corporate bodies. For editions (single books), see: DNB editions (P1292).
Represents: Integrated Authority File (Q36578)
Associated item: German National Library (Q27302)
Has quality: VIAF component (Q26921380)
Data type: External identifier
Template parameter: en:Template:Authority control: "GND" - Template:Authority control (Q3907614)
Domain: any item
Allowed values: (1|1[01])\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X]
Example: universe (Q1) → 4079154-3
Example: Jehan Sadat (Q212190) → 118604740
Source: https://portal.dnb.de
Formatter URL: http://d-nb.info/gnd/$1
Tracking (usage): Category:Pages using Wikidata property P227 (Q8709075)
See also: DNB editions (P1292)
Lists
Proposal discussion: Property proposal/Archive/3#P227
Current uses: 446,608


Format: value must be formatted using this pattern (PCRE syntax)
|((1|1[01])\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])
List of this constraint violations: Database reports/Constraint violations/P227#Format, hourly updated report, SPARQL
Conflicts with “instance of (P31)”: Wikimedia disambiguation page (Q4167410), Wikimedia category (Q4167836), Wikimedia list article (Q13406463): this property must not be used with listed properties and values.
Exceptions are possible as rare values may exist. Known exceptions: Kingdom of Granada (Q1495642)
List of this constraint violations: Database reports/Constraint violations/P227#Conflicts with
Single value: this property generally contains a single value.
Exceptions are possible as rare values may exist. Known exceptions: European Union (Q458), Cupid (Q5011), Nanai people (Q504574), The Specials (Q19057), quantum physics (Q1144457), Academy of Arts of the GDR (Q15646111), The Threepenny Opera (Q212495), Leiden University (Q156598), Jerusalem (Q1218), aid (Q2827815), Marx-Engels-Werke (Q1906153), Marx-Engels-Gesamtausgabe (Q1669801), Pietro Coccoluto Ferrigni (Q31222), Hubert Beuve-Méry (Q84021), Angelus Silesius (Q60469), Jean Rounault (Q105628), Edith Pargeter (Q237235), Evelyn Beatrice Hall (Q263067), Dietrich Wilde (Q1224100), Hans-Holger Friedrich (Q1251322), Erich Czech-Jochberg (Q1352056), Horatius Haeberle (Q1627996), Jürgen Franke (Q1717160), Kurt Frieberger (Q1793540), Mic Donet (Q1926656), Patrizius (Q2058211), Edwy Searles Brooks (Q2522381), Irène Hamoir (Q3154648), Maxime Vuillaume (Q3302715), Auro D'Alba (Q3629853), Leopoldo Pullè (Q3830714), Trixini (Q14778863), Arvo Blechstein (Q16054255), Nikolaos Episkopopulos (Q16941268)
List of this constraint violations: Database reports/Constraint violations/P227#Single value, SPARQL
Distinct values: this property likely contains a value that is different from all other items.
Exceptions are possible as rare values may exist. Known exceptions: microscope (Q196538), optical microscope (Q912313), Old Testament (Q19786), Tanakh (Q83367), Technische Hochschule (Q346549), institute of technology (Q1371037), literature (Q8242), fiction (Q268416), Jesus Christ (Q302), Historical Jesus (Q51666), Brandenburg (Q1208), Province of Brandenburg (Q700264), Lusatian Mountains (Q695368), Zittau Mountains (Q206587), legal process (Q1301203), legal proceedings (Q8222382), Pitcairn Islands (Q35672), Pitcairn Island (Q1779748), no label (Q15407350), Revue québécoise de linguistique (Q15407351), Australia (Q408), Australia (Q3960), poetry slam (Q20852), slam poetry (Q2293670), London (Q84), Greater London (Q23306), Freeganism (Q520867), dumpster diving (Q1110145), Saint Irmgardis (Q444949), Irmgard von Köln (Q14540331), Hohensalza District (Q1787342), no label (Q1803227), Palau (Q695), Palau Islands (Q1588974), Kansai region (Q164256), Kansai (Q1111292), judiciary (Q105985), no label (Q448798), Mettmann (district) (Q6257), no label (Q1662807), no label (Q1502013), no label (Q2515177), Academy of Arts, Berlin (Q414110), Prussian Academy of Arts (Q514802), Kommunistische Volkszeitung (Q1780615), no label (Q1780476), Schwarzburg-Sondershausen (Q630163), no label (Q1454729), pastor (Q152002), parson (Q955464), classical studies (Q439072), classics (Q841090), Diepholz (Q5956), no label (Q1360467), Diepholz (Q1787256), Schwarzburg-Rudolstadt (Q695316), no label (Q1454727), no label (Q683834), bioorganic chemistry (Q864640), linguistic anthropology (Q772835), ethnolinguistics (Q853085), Styria (Q41358), Duchy of Styria (Q580447), pedology (Q215501), soil science (Q9161265), monarch (Q116), ruler (Q1097498), intellectual (Q58968), intelligentsia (Q381142), country (Q6256), state (Q7275), resistor (Q5321), electrical resistance (Q25358), impedance (Q179043), admittance (Q214518), Samter District (Q1232145), no label (Q1803430), no label (Q315027), no label 
(Q1571264), Sambia Peninsula (Q329676), no label (Q7380391), no label (Q1803272), no label (Q1787368), Mogilno District (Q831767), no label (Q1803339), Białystok Voivodeship (Q1305171), Białystok Voivodeship (Q5261695), Białystok Voivodeship (Q14756366), Saint Helena (Q34497), Saint Helena, Ascension and Tristan da Cunha (Q192184), Frankfurt School (Q151843), critical theory (Q301751), Saxe-Gotha-Altenburg (Q675085), no label (Q1454726), criticism (Q17955), critique (Q3059502), Free State of Saxe-Weimar-Eisenach (Q44352), Saxe-Weimar-Eisenach (Q155570), Kaliningrad (Q1829), Königsberg (Q4120832), Sicily (Q1460), Sicily (Q4951156), no label (Q15608499), no label (Q2456068), Austria (Q40), Cisleithania (Q533534), Nordstrand (Q21010), Nordstrand (Q15058181), Carpathian Ruthenia (Q1148511), Zakarpattia Oblast (Q170213), Tokyo (Q7473516), Tokyo (Q1490), Maggia (Q67996), Maggia (Q3905364), Taiwan (Q865), Taiwan Island (Q22502), Rhenish Hesse (Q707297), no label (Q17353989), Verden (Q5927), no label (Q13188569), Electorate of Baden (Q637238), Republic of Baden (Q690821), no label (Q15785077), no label (Q1296235), Schmalkalden (Q1787499), Ferrara (Q13362), Duchy of Ferrara (Q693570), SBZ-Archiv (Q19831595), Deutschland Archiv (Q1206262), County of Oldenburg (Q170390), Duchy of Oldenburg (Q697084), Rotenburg (Q5923), no label (Q1803420) Warendorf (Q2839), Warendorf District (Q16332967), Mont Saint-Michel (Q20883), Le Mont-Saint-Michel (Q20892), no label (Q1396026), no label (Q1787565), Valka (Q323774), Walk (Q991548), University of Bordeaux (1441-1970) (Q20791505), University of Bordeaux (Q13344), University of Strasbourg (Q157575), University of Strasbourg (1538-1970) (Q20808141), Neue Rheinische Zeitung (Q429850), no label (Q19311569), book series (Q277759), monographic series (Q1700470), theatre company (Q742421), musical ensemble (Q2088357), Ramadan (Q41662), Ramadan (Q8867089), art of sculpture (Q11634), no label (Q350268), no label (Q1623951), High Representative of 
the Union for Foreign Affairs and Security Policy (Q634291), Senate of the Republic of Italy (Q633872), Senate of the Kingdom of Italy (Q3510898), Rhinella marina (Q321087), Bufo marinus (Q13165156)
List of this constraint violations: Database reports/Constraint violations/P227#Unique value, SPARQL (every item), SPARQL (by value)
Qualifiers “name (P2561), official name (P1448), birth name (P1477), named as (P1810), pseudonym (P742), field of work (P101), occupation (P106), start time (P580), end time (P582), genre (P136), title (P1476), inception (P571), facet of (P1269), applies to part (P518), has quality (P1552)”: this property should be used only with listed qualifiers.
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P227#Qualifiers

Usage note

The Integrated Authority File (GND) is managed by the German National Library in cooperation with various library networks in German-speaking Europe and other partners. Please look up GND at Online-GND or DNB-Portal. (VIAF is helpful but often incorrect or outdated, and it mixes two identifier systems, which in some cases produces dead links.)

GND identifier (Template:Entities)

  1. GND 1019646128: Stan Lauryssens (b. 1946), type Tp (person) = Yes
  2. GND 122968751: Stan Lauryssens (no info), type Tn (name, a placeholder) = No

VIAF ID (P214)

  1. VIAF 120062731 Stan Lauryssens = Yes
  2. VIAF 293348885 Stan Lauryssens (undifferentiated) = No

Known VIAF problems

  1. Johannes Fabian (Q15641418): in January 2014, VIAF 91414487 merged three GNDs, only one of which was correct:
    1. GND 107342049: Fabian, Johannes R., undifferentiated
    2. GND 122878310: Fabian, Johannes (* 1937), Amerikan. Anthropologe
    3. GND 172084180: Fabian, Johannes R., Dipl.-Ing.
  2. VIAF changes numbers with dashes: Brazilian Institute of Geography and Statistics (Q268072).
    1. GND 1026669-0 (correct)
    2. VIAF-GND 004164695 ("404 Not Found")
      • The strange thing is that when you go to http://d-nb.info/gnd/1026669-0, GND says "idn=004164695".
      • Not too strange, just annoying: 004164695 is the DNB identifier, cf. http://d-nb.info/004164695 (without "gnd" in the URL). VIAF sometimes confuses those two identifier systems. (For the first years this was never an issue, nor could any issue be detected, since the two identifiers always coincide for authority records for persons.) For example, the VIAF display correctly links to http://d-nb.info/gnd/1026669-0, but the dumps contain the wrong number, and access via http://viaf.org/viaf/sourceID/DNB%7C1026669-0 does not work either. I notified VIAF and DNB back in October 2012 about the issue, but it does not seem to have high priority over there. -- Gymel (talk) 13:42, 28 January 2015 (UTC)
  3. Same name + same year of birth = different person
    1. In some cases VIAF merges two persons because only one of them has a GND
    2. Please use: GND = "no value" (for the person without a GND) [1]
    3. For items often confused use: different from (P1889)
  4. For unknown reasons VIAF is not importing all GND ids
    1. Samuel Ramos (Q7412445) = GND 1022446479 (created 16-05-12)
    2. June 2015: 3 years later, the GND ID had still not been harvested by VIAF
      • DNB and VIAF have been made aware of the problem. There might have been a harvesting glitch during the first weeks of the GND going live in April 2012: GND records which have never been touched since then are still unknown to VIAF (as an estimate, about 15,000 GND records for persons created in early summer 2012 are not represented in VIAF). [7 Jun. 2015 Gymel]
      • Update: GND 1022446479 added to VIAF:59099151 on 2015-07-12.
  5. In some cases VIAF clusters get deleted instead of merged
    1. Åke Blomström (Q270863): VIAF 228866914; taken care of by KrBot
  6. In rare cases VIAF clusters are reused for a different person
    1. William of Ockham (Q43936): VIAF:41835567 in 2015
    2. Lorenzo Traversagni (Q18674108): VIAF:41835567 in 2016
    3. William of Ockham (Q43936): VIAF:262145669298005170004 (created 2016-02-28)

Duplicate of P107

This property seems to be a duplicate of P107 (GND entity type), or why are there now two different properties? --#Reaper (talk) 13:04, 16 March 2013 (UTC)

Property:P107 lists the main types of item. Is the item a person, organization, event, work, term, place, or disambiguation page? (See Wikidata:Infoboxes task force for use.) It's a kind of basic classification. So Marilyn Monroe (GND 118583549) and Vladimir Putin (GND 122188926) are both "type person", but they have their own GND numbers as identifier. --Kolja21 (talk) 02:08, 17 March 2013 (UTC)
Ah, I hadn't seen that this property is of type string; the description reads as if one should enter "name", "work" and so on, not the ID of the GND object. Thx. --#Reaper (talk) 12:34, 17 March 2013 (UTC)

STICKY: Explanation of format constraints

At its launch in April 2012 the GND established all existing identification numbers of its constituent files (PND, GKD, SWD, DMA-EST) as GND identification numbers. Records created since then follow the pattern for the former PND. Caveat: The checksum algorithms differ between the dashed and undashed types. Another caveat: The dash is essential; both 160220440 and 16022044-0 are valid GND numbers, denoting distinct entities.

  1. (1|1[01])\d{7}[0-9X]: 9 digits starting with "1", or 10 digits starting with "10" or "11", where the last "digit" may be "X": Former PND numbers, and all numbers for genuinely "GND-born" records (those created after 2012-04, always 10-digit form)
  2. [47]\d{6}-\d: 7 digits starting with "4" or "7", followed by dash and a strictly numerical check digit: Former SWD numbers. Scheme discontinued after 2012-04.
  3. [1-9]\d{0,7}-[0-9X]: one to eight digits not starting with "0", followed by dash and a check "digit" which may be "X": Former GKD numbers. Scheme discontinued after 2012-04.
  4. 3\d{7}[0-9X]: 9 digits starting with "3", last "digit" may be "X": Former DMA-EST numbers. Scheme discontinued after 2012-04.

For the time being there is a certain overlap between formulations 2 and 3, admitting false negatives. And of course the pattern check does not perform a checksum test. -- Gymel (talk) 11:43, 11 May 2013 (UTC)
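
The four patterns listed above can be tried out with a short script. The following is a minimal sketch in Python, assuming that Python's `re` module treats these simple patterns the same way as the PCRE syntax named in the constraint. The function name `classify_gnd` and the labels are made up for illustration, and no checksum test is performed.

```python
import re

# Patterns copied from the numbered list above; the labels are informal
# names chosen for this sketch, not official terminology.
GND_PATTERNS = [
    ("PND or GND-born", re.compile(r"(1|1[01])\d{7}[0-9X]")),
    ("SWD", re.compile(r"[47]\d{6}-\d")),
    ("GKD", re.compile(r"[1-9]\d{0,7}-[0-9X]")),
    ("DMA-EST", re.compile(r"3\d{7}[0-9X]")),
]


def classify_gnd(value):
    """Return the labels of all patterns that match the whole value."""
    return [label for label, pattern in GND_PATTERNS if pattern.fullmatch(value)]


if __name__ == "__main__":
    # Values quoted elsewhere on this page.
    for candidate in ["118604740", "1019646128", "4079154-3", "16022044-0", "160220440"]:
        print(candidate, classify_gnd(candidate))
```

With the values quoted on this page, `classify_gnd("16022044-0")` returns only the GKD label and `classify_gnd("160220440")` only the PND label, while `classify_gnd("4079154-3")` returns both the SWD and the GKD label, which illustrates the overlap between formulations 2 and 3.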

STICKY: Uniqueness constraint: List of persons with known conflicts between GND and Wikipediae

Below is a list of persons from the GND database where, unlike in the Wikipediae, (a) the GND does not distinguish between two (possibly) different persons and thus has only one database entry, or (b) the GND identifies a pseudonym or fictional author (persona) with its creator, although the persona is more than a simple pen name. These rare exceptions can lead to violations of the uniqueness constraint. Please see de:Benutzer:Gymel/Hartnäckige_PND-Dubletten for further details (in German). -- Make (talk) last update: 12:40, 27 November 2013 (UTC)

1st group [identities historically identified with one person]
  1. http://d-nb.info/gnd/118557513 Jesus Christus
    = Historical Jesus (Q51666) (historical Jesus of Nazareth) + Jesus Christ (Q302) (central person of Christianity from/in New Testament)
  2. http://d-nb.info/gnd/118557815 Johannes
    = John the Apostle (Q44015) + John the Evangelist (Q328804)
2nd group [identity is subject of ongoing debate]
  3. NOTYETDISCOVERED but WONTFIX http://d-nb.info/gnd/118720260 Hans von Tübingen
  4. NOTYETDISCOVERED but WONTFIX http://d-nb.info/gnd/118815245 Hans Hirtz
  5. WONTFIX http://d-nb.info/gnd/11936929X Meister von Meßkirch : Q568760 Q1532784
  6. WONTFIX http://d-nb.info/gnd/119457733 Arnold : Q535832 Q694744
  7. http://d-nb.info/gnd/118746871 Irmgard
    = legendary Irmgardis von Süchteln for whom worship as patron saint of town Süchteln (Q314425) is first documented at the end of 15th century (1486, 1498) +(?) Irmgard von Köln + historical persons Irmtrudis, Irmgardis, ... who are documented as donors to the church in 11th century
    = (badly disambiguated) Saint Irmgardis (Q444949) + Irmgard von Köln (Q14540331)
    see www.rheinische-geschichte.lvr.de/persoenlichkeiten/I/Seiten/IrmgardisvonSüchteln.aspx (in German)
3rd group [pseudonyms]
  8. http://d-nb.info/gnd/118677799 Lemony Snicket
    = Daniel Handler (Q1060636) (novelist, born 1970) creator of → Lemony Snicket (Q458346) (fictional person providing his pen name)
  9. http://d-nb.info/gnd/126472009 Bonifatius Kiesewetter
    = Waldemar Dyhrenfurth (Q1307672) (German jurist and author, 1849-1899) creator of → Bonifazius Kiesewetter (Q892566) (fictional person providing his pen name, later use by other authors)
  10. http://d-nb.info/gnd/115646108 Jason Dark
    = Helmut Rellergerd (Q1604049) (German writer, born 1945) creator of → Jason Dark (Q104029) (pen name used by different authors of publisher "Bastei", Rellergerd later was granted exclusive use of the pseudonym)
  11. http://d-nb.info/gnd/123068908
    = Kurt Ostbahn (Q584872) + Willi Resetarits (Q43776)
group ? [unclear what is going on, work in progress]
  12. http://d-nb.info/gnd/118691910 Lucius Annaeus Florus ←→ http://d-nb.info/gnd/100136907 Florus ←→ http://d-nb.info/gnd/119410672 Florus

Leading or trailing space characters in values (resolved)

Text separated here into a standalone section for archiving purposes -- Make (talk) 22:58, 26 May 2013 (UTC)

Wikidata:Database reports/Constraint violations/P227: Some of the numbers are correct. A helpful rule would be: "Only numbers starting with 1-9 (not 0) and dashes are allowed." Can someone translate this into format pattern? --Kolja21 (talk) 16:54, 18 May 2013 (UTC)
Examples from the list:
Both GNDs are correct. --Kolja21 (talk) 16:58, 18 May 2013 (UTC)
I think I found out what went wrong: the string values contain either leading or trailing space characters. Unfortunately this is not visible on the item page or in the constraint violation report. Only if you look at the wikitext source of the report can you see the mistakes as %20 in URLs. To correct this on an item page, I had to use a 2-step somewhat hacker-like approach (since the erroneous space characters are not visible): click edit (value), add a space character at the start and at the end of the value string, click save, click edit, remove the extra characters just added, click save. But we have to wait for the next report to be sure this really works ... -- 22:41, 18 May 2013 (UTC) User:Make -- minor edits for clarity 08:07, 21 May 2013 (UTC)
And, has it worked? Three examples from the current list (19:35, 20 May 2013):
All three GNDs are correct, but have a trailing space in the list. --Kolja21 (talk) 22:44, 20 May 2013 (UTC)
Looks like it really worked. On May 19th, I removed space characters (with the hacker-technique described above) from value strings for Q2066, Q124696, Q71154, Q70938, Q30917, Q11143, and Q11021. None of these items show up as "format violations" in the current report anymore. -- Make (talk) 08:07, 21 May 2013 (UTC)
I just removed the trailing space from Q1135083; see the revision history for the change in bytes. You might want to try fixing some string values yourself to confirm that although the presentation on the item page is identical before and after, from the revision history you can see that indeed a character was removed. – I am not sure what to think of this. At least it is unfortunate that the presentation on the item page omits some content (namely leading/trailing space characters). Maybe there is a software bug with string input/printing behind this. -- Make (talk) 08:22, 21 May 2013 (UTC)
I left a note at Wikidata:Contact the development team#Trailing space. --Kolja21 (talk) 13:07, 21 May 2013 (UTC)
Yes, sorry, that's my fault. The way we added trimming is a bit hacky, which leads exactly to the issue you describe here. It will be improved, but this is probably a month or two down the line. The good news is: this kind of error should be impossible to introduce anew. So there is only some legacy error. I am not sure, but it could even be that pages that get edited at all lose this kind of legacy error, because the whole content gets changed. In any case, in order to help here, I made an analysis and tried to compile a list of all places where this problem occurs. It seems to be in 148 values, listed in the following (item, property, value). I hope this helps, and again, sorry for my mistake! I anticipated it, but checked only for linebreaks, and fixed those manually before the patch, but not for simple whitespaces. --Denny (talk) 15:28, 21 May 2013 (UTC)
Thanks@all for helping to resolve this. I just finished fixing all %20 in claims for P227. Hope I didn't miss any. We'll see if all is good when the bot-update scheduled for the early hours of May 24th brings its findings. --- Make (talk) 20:36, 22 May 2013 (UTC)
✓ Done Finally all values with leading/trailing spaces are fixed. -- Make (talk) 22:58, 26 May 2013 (UTC)
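
The kind of check used in this thread (spotting values whose only defect is an invisible leading or trailing space) can be sketched in a few lines. This is an illustration only: the helper name `find_padded_values` and the sample item labels are invented, and the percent-encoding step merely mirrors how the stray spaces showed up as %20 in URLs.

```python
from urllib.parse import quote


def find_padded_values(claims):
    """Return (item, value, encoded) for values with leading/trailing whitespace.

    `claims` is an iterable of (item, value) pairs; the encoded form makes
    the otherwise invisible space visible as %20.
    """
    hits = []
    for item, value in claims:
        if value != value.strip():
            hits.append((item, value, quote(value)))
    return hits


if __name__ == "__main__":
    # Hypothetical sample data: the item labels are placeholders.
    sample = [("item-a", "4079154-3"), ("item-b", "118604740 ")]
    for item, value, encoded in find_padded_values(sample):
        print(item, repr(value), encoded)
```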

Duplicates

Please use de:WP:GND/F to report duplicates. See de:Hilfe:GND#Personen for the difference between individualized and non-individualized (VIAF: "undifferentiated" = don't use) GNDs. --Kolja21 (talk) 14:15, 11 October 2013 (UTC)

Database reports/Constraint violations : GND identifier present but VIAF identifier missing

Hi! This might be a new type of property constraint violations.
Is it possible to list all pages where GND ID (P227) is present but VIAF ID (P214) is missing? Regards לערי ריינהארט (talk) 06:45, 20 October 2013 (UTC)

Thanks for the answers! In order to have fewer results one should limit the query to Wikidata pages that are linked to a specific language:

  1. having an article in yi.Wikipedia
  2. having an article in eo.Wikipedia
  3. having an article in ro.Wikipedia

Thanks for any answer! לערי ריינהארט (talk) 09:03, 21 October 2013 (UTC)

Magnus Manske's tool would not work properly if no English label is present.
How can you query all Wikidata pages having LCAuth ID (P244) without English label?
לערי ריינהארט (talk) 09:13, 21 October 2013 (UTC)
Looks like not all items with GND ID have VIAF ID. For example I failed to find a VIAF ID for Bieszczady Mountains (Q125529), Rheinbach (Q12547), Age of Enlightenment (Q12539). — Ivan A. Krestinin (talk) 18:57, 26 October 2013 (UTC)
VIAF has problems with hyphens and changed Bieszczady Mountains (Q125529) GND 4006552-2 into VIAF-GND 040065529. But there are also numbers missing. RERO is incomplete and GND numbers of May 2012, when PND changed to GND, were apparently not reported. Example: Samuel Ramos (Q7412445) GND 1022446479 (16-05-12). VIAF 59099151 has today, one and a half years later, still an outdated "undifferentiated" Tn linked to the Mexican writer. --Kolja21 (talk) 21:51, 27 October 2013 (UTC)
The "problems" VIAF has are that it (partially) confuses GND Ids with DNB Ids. Thus GND 4006552-2 cannot be resolved by "sourceID" but if you access VIAF 245618932 it links to the correct GND record, but it displays the wrong ID. -- : Gymel (talk) 00:39, 28 October 2013 (UTC)

Thanks for the answer! לערי ריינהארט (talk)
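
The request at the top of this section (items with a GND ID but no VIAF ID, optionally limited to items with an article in one Wikipedia) could nowadays be phrased as a SPARQL query. The sketch below only builds the query text; it assumes the `wdt:` and `schema:` prefixes and the sitelink modelling used by the Wikidata Query Service, and the function name `missing_viaf_query` is hypothetical.

```python
def missing_viaf_query(wiki_host=None, limit=100):
    """Build a SPARQL query for items with P227 (GND ID) but without P214 (VIAF ID).

    `wiki_host` is an optional host such as "yi.wikipedia.org" to restrict
    the result to items that have an article on that wiki.
    """
    lines = [
        "SELECT ?item ?gnd WHERE {",
        "  ?item wdt:P227 ?gnd .",
        "  FILTER NOT EXISTS { ?item wdt:P214 ?viaf . }",
    ]
    if wiki_host:
        # Sitelinks are assumed to be modelled as schema:about / schema:isPartOf.
        lines.append("  ?article schema:about ?item ;")
        lines.append(f"           schema:isPartOf <https://{wiki_host}/> .")
    lines.append("}")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(missing_viaf_query("yi.wikipedia.org", limit=50))
```

Passing "eo.wikipedia.org" or "ro.wikipedia.org" instead gives the other two restrictions asked for above.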

(GND identifier present AND its value has length 9 AND its value starts with (1 OR 2)) AND VIAF identifier missing

How many are these? Let's restrict to:

GND value matches pattern=[12]\d{7}[0-9X] AND VIAF is (empty OR NIL) 

i.e. no MINUS is present in the GND value AND ... . Normally you should be able to identify the correlated VIAF id with a link such as [2]. Note: The example contains a trailing X.
Note: https://viaf.org/viaf/search?query=cql.any+all+%22000423580%22+and+local.sources+any+%22dnb%22&sortKeys=holdingscount can identify via the normalized GND 000423580 (the - is removed from 42358-0, then leading zeros are added to get a string of length nine). This does not work in general. לערי ריינהארט (talk) 04:23, 6 November 2013 (UTC)
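
The normalization described in the note above (drop the dash, then pad with leading zeros to nine characters) is easy to write down. The following is a sketch of that heuristic only, with a made-up function name `pad_dashed_gnd`. As said above, it does not work in general: for the two dashed GNDs discussed elsewhere on this page, the number VIAF uses is a different one.

```python
def pad_dashed_gnd(gnd_id):
    """Drop the dash from a dashed GND number and left-pad with zeros to 9 characters.

    Undashed values are returned unchanged.
    """
    if "-" not in gnd_id:
        return gnd_id
    return gnd_id.replace("-", "").zfill(9)


if __name__ == "__main__":
    # The example from the note above: the heuristic yields the searched value.
    print(pad_dashed_gnd("42358-0"))

    # Pairs quoted on this page (GND, number seen in VIAF) where it fails.
    for gnd, seen_in_viaf in [("1026669-0", "004164695"), ("4006552-2", "040065529")]:
        print(gnd, pad_dashed_gnd(gnd), seen_in_viaf, pad_dashed_gnd(gnd) == seen_in_viaf)
```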

add GND identifier format constraint violations/P227 : values of length 9 never ever start with a digit different than 1 or 2

Hi! GND identifier values never start with 0 or 9. If such values are present here, this is due to a bug of the AC tool. לערי ריינהארט (talk) 03:23, 1 November 2013 (UTC)

Update: GND identifier values of length 9 never ever start with a digit different than 1 or 2. לערי ריינהארט (talk) 03:31, 6 November 2013 (UTC)
The current format constraint gives this regular expression:
|((1|10)\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])
which is in fact redundant (the 3 unnecessary alternatives for the initial 1, [47], or 3 are already part of the alternative for the initial [1-9]) and fully equivalent to:
|(([1-9]|10)\d{7}[0-9X])
(Note: the initial "|" of both regexps indicates that the value may be empty, for meaning "no VIAF identifier currently exists for this topic", or "the VIAF identifier has been obsoleted/deprecated/removed" probably because the topic was ambiguous and did not identify really a single topic; the empty value for this property can be only set in Wikidata, provided you also add a qualifier such as "comment":"no value" and probably a "date" qualifier for this asserted comment; without the necessary qualifier(s) the property would be simply deleted from the item in Wikidata)
But what you are saying is that: if the identifier starts by 1 or 2, then it cannot have length 9 and would have length 10. This would give the following:
|(([3-9]|([12]\d))\d{7}[0-9X])
Is that correct ? Can you point us to a reliable source (the DNB reference page explaining it)? Verdy p (talk) 19:47, 4 March 2016 (UTC)
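
Whether two patterns accept the same strings can be checked mechanically. Below is a small sketch that compares the constraint pattern and the shortened pattern quoted above, assuming Python's `re` behaves like PCRE for these simple expressions. The helper name `disagreements` is hypothetical, and "218604740" is a synthetic test string, not a known GND number.

```python
import re

# Both patterns are copied verbatim from the comment above.
CURRENT = re.compile(r"|((1|10)\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])")
SHORTENED = re.compile(r"|(([1-9]|10)\d{7}[0-9X])")


def disagreements(values):
    """Return the values that exactly one of the two patterns accepts."""
    return [
        value
        for value in values
        if bool(CURRENT.fullmatch(value)) != bool(SHORTENED.fullmatch(value))
    ]


if __name__ == "__main__":
    # 4079154-3 and 118604740 are the examples from the property documentation.
    print(disagreements(["4079154-3", "118604740", "218604740", ""]))
```

On the documented example value 4079154-3 the two patterns disagree: the constraint pattern accepts it, while the shortened pattern has no place for a dash and rejects it.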
I also note that GND identifiers of persons (starting by 1, and with 10 digits/letters), do not have any minus-hyphen sign before the last check digit in [0-9X], in order to preserve the maximum length of the full id to at most 11 characters (only IDs starting by 4 or 7 also have variable lengths, and all IDs except those starting by 1 or 2 accept the minus-hyphen as they have a maximum of 9 digits/letters). The actual format would be then more accurately:
|([12]\d{8}|[35689]\d{0,7}-|[47]\d{6}-)[0-9X]
Adding the minus-hyphen in the "long" GND ID of a person (starting by 1) causes the ID to be misinterpreted (with the last check digit discarded due to excessive length) and can sometimes bring us to another unrelated authority record.
Besides that, it seems that the minus-hyphen is now optional in "short" IDs (those starting by [3-9]):
|([12]\d{8}|[35689]\d{0,7}-?|[47]\d{6}-?)[0-9X]
Leading zeroes (just after the required leading [3-9] "class digit") for "short" IDs must apparently be discarded. I don't know if this is true for those starting by the required [47] "class digit", but it is strange that they accept less digits than others. If so we would get more simply:
|([12]\d{8}|[3-9][1-9]\d{0,7}-?)[0-9X]
(where the first alternative is for "long" IDs where zeroes after the "class digit" are not discarded and where the minus-hyphen must NOT be specified, the second alternative is for all "short" IDs that may include the minus-hyphen before the "check digit" and may have leading zeroes after the "class digit").
Verdy p (talk) 20:54, 4 March 2016 (UTC)
Further checking in the database with SPARQL, I found that Wikidata internally stores some GND identifiers by terminating them by an additional right-to-left mark (U+200F), even if they are not visible in the editing interface.
Then they don't match the regular expression. I think this is a bug in the Wikidata editing interface (which can insert them automatically when submitting to the database), or in its local implementation of SPARQL...
So I looked for other querying interfaces, I found that the RLM are effectively present... but not displayed in the database.
There are 9 occurrences:
  • Q1032, GND identifier = "1018704-2<RLM>" (strlen=10 instead of 9)
  • Q7054, GND identifier = "4727207-7<RLM>" (strlen=10 instead of 9)
  • Q42108, GND identifier = "7678885-4‏<RLM>" (strlen=10 instead of 9)
  • Q21165243, GND identifier = "1058485881<RLM>" (strlen=11 instead of 10)
  • Q21638355, GND identifier = "1026070740<RLM>" (strlen=11 instead of 10)
  • Q21823350, GND identifier = "138778485<RLM>" (strlen=10 instead of 9)
  • Q21849794, GND identifier = "135929210<RLM>" (strlen=10 instead of 9)
  • Q22967417, GND identifier = "124887171<RLM>" (strlen=10 instead of 9)
  • Q22920213, GND identifier = "132592746‏<RLM>" (strlen=10 instead of 9)
Can a wikidata admin look at what is wrong there ? I tried to edit these identifiers, but they are displayed correctly in these pages, and the links also work correctly (none of them show the RLM). Removing them prior to adding them again (typing them manually to make sure there's no RLM in a copy-paste operation) does not change the result. I fear that this could affect many other item properties (or translated labels) in Wikidata.
Verdy p (talk) 23:23, 4 March 2016 (UTC)
Verdy p Sorry to say, but most of what you say is utter nonsense. Why don't you just read #STICKY: Explanation of format constraints above before speculating? -- Gymel (talk) 23:15, 4 March 2016 (UTC)
Non-sense ? Look more precisely... (http://tinyurl.com/hnarckw link to SPARQL query) You'll see I'm correct here. Those RLM are there and returned by SPARQL which does not match the expected regexps (unless I add "\u200F?" in the regexps !). Verdy p (talk) 23:25, 4 March 2016 (UTC)
Obviously at 23:15 I did not comment on your contribution from 23:23 but on what you had been writing above. -- Gymel (talk) 22:28, 5 March 2016 (UTC)
I looked at the history of those items and saw that the RLMs were added initially, then removed later (you can see that in diffs). Apparently, SPARQL does not consider the last version of a property, but randomly uses any version found (possibly in a cache, but this cache takes extremely long to expire and get purged from its LRU list...) and stops there. In other words, SPARQL does not reflect the current state of the database (even if it indicates that it has all updates since the last one or two hours: these corrections were made long before). This may also explain why we see old data everywhere when navigating in Wikidata.
Something is wrong in the management of internal cache for WDQS... or in how it retrieves the data (probably a filter of items by their most recent version is missing when looking for properties of an item). This not only affects interactive queries, but also the navigation on the website (old data displayed including on the Wikidata website itself), and also all data extractions (export as RDF, Turtle, etc.). Verdy p (talk) 00:00, 5 March 2016 (UTC)
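
Invisible characters such as the right-to-left mark (U+200F) reported in this thread can be made visible with a few lines of code. This is a sketch under the assumption that filtering on the Unicode general category "Cf" (format characters) is an acceptable way to catch them; the function names are invented for this example.

```python
import unicodedata


def invisible_chars(value):
    """List (position, code point) for every Unicode format character in value."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(value)
        if unicodedata.category(char) == "Cf"
    ]


def strip_invisible(value):
    """Return value without Unicode format characters (category Cf)."""
    return "".join(char for char in value if unicodedata.category(char) != "Cf")


if __name__ == "__main__":
    stored = "1018704-2\u200f"  # value reported above for Q1032, with trailing RLM
    print(len(stored), invisible_chars(stored), strip_invisible(stored))
```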

Unique value constraint and gender specific values

Hi! @Gymel , @Kolja21 There are a lot of gender specific pairs:

What are the impacts of this:

a) for Wikidata
b) for the authority control templates using one parameter value only

related impacts

1) Is there a symmetrical counterpart of field of this occupation (P425) ?
2) Are part of (P361) and has as part (P527) to be used at pages as philology (Q40634) ? see: DNB search: Philologie

לערי ריינהארט (talk) 07:41, 12 March 2014 (UTC)

First of all, even when there is a process of identification between wikidata entries, wikipedia articles and GND records, there is no necessity to transport the relations between these objects between the different systems.
Second: Your examples illustrate the issue that it has probably not been very wise to extend the Normdaten/Authority-control templates from persons to concepts: For the latter, in many cases it would have been more appropriate to identify Wikipedia categories (instead of articles) with GND concepts.
More specific: Since the GND as a 'document language' knows about the female terms of most professions and the German Wikipedia does not (uses redirects to the generic masculine) and other Wikipedias also don't (e.g. because their respective languages don't have female forms for many nouns) and Wikidata does not (yet) even cover wikipedia redirect pages there certainly is an "impact" everywhere.
For reasons not clear to me librarians tend to view everything in terms of part-whole relations. Wikidata objects however are instances or classes and their relations are recorded by distinct properties, cf. Help:Basic membership properties. Specifically philology (Q40634) relates by "subclass of" to sub- and superordinate concepts.
For GND persons (and to a lesser extent for corporate bodies) a "field of activity" has traditionally been of interest. How this was modeled in the data however has undergone several changes. IIRC currently the first profession given should be very broad and give an indication of the "field of activity". However the GND makes no attempt to assign a broad "field" to individual professions (but the underlying DDC-like "GND classification" may serve a similar purpose). -- Gymel (talk) 10:16, 12 March 2014 (UTC)
Thanks for the answer! These days I have seen some (broken) {{PLURAL:foo}} wiki code in the help for "aliases". {{GENDER:foo}} might come also.
Personally I think that, as a practical "workaround", one should only import / add male forms of GND authority identifiers. I am not sure whether, besides actor (Q33999), there are more gendered forms in English (except the girl / boy, sister / brother etc. pages). לערי ריינהארט (talk) 13:00, 12 March 2014 (UTC)
Quoting a recent contribution in a librarian's discussion list [3]: "Actor" is almost a unique case. [...] I’ve met females who act who call themselves actresses, and I’ve met those who see the term as demeaning and prefer actor. Also note "actor / actress" as the english label for actor (Q33999). Thus:
  • items are usually neutral or include all genders
  • since there exists a dedicated property sex or gender (P21), all other properties should be chosen "abstracted from gender", i.e. should take a neutral item as value (because of the previous point there is seldom a choice at all) to achieve "orthogonality"
For non-items or "external items" as in authority control it is probably best to adopt that strategy: stick to the generic masculine and completely ignore female forms, since they usually are not an alternative but a specialization / "narrower term" and therefore, as a match, not as close as the male form. -- Gymel (talk) 22:33, 12 March 2014 (UTC)
Hmm - I agree that it makes sense to use gender-agnostic items in Wikidata. However, GND took another course. They definitely do not use the female form as a specialization, but as an alternative - so on their side no "generic masculine" exists. Ignoring this has negative consequences for applications that use the mapping: Using Wikidata as the starting point and searching for persons described by a "male" GND profession, the application will miss the females. Using GND as the starting point, no mapping exists for the female form.
So what exactly are the problems in using an approach which maps a Wikidata item to two alternative GND targets, qualified by gender? (example in social scientist (Q15319501)) Jneubert (talk) 18:30, 13 December 2015 (UTC)
Jneubert Quite an old thread you have been commenting on... I think for GND one has to regard several sub-applications:
  1. "Als Homonymenzusatz bei Personenschlagwörtern zugelassen" (permitted as a homonym qualifier in personal-name subject headings) means that the female form as a text fragment is permissible in constructing headings
  2. As subject heading when cataloging objects: Appropriate if and only if sex or gender aspects of specifically woman scientists are in the focus of the publication. This is clearly much narrower in scope than the male term (which is also to be applied if sex or gender questions are not involved)
  3. As indication of profession for other GND records (i.e. persons): I just checked, and the profession data element indeed uses the "female" profession for female persons. How one will ever be able to select all social scientists with that approach unfortunately eludes me: The reciprocal relation between the two professions is quite unspecific.
So by listing both GND numbers in social scientist (Q15319501) we equate the two for our purposes (which is correct in a sense) since P227 indicates a 1:1 correspondence between Wikidata item and target record. On the other hand we can argue that there is a distinction between the two concepts and Wikidata is just unable (or unwilling) to represent one of them. Or perhaps the Wikidata item represents the union of the two and therefore none of them is appropriate for P227 of our item? -- Gymel (talk) 00:25, 14 December 2015 (UTC)
Note: social scientist (Q15319501) is for male and female like "Sozialwissenschaftler", GND 4140123-2 is used for males and females. Proof: Kristen Kreider, weiblich, Sozialwissenschaftler (GND 1068218916). --Kolja21 (talk) 13:51, 17 December 2015 (UTC)

constraint report relating to Wikimedia disambiguation page (Q4167410)

If instance of (P31) is a Wikimedia disambiguation page (Q4167410) the presence of GND ID (P227) is prohibited

See: Wikimedia disambiguation page (Q4167410) with GND identifier (P227). Usually there are several possibilities:
a) Please identify the non-disambiguation page (WD item) to which the property GND identifier should be moved;
"normally" no other statements should be left at the disambiguation page.
It can happen that one set of properties should be moved to another (a second) WD item, another set to a third WD item, etc.
b) (recommended method) Verify which language page is a disambiguation page and separate it from the rest. After the disambiguation page is separated, please use Gadget-labelLister.js (which can be activated at preferences#gadgets) to remove all faulty (disambiguation) descriptions: verify the descriptions for the following languages: de, en, fr, es, pt, pt-br, ru, sv, which were added by bots a long time ago, and take a short look at the other language descriptions.
c) If method b) cannot be used because all linked WMF-project language pages are (local) disambiguation pages, please create a new WD item and add a proper description in your native language and in English.

Thanks in advance! gangLeri לערי ריינהארט (talk) 13:29, 27 May 2014 (UTC)
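
The conflict rule stated at the top of this section can be sketched as a small check. This is only an illustration: the `claims` dictionary shape and the function name are hypothetical (not a Wikidata API), the class list is the one given in the constraint documentation for this property, and known exceptions are not modelled.

```python
# Classes that must not carry a GND ID (P227), per the "Conflicts with"
# constraint: disambiguation page, category, list article.
CONFLICTING_CLASSES = {"Q4167410", "Q4167836", "Q13406463"}


def violates_conflict(claims):
    """Return True if an item has P227 and a conflicting P31 value.

    `claims` is a hypothetical plain dict mapping property IDs to lists
    of values, e.g. {"P31": ["Q4167410"], "P227": ["4079154-3"]}.
    """
    has_gnd = bool(claims.get("P227"))
    classes = set(claims.get("P31", []))
    return has_gnd and bool(classes & CONFLICTING_CLASSES)
```

An item flagged this way would then be handled by one of the methods a) to c) above.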

Three items seem to persist at Wikidata:Database reports/Constraint violations/P227#.22Conflicts_with.22_violations and IMHO cannot be resolved here without at least some intervention in individual wikipedias:
  1. treaties of the European Union (Q11122): some Wikipedias seem to understand this as the treaty of Maastricht with its later amendments (treaty of Lisbon, ...), some others however also include the treaty of Rome (with its companion treaties) under this heading and therefore tend to declare themselves as disambiguation pages.
  2. asymmetry (Q752641): disambiguation property declared by french wikipedia by an article already narrowed to "geometrical" aspects. Comparing the extent of the articles in the other wikipedias there is not much common ground anyway.
  3. List of wars involving Israel (Q623900): Although German wikipedia enumerates the individual conflicts I would not consider it a typical list article. Separation from the other wikipedias however would not be an improvement I think. -- Gymel (talk) 08:26, 4 June 2014 (UTC)

Sorry, too late (I didn't read this talk page in the morning), these three cases have already been resolved. List of wars involving Israel (Q623900) = is a list of (P360) Arab–Israeli wars (Q17126147). The German article looks more like a list but the Arabic WP has a "real" article about this topic. --Kolja21 (talk) 00:59, 5 June 2014 (UTC)

What's type n?

Hi,
I wanted to add a GND identifier to an item (actually I did, see Gunnar Wöbke (Q1554803)), but the description here says "(please don't use type n = name, disambiguation)". This confuses me, what is meant with "type n = name, disambiguation"? From the examples it looks like type n is where it says in the RDF representation: "<rdf:type rdf:resource="http://d-nb.info/standards/elementset/gnd#UndifferentiatedPerson"/>"

For "real" persons it would say "DifferentiatedPerson" there, is that meant with not-type n? --Bthfan (talk) 22:11, 7 July 2014 (UTC)

@Bthfan: Exactly. The record stands for an unknown number of individuals known by that name and therefore cannot be used to identify a single person. -- Gymel (talk) 08:09, 10 July 2014 (UTC)
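
A minimal sketch of the check discussed above, assuming the record is available as RDF/XML and carries an `rdf:type` element like the one quoted by Bthfan. The helper names and the sample document (including the placeholder `EXAMPLE` identifier) are made up for illustration; no request to d-nb.info is made.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GND_NS = "http://d-nb.info/standards/elementset/gnd#"

# Hypothetical, heavily shortened record; only the rdf:type line follows
# the form quoted in the question above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://d-nb.info/gnd/EXAMPLE">
    <rdf:type rdf:resource="http://d-nb.info/standards/elementset/gnd#UndifferentiatedPerson"/>
  </rdf:Description>
</rdf:RDF>"""


def gnd_types(rdf_xml):
    """Collect the local names of all GND rdf:type values in the document."""
    root = ET.fromstring(rdf_xml)
    found = set()
    for element in root.iter("{%s}type" % RDF_NS):
        resource = element.get("{%s}resource" % RDF_NS, "")
        if resource.startswith(GND_NS):
            found.add(resource[len(GND_NS):])
    return found


def is_undifferentiated(rdf_xml):
    """True for "type n" records, which should not be used as P227 values."""
    return "UndifferentiatedPerson" in gnd_types(rdf_xml)
```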

Isn't type n appropriate for Wikimedia disambiguation pages?

I quite agree that differentiated GNDs must not be used on "Wikimedia disambiguation page (Q4167410), Wikimedia category page (Q4167836), Wikimedia list article (Q13406463)".

However, undifferentiated GNDs (type=n) quite closely match Wikimedia disambiguation pages. Does the above prohibition mean that we DON'T want such GNDs in wikidata at all? --Vladimir Alexiev (talk) 07:53, 19 December 2014 (UTC)

de:Peter Müller would be an example: The disambiguation page lists some Peter-Paul and Peter Erasmus Müller who perhaps actually never were called "Peter" (the undifferentiated record would never tell; since "Peter Erasmus" is definitely not a form of reference for all Peter Müllers, it is or should be prohibited). The set of persons in the disambiguation page may have a nonempty intersection with those meant by an authority record, but usually is neither a subset nor a superset: We may know about some Peter Müllers they don't know about (and perhaps never will) and vice versa. Furthermore, the disambiguation page lists the individuals we know about individually (with the exception of redlinks perhaps), whereas the undifferentiated name record stands for the complementary set of (names of) persons the libraries do not know individually (at the moment), so usually you cannot navigate to (library items already assigned to) individual authority records by accessing the undifferentiated record. Thus I would think both concepts have in common that they somehow deal with a group of people with common formal characteristics (their name), but the purpose and definition for this clustering generally do not have much in common. -- Gymel (talk) 09:46, 19 December 2014 (UTC)
@Vladimir Alexiev: Yes, we don't want such GNDs in Wikidata at all. Tn's are temporary numbers. They are placeholders and will be deleted or (this was common till 2012, now with so many libraries and archives taking part in the project it's rare) individualized and turned into a Tp. --Kolja21 (talk) 11:21, 19 December 2014 (UTC)

P31:Q5 and Type N

Hello everyone,

at the request of Kolja21 my bot fetched a list of (to begin with, a few) entries with instance of (P31) human (Q5) and Type N GND ID (P227) information. You can find this list in the bot's user namespace: User:KasparBot/GND Type N.

Regards, -- T.seppelt (talk) 06:24, 19 September 2015 (UTC)

SSL

Currently the formatter url uses http and redirects to a https connection. Is there a reason not to link the target directly? --- Jura 11:47, 3 December 2015 (UTC)

"Funktioniert nur eingeschränkt. Fehlermeldung Firefox: 'Dieser Verbindung wird nicht vertraut.'" (Works only to a limited extent. Firefox error message: 'This connection is untrusted.') = https causes problems with at least some browsers. It produces error messages. We had a similar discussion @User talk:Pasleim#Property:P1630 and a revert with the reason: "weak certificates". --Kolja21 (talk) 03:04, 5 December 2015 (UTC)
It's different. New formatter URL would be https://portal.dnb.de/opac.htm?method=simpleSearch&cqlMode=true&query=idn%3D$1
It works quite well. --- Jura 06:47, 5 December 2015 (UTC)
I think we should stick to the canonical URLs employing d-nb.info as advertized by "Link auf diesen Datensatz" (link to this record) and not substitute that by some search query for the identifier in the Library catalogue portal.dnb.de, even if that can be performed by https. -- Gymel (talk) 20:21, 6 December 2015 (UTC)
It's not "some search query", but the https-URL preferred by d-nb.info and where it redirects. It just saves users a non-SSL step in the chain. --- Jura 11:23, 7 December 2015 (UTC)
http://d-nb.info/gnd/2072525-5 redirects to https://portal.dnb.de/opac.htm?method=simpleSearch&cqlMode=true&query=idn%3D007223358. So it's a different domain and a different kind of identifier (ILTIS PPN vs. GND Id). Indeed https://portal.dnb.de/opac.htm?method=simpleSearch&cqlMode=true&query=idn%3D2072525-5 also gives the same result, but IMHO this may change any time without notice (may I repeat that http://d-nb.info/gnd/2072525-5 is the only form advertised by the record itself). So if you insist on providing https URLs, resorting to letting the identifiers point to https://archive.org or https://google.com seems a safer bet than trying to outsmart data providers. -- Gymel (talk) 16:44, 7 December 2015 (UTC)
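
For illustration, a small sketch of how the canonical link is derived from an identifier, assuming the "Allowed values" pattern and the formatter URL from this property's documentation. The function name is hypothetical and nothing is fetched; the sketch only validates the identifier and substitutes it for `$1`.

```python
import re

# "Allowed values" pattern as documented for P227.
GND_PATTERN = re.compile(
    r"(1|1[01])\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X]"
)

# Formatter URL as documented for P227; "$1" stands for the identifier.
FORMATTER_URL = "http://d-nb.info/gnd/$1"


def gnd_url(gnd_id):
    """Return the canonical d-nb.info link for a well-formed GND ID."""
    if not GND_PATTERN.fullmatch(gnd_id):
        raise ValueError("not a well-formed GND ID: %r" % gnd_id)
    return FORMATTER_URL.replace("$1", gnd_id)
```

With this pattern the GND Id 2072525-5 is accepted, while the PPN 007223358 from the redirect target is rejected, which matches the point that the two are different kinds of identifier.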
The best thing to do is to contact the "d-nb.info" admins and ask them to set up an equivalent HTTPS server in their domain.
They will still continue to redirect us to another HTTPS URL in another domain, which also uses another identifier.
So we would query "https://d-nb.info/gnd/2072525-5" with exactly the same response as "http://d-nb.info/gnd/2072525-5", but this time the resulting redirect would be secured (and not spoofable).
This could also protect us if we create lots of links from Wikimedia to d-nb.info via the GND identifiers: at least the resulting redirect to another domain would be less suspect (imagine that someone spoofs queries to the existing HTTP server or monitors it for unfair actions, by harvesting some routers...).
Some well-known ISPs also unfairly redirect some well-known unsecured domains, modify the returned contents to include advertising, or redirect their users to an unrelated website.
With HTTPS we would know that we are effectively querying the actual webserver for the domain "d-nb.info" and not a malicious one.
An unfair ISP will not be able to redirect an HTTPS server: as soon as the SSL session starts being negotiated, the ISP cannot spoof the authentic response (using fake/stolen SSL certificates?), it can only:
  • route the unmodified query to the real site and return unmodified results, or
  • block immediately the connection (returning an HTTPS error status), or
  • reroute the HTTPS query to another HTTP or HTTPS domain of his choice (using another certificate for this domain), such as a "parking page" or a "web search helper" page displaying various ads.
Explain this risk of website spoofing to the d-nb.info admins: maybe they will also set up HTTPS on their existing website, and it will be the preferred URL we will use here. All websites with a large audience and contents related to lots of topics should now use SSL (see the campaign "HTTPS everywhere" that Wikimedia also strongly supports, notably since the "Snowden revelations" about NSA activities, including large-scale spoofing/monitoring of various well-known websites). This should be the case for such reputed knowledge authorities (spoofing its website could be profitable and damaging to people, if this is used to get and verify details about someone listed in GND). Verdy p (talk) 20:19, 4 March 2016 (UTC)
The URL mentioned by Gymel would probably work best as an intermediary solution.
--- Jura 06:35, 5 March 2016 (UTC)
Actually I suggested using Google for looking up GND identifiers if using https connections has top priority over all other concerns. -- Gymel (talk) 06:59, 5 March 2016 (UTC)
Sorry, changed it to "mentioned by Gymel".
--- Jura 07:01, 5 March 2016 (UTC)

Identical GND ID

New report by KrBot: Wikidata:Database reports/Identical GND ID. --Kolja21 (talk) 17:08, 7 December 2015 (UTC)

portal.dnb.de gives an error

Right now the formatter URL gives:

Leider ist ein Fehler aufgetreten.
Unable to invoke request

@Gymel: Can we fix it? -- Sergey kudryavtsev (talk) 07:20, 15 May 2016 (UTC)

Sergey kudryavtsev No, their catalogue (including other services) seems to have been completely offline since some time last night. Let's just hope they'll notice it before Tuesday (tomorrow is a holiday in Germany). -- Gymel (talk) 08:35, 15 May 2016 (UTC)

@Gymel: portal.dnb.de is working now! -- Sergey kudryavtsev (talk) 13:24, 15 May 2016 (UTC)

@Sergey kudryavtsev, Gymel: The server is working, but some of the new GNDs added this weekend are missing.
Example: Jamala (Q2662517), GND 1100357025 "404 NOT FOUND" - PICA works fine
BTW: Monday is a national holiday, correct, but that's no reason to work on Tuesday ;) Tuesday is a local holiday called Wäldchestag (Q1605844) (Day of the forest). --Kolja21 (talk) 19:54, 15 May 2016 (UTC)
@Kolja21: Oh, Eurovision contest... ;-) -- Sergey kudryavtsev (talk) 06:42, 17 May 2016 (UTC)
PS: DNB already knows that Jamala (Q2662517) «gewann für die Ukraine den Eurovision Song Contest 2016 in Stockholm» (won the Eurovision Song Contest 2016 in Stockholm for Ukraine). -- Sergey kudryavtsev (talk) 06:50, 17 May 2016 (UTC)
@Sergey kudryavtsev: How come User:Kolja21 could know about the existence of 1100357025 when it was neither queryable nor visible? Hint: de:Portal:Bibliothek, Information, Dokumentation/Normdaten/GND-Kooperation ;-) -- Gymel (talk) 18:35, 17 May 2016 (UTC)

✓ Done Server is up again and running without errors. All new GNDs are online. --Kolja21 (talk) 11:41, 16 May 2016 (UTC)