Property talk:P356

From Wikidata


serial code used to uniquely identify digital objects like academic papers
Description: DOI (Digital object identifier) reference for scientific publication
Represents: digital object identifier (Q25670)
Data type: External identifier
Template parameter: en:Template:Cite_journal : |doi=
Domain: scientific publication (note: this should be moved to the property statements)
Allowed values: 10\.\d{4,9}/[A-Z0-9_\W]+ (Syntax described at does not specify case. Uppercase recommended.)
/^10.\d{4,9}/[-._;()/:A-Z0-9]+$/i (The regular expression syntax described for "modern Crossref DOIs" by Andrew Gilmartin, a member of the U.S. Crossref team. Matches 74.4 million of the 74.9 million DOIs in Crossref.)
/^10.1002\/[^\s]+$/i (Syntax described by Andrew Gilmartin (member of the U.S. Crossref team) for early DOIs (catches approximately 300,000 more DOIs than the "modern Crossref DOI" regular expression). Escape character added to avoid malformed input error.)
Example: Ecological guild evolution and the discovery of the world's smallest vertebrate. (Q15567682) → 10.1371/JOURNAL.PONE.0029797 (RDF)
Formatter URL: $1
See also: handle (P1184), DOI prefix (P1662), EIDR identifier (P2704)
Proposal discussion: [not applicable Proposal discussion]
Current uses: 17,495,411 out of 190,000,000 (9% complete)
Explanations
Format “10\.[0-9]{4,}(?:\.[0-9]+)*\/(?:(?![\"&\'])\S)+”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Format, SPARQL, SPARQL (new)
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Known exceptions: Dinosaur with a heart of stone (Q28142656)
List of this constraint violations: Database reports/Constraint violations/P356#Single value, SPARQL, SPARQL (new)
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Unique value, SPARQL (every item), SPARQL (by value), SPARQL (new)
Qualifiers “stated as (P1932), reason for deprecation (P2241), access status (P6954)”: this property should be used only with the listed qualifiers. (Help)
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Allowed qualifiers, SPARQL, SPARQL (new)
Conflicts with “instance of (P31): Wikimedia template (Q11266439)”: this property must not be used with the listed properties and values. (Help)
List of this constraint violations: Database reports/Constraint violations/P356#Conflicts with P31, hourly updated report, search, SPARQL, SPARQL (new)
Conflicts with “occupation (P106)”: this property must not be used with the listed properties and values. (Help)
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Conflicts with P106, search, SPARQL, SPARQL (new)
Conflicts with “subclass of (P279)”: this property must not be used with the listed properties and values. (Help)
Exceptions are possible as rare values may exist. Known exceptions: IEEE 754-2008: IEEE Standard for Floating-Point Arithmetic (Q951059), IEEE 754-1985: IEEE Standard for Binary Floating-Point Arithmetic (Q14954905)
List of this constraint violations: Database reports/Constraint violations/P356#Conflicts with P279, search, SPARQL, SPARQL (new)
Format “(?i)((?!\b(%)).)*”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Format, SPARQL, SPARQL (new)
Pattern ^10\.5281/zenodo\.268334$ will be automatically replaced to 10.5281/zenodo.844869.
Testing: TODO list
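
As an illustration of the first Format constraint listed above, here is a minimal Python sketch that checks a value against that pattern. It assumes the constraint is applied to the whole value (hence `fullmatch`); the function name `matches_format` and the failing sample values are made up for this example and are not part of any Wikidata tool.

```python
import re

# First format constraint quoted in the property documentation (PCRE syntax).
# Python's `re` module accepts this pattern unchanged.
DOI_FORMAT = re.compile(r"10\.[0-9]{4,}(?:\.[0-9]+)*\/(?:(?![\"&\'])\S)+")


def matches_format(value: str) -> bool:
    """Return True when the whole value matches the format constraint."""
    return DOI_FORMAT.fullmatch(value) is not None


if __name__ == "__main__":
    samples = [
        "10.1371/JOURNAL.PONE.0029797",  # example value from the documentation
        "10.1000/a&b",                   # hypothetical: '&' is excluded by the pattern
        "10.1000/a b",                   # hypothetical: whitespace is excluded
    ]
    for sample in samples:
        print(sample, matches_format(sample))
```

The lookahead `(?![\"&\'])` rejects double quotes, ampersands and single quotes in the suffix, and `\S` rejects whitespace; everything else after the slash is accepted.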


@Laddo: Can you take another look at the format constraint? All of the statements still violate it. -Tobias1984 (talk) 10:26, 25 August 2014 (UTC)

@Tobias1984: Seems that I broke it last April! Let's see like that... LaddΩ chat ;) 22:06, 25 August 2014 (UTC)
@Laddo: Works again! Thanks a lot! Tobias1984 (talk) 18:49, 26 August 2014 (UTC)

Allowed values

I just changed the format constraint to just two characters for the suffix, since such examples do exist. The DOI Handbook speaks of "a character string of any length chosen by the registrant", but I have not yet seen a suffix of less than two characters. --Daniel Mietchen (talk) 16:13, 16 August 2016 (UTC)

Canonicalizing DOIs

Officially, the DOI is a case-insensitive format; 10.1000/abc and 10.1000/ABC refer to the same thing. This is problematic for Wikidata, however, since Wikidata and the Wikidata Query Service would consider those two things to be separate. This is why in the constraint violation report, most of the "single value" violations are just entries that have the same DOI twice, but with different capitalizations. To make things more consistent, I propose:

  1. All letters in DOIs should be lowercase.
  2. This should be enforced by a bot.
  3. Tools that work with DOIs should convert the letters in DOIs into lowercase upon input and output.

By standardizing around this, it makes DOI retrieval easier; we don't have to wonder if a DOI will be likethis or LikeThis. Thoughts? Harej (talk) 00:44, 15 January 2017 (UTC)

I agree entirely. A lowercase requirement should also be added to the format as a regular expression (P1793) statement and to the property proposal template above. --Daniel Mietchen (talk) 23:19, 15 January 2017 (UTC)
In general I agree, however I want to point out that according to the DOI Handbook "All DOI names are converted to upper case upon registration, which is a common practice for making any kind of service case insensitive.". The upper case formatting is also the common way of displaying DOIs as used by DataCite and their tools (e.g. cirneco). So I think we should follow those common practices and make rule 1: "All letters in DOIs should be uppercase." and rule 3: "Tools that work with DOIs should convert the letters in DOIs into uppercase upon input and output.". I hope this helps. --Frog23 (talk) 08:24, 16 January 2017 (UTC)
+1 Snipre (talk) 08:58, 16 January 2017 (UTC)
Converting to a canonical format is something that I can support. I read "All DOI names are converted to upper case upon registration" pointed to by Frog23, section 2.4 Case sensitivity. On the other hand I see Elsevier, Wiley and Science (e.g., [1] and [2], [3]) using lowercase. So is it better to use lowercase? Copy-paste would be easier. Note that case insensitivity applies only to the ASCII characters [a-z]. Non-ASCII case distinctions should still be possible. -- Finn Årup Nielsen (fnielsen) (talk) 11:11, 16 January 2017 (UTC)
That is what the DOI handbook says, Frog23, but I haven't seen many all-caps DOIs in practice; it's been either lowercase or camelcase. I am fine with all-uppercase if that's what everyone else agrees to. Harej (talk) 15:46, 16 January 2017 (UTC)
It seems that @Magnus Manske:'s sourcemd is using uppercase. — Finn Årup Nielsen (fnielsen) (talk) 20:14, 16 January 2017 (UTC)
And yet Crossref seem to normalize to lowercase. (Also, as Daniel Mietchen pointed out on Twitter, URLs in general are normalized to lowercase.) Harej (talk) 20:40, 16 January 2017 (UTC)
Can confirm that SourceMD converts to uppercase. Best that I can tell, most journal article items on Wikidata come from SourceMD. Between that, the DOI specification, and the recommendation of DataCite, I am leaning toward normalizing with uppercase letters. Harej (talk) 21:26, 16 January 2017 (UTC)
I changed SourceMD to uppercase after reading this thread, forgot to mention it here. --Magnus Manske (talk) 09:52, 17 January 2017 (UTC)

If there is no further discussion over the next few days, I will go ahead with standardizing around uppercase letters. Harej (talk) 05:26, 18 January 2017 (UTC)

Harej I changed my bot to make DOI uppercase Gstupp (talk) 01:44, 29 January 2017 (UTC)
@Harej, Gstupp: Please wait a moment. For a possible use of Wikidata items as a source for {{cite journal}} (or equivalents in other languages), it would be best if the DOIs were not changed from the form on the publisher's page. Your effort to normalize the DOIs according to the specification is praiseworthy, but for backwards compatibility I think we should stick to the form used by the publisher, even if it's formally wrong.--Kopiersperre (talk) 17:49, 20 February 2017 (UTC)
Both and redirect searches for 10.1002/ASI.23162 from the uppercase to the lowercase version. Are there any valid examples where this redirection does not happen? LeadSongDog (talk) 18:43, 14 June 2017 (UTC)

The two DOI registration agencies Crossref and DataCite updated their DOI display guidelines in 2017 [4] and [5]. There is no requirement to display DOIs in uppercase or lowercase, but the common practice is increasingly to use lowercase.
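
For anyone scripting the uppercase normalization discussed in this section, here is a minimal Python sketch. It uppercases only the ASCII letters a-z, following the note above that case insensitivity applies only to those characters, and it groups spellings that differ only by case, which is what most of the "single value" violations were said to be. The function names are hypothetical and this is not how SourceMD or any bot is known to implement it.

```python
from collections import defaultdict


def normalize_doi(doi: str) -> str:
    """Uppercase ASCII letters a-z only; leave every other character unchanged."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in doi
    )


def group_case_variants(dois):
    """Map each normalized DOI to the distinct spellings seen for it.

    Only DOIs that appear with more than one spelling are returned.
    """
    groups = defaultdict(set)
    for doi in dois:
        groups[normalize_doi(doi)].add(doi)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


if __name__ == "__main__":
    print(normalize_doi("10.1000/abc"))
    print(group_case_variants(["10.1000/abc", "10.1000/ABC", "10.1000/xyz"]))
```

`str.upper()` is avoided on purpose: it would also change non-ASCII letters, which the discussion above says should stay distinguishable.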

Adding DOIs for institutions from GRID

I am going to import FundRef identifiers, stated as DOIs with the appropriate DOI prefix (10.13039) from the GRID dataset. All items that have a GRID ID (P2427) and no DOI (P356) will receive a DOI (P356) if there is a unique FundRef id for that GRID id in the latest dump. The statements will have a reference, which will be the DOI of the dataset they come from. Let me know if you have any concerns. − Pintoch (talk) 19:49, 1 February 2017 (UTC)

Pintoch, I am not sure I follow. Are these DOIs for organizations? Aren't they typically assigned to documents? Harej (talk) 00:53, 2 February 2017 (UTC)
DOIs can be assigned to many sorts of things, including institutions. Here is an example: is the FundRef DOI for Khon Kaen University (Q368329). − Pintoch (talk) 08:07, 2 February 2017 (UTC)

Fixing a DOI in many references

Is it okay to use {{Autofix}} to change the DOI in a widely used reference? − Pintoch (talk) 09:02, 26 August 2017 (UTC)

I don't think it should be done in general, but in this specific case you already replaced everything imported from one publication with that of another publication and the first publication was withdrawn. So effectively all references point to the wrong publication.
--- Jura 11:09, 26 August 2017 (UTC)
Yes, many apologies for that. I can also change the DOIs myself if that is better. − Pintoch (talk) 11:23, 26 August 2017 (UTC)


The EIDR P2704 resolver is not a general DOI resolver, e.g., fails, but in their own 10.5240 registry works. Please remove EIDR from the P356 formatter URLs. – 04:15, 18 September 2017 (UTC)

{{Edit request}} 21:56, 30 September 2017 (UTC)
It's currently not active. The regex should limit the scope.
--- Jura 17:08, 30 January 2018 (UTC)

Fixed URI

The URI for an RDF record is still "<some record>". You can verify that by running curl --location --header "Accept: text/turtle" | grep 10.1371. The formatter URI for RDF resource (P1921) sets the URI (wdtn:P356) and should link to the URI. This is just like Wikidata, where we use https everywhere but use http for the URI in the RDF. See Property_talk:P1921#Incorrect_URI's for background info. Multichill (talk) 13:38, 8 September 2018 (UTC)

DOI Format error is from original (and it works)

I think there's an applicable discussion on this, but I won't intrude. The DOI 10.1666/PLEO0022-3360(2007)081[0797:BPASAF]2.0.CO;2 manages to work from Bizarre Permian ammonoid subfamily Aulacogastrioceratinae from southeast China (Q57268695). I get a format constraint violation, but I don't know what to do. Trilotat (talk) 00:49, 13 October 2018 (UTC)

unhelpful duplicates

The distinct values constraint is flagging Factors in the prevention of wound dehiscence during pneumatic retinopexy. (Q43570596) and Fluorescein angiograms do not support choroidal ischemia. (Q43570600), because it seems the DOI is based on a page number and these two short "articles" are on the same page. I guess there may be quite a few of these cases, but I don't see that there's anything that can be done about them. They will just clog up the constraint failures list, which would otherwise be useful for finding items that should be merged. Any ideas? Ghouston (talk) 04:33, 12 February 2019 (UTC)

What to do if DOI doesn't exist at

How to address when a DOI is wrong (doesn't exist) and I'm unable to find the right DOI? In this case, PubMED points to it.

Thanks, Trilotat (talk) 15:11, 22 February 2019 (UTC)

I would add the PubMed ID as a source and mark the claim as deprecated. − Pintoch (talk) 15:26, 22 February 2019 (UTC)
@Pintoch: Sorry to be thick-headed, but can you demonstrate at Q48783702? I started to do it, but the only option to mark that deprecated DOI was reason for deprecation (P2241) which appears to require a QID. The PubMED ID also points to a bad DOI, so I'm not sure where to go with this. Merci. Trilotat (talk) 16:05, 22 February 2019 (UTC)
@Trilotat: Done! You might find Help:Ranking useful. Note that there is a difference between marking a claim as deprecated and adding a reason for deprecation as qualifier (it is a good idea to add reason for deprecation (P2241), but that by itself is not going to change the rank of the statement.) The problem with the current claim ranks is that they are not very visible in the interface, see for some discussion about that. − Pintoch (talk) 16:36, 22 February 2019 (UTC)
@Pintoch: Thanks! I think I understand the ranking. I have a question about how to apply it in the special case of retracted paper (Q45182324) if you want to pop over there to take a look. Trilotat (talk) 18:42, 22 February 2019 (UTC)
@Trilotat: I have made the edit that I suggested on Q48783702, what else do you want me to do? I do not have any edit to suggest on retracted paper (Q45182324). − Pintoch (talk) 18:54, 22 February 2019 (UTC)
@Pintoch: Nothing else, thanks. I was just noting that I had a question over at the talk page for Q45182324, where I was wondering if it was necessary to bump up "retracted paper" over "scholarly article" within P31. I don't expect you to answer. I was just sharing that I had that question there since you educated me about ranking. Trilotat (talk) 18:58, 22 February 2019 (UTC)
@Trilotat: ok thanks, I had not realized you were talking about the talk page of that item, sorry. − Pintoch (talk) 19:05, 22 February 2019 (UTC)
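
To make the distinction in this thread concrete (the rank of a statement versus a reason for deprecation (P2241) qualifier), here is a rough Python sketch of what such a statement might look like in Wikibase JSON. The field layout follows the Wikibase JSON format as I understand it; the function name, the sample DOI and the qualifier value `Q_REASON` are placeholders, not real identifiers.

```python
import json


def deprecated_doi_statement(doi: str, reason_qid: str) -> dict:
    """Build a P356 statement with deprecated rank and a P2241 qualifier."""
    return {
        "type": "statement",
        # The rank is a field of the statement itself ...
        "rank": "deprecated",
        "mainsnak": {
            "snaktype": "value",
            "property": "P356",
            "datavalue": {"value": doi, "type": "string"},
        },
        # ... while the reason is a separate qualifier that does not
        # change the rank by being present.
        "qualifiers": {
            "P2241": [
                {
                    "snaktype": "value",
                    "property": "P2241",
                    "datavalue": {
                        "value": {"entity-type": "item", "id": reason_qid},
                        "type": "wikibase-entityid",
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    # "10.1000/EXAMPLE" and "Q_REASON" are placeholders for illustration only.
    statement = deprecated_doi_statement("10.1000/EXAMPLE", "Q_REASON")
    print(json.dumps(statement, indent=2))
```

Setting `"rank"` to `"normal"` while keeping the qualifier would leave the statement undeprecated, which is the pitfall described above.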

How to address a DOI that redirects to a different DOI?

@Pintoch: I understand that articles should normally have only one DOI. I have found that some items have a DOI that redirects to a different DOI, e.g. Q51394575. I marked as deprecated the DOI that redirects to the other DOI.

1. Am I correct to leave the "redirecting" DOI in place, so as to avoid someone adding another version of this same article based on that redirecting DOI?

2. Is deprecation the right way to distinguish? I didn't add a reference to the "redirecting" one since I wasn't sure what to use.

Thanks again. Trilotat (talk) 15:55, 26 February 2019 (UTC)

Hi Trilotat - to me, it looks right, but I have not worked much with publication items: you might want to ask Fnielsen, Daniel Mietchen or Egon_Willighagen who are more knowledgeable on this. (Does the sourcemd tool detect DOIs that are marked as deprecated and avoids creating a new item in that sort of case?) − Pintoch (talk) 17:23, 26 February 2019 (UTC)
Pulling in Magnus Manske... --Egon Willighagen (talk) 19:50, 26 February 2019 (UTC)
Fnielsen, Daniel Mietchen or Egon_Willighagen, I think SourceMD should NOT create a duplicate item if the DOI exists in deprecated form. I've created a proposal on Magnus Manske's BitBucket to resolve the issue at [6]. Do you think I've stated the issue effectively there? Trilotat (talk) 13:09, 1 March 2019 (UTC)
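
One way a tool could check for an existing DOI regardless of rank is sketched below in Python. It builds a SPARQL query that goes through the full statement path (`p:P356` / `ps:P356`) instead of `wdt:P356`, since the `wdt:` prefix only returns best-rank statements and so skips deprecated ones. This is only an illustration with a hypothetical function name; I do not know how SourceMD actually performs its lookup.

```python
def doi_lookup_query(doi: str) -> str:
    """Return a SPARQL query that finds items holding this DOI at any rank."""
    # Escape backslashes and double quotes so the DOI is a valid SPARQL literal.
    literal = doi.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?rank WHERE {\n"
        "  ?item p:P356 ?statement .\n"
        f'  ?statement ps:P356 "{literal}" ;\n'
        "             wikibase:rank ?rank .\n"
        "}"
    )


if __name__ == "__main__":
    print(doi_lookup_query("10.1002/ASI.23162"))
```

A caller could then treat any result, including one whose `?rank` is `wikibase:DeprecatedRank`, as a reason not to create a new item.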

CiteseerX DOI parameters

I see a constraint violation on a recently updated reference I made on outer shell (Q61976836) where you can see the `doi=` parameter in the url for that reference. Is this violation expected? Thadguidry (talk) 23:39, 5 March 2019 (UTC)

I think the DOI is wrong. I followed the link and got an error page. Trilotat (talk) 00:39, 6 March 2019 (UTC)
@Thadguidry: CiteSeerX "dois" are completely different from actual DOIs. It's just an unfortunate use of the same terminology. No CiteSeerX doi should be used with DOI (P356). − Pintoch (talk) 08:57, 6 March 2019 (UTC)
@Pintoch: Ah, thanks, Antonin, I didn't know that. Well, at least now we have this info recorded here to alert others that might come looking like I did. Thadguidry (talk) 14:19, 6 March 2019 (UTC)