Property talk:P356



serial code used to uniquely identify digital objects like academic papers
Description DOI (Digital object identifier) reference for scientific publication
Represents Digital Object Identifier (Q25670)
Data type External identifier
Template parameter en:Template:Cite_journal : |doi=
Domain scientific publication (note: this should be moved to the property statements)
Allowed values 10\.\d{4,9}/[A-Z0-9_\W]+ (Syntax described at does not specify case. Uppercase recommended.)
/^10.\d{4,9}/[-._;()/:A-Z0-9]+$/i (The regular expression syntax described for "modern Crossref DOIs" by Andrew Gilmartin, a member of the U.S. Crossref team. Matches 74.4 million of the 74.9 million DOIs in Crossref.)
/^10.1002\/[^\s]+$/i (Syntax described by Andrew Gilmartin (member of the U.S. Crossref team) for early DOIs (catches approximately 300,000 more DOIs than the "modern Crossref DOI" regular expression). Escape character added to avoid malformed input error.)
Example Ecological guild evolution and the discovery of the world's smallest vertebrate. (Q15567682) → 10.1371/journal.pone.0029797 (RDF)
Formatter URL$1
See also handle (P1184), DOI prefix (P1662)
Proposal discussion Originally created without a formal discussion
Current uses 11,398,175


Format “10\.[0-9]{4,}(?:\.[0-9]+)*\/(?:(?![\"&\'\]])\S)+”: value must be formatted using this pattern (PCRE syntax).
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#Format, SPARQL
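As a rough illustration of how the format constraint above behaves, the sketch below applies the same pattern with Python's `re` module rather than PCRE. The function name `matches_format_constraint` is made up for this example, and requiring the whole value to match (via `fullmatch`) is an assumption about how the constraint is evaluated.

```python
import re

# Pattern copied from the format constraint above. Python's re module
# accepts the same syntax for this particular expression.
DOI_FORMAT = re.compile(r"10\.[0-9]{4,}(?:\.[0-9]+)*\/(?:(?![\"&\'\]])\S)+")


def matches_format_constraint(value: str) -> bool:
    """Return True if the entire value matches the format pattern."""
    return DOI_FORMAT.fullmatch(value) is not None


if __name__ == "__main__":
    # The first value is the example from the property documentation;
    # the other two are invented to show rejections.
    for candidate in ["10.1371/journal.pone.0029797", "10.123/abc", '10.1000/ab"c']:
        print(candidate, matches_format_constraint(candidate))
```

The second value is rejected because the registrant code has fewer than four digits, and the third because the suffix contains a double quote, which the pattern excludes.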
Single value: this property generally contains a single value.
Exceptions are possible as rare values may exist. Known exceptions: Dinosaur with a heart of stone (Q28142656)
List of this constraint violations: Database reports/Constraint violations/P356#single value, SPARQL
Distinct values: this property likely contains a value that is different from all other items.
Exceptions are possible as rare values may exist.
List of this constraint violations: Database reports/Constraint violations/P356#distinct values, SPARQL (every item), SPARQL (by value)
Qualifiers “stated as (P1932)”: this property should be used only with the listed qualifiers.
List of this constraint violations: Database reports/Constraint violations/P356#Allowed qualifiers, hourly updated report, SPARQL
Conflicts with “instance of (P31): Wikimedia template (Q11266439)”: this property must not be used with the listed properties and values.
List of this constraint violations: Database reports/Constraint violations/P356#Conflicts with P31, hourly updated report, SPARQL
Pattern ^10\.5281/zenodo\.268334$ will be automatically replaced with 10.5281/zenodo.844869.
Testing: TODO list


@Laddo: Can you take another look at the format constraint? All of the statements still violate it. -Tobias1984 (talk) 10:26, 25 August 2014 (UTC)

@Tobias1984: Seems that I broke it last April! Let's see like that... LaddΩ chat ;) 22:06, 25 August 2014 (UTC)
@Laddo: Works again! Thanks a lot! Tobias1984 (talk) 18:49, 26 August 2014 (UTC)

Allowed values

I just changed the format constraint to require at least two characters for the suffix, since such examples do exist. The DOI Handbook speaks of "a character string of any length chosen by the registrant", but I have not yet seen a suffix of fewer than two characters. --Daniel Mietchen (talk) 16:13, 16 August 2016 (UTC)

Canonicalizing DOIs

Officially, the DOI is a case-insensitive format; 10.1000/abc and 10.1000/ABC refer to the same thing. This is problematic for Wikidata, however, since Wikidata and the Wikidata Query Service would consider those two things to be separate. This is why in the constraint violation report, most of the "single value" violations are just entries that have the same DOI twice, but with different capitalizations. To make things more consistent, I propose:

  1. All letters in DOIs should be lowercase.
  2. This should be enforced by a bot.
  3. Tools that work with DOIs should convert the letters in DOIs into lowercase upon input and output.

Standardizing on this makes DOI retrieval easier; we don't have to wonder if a DOI will be likethis or LikeThis. Thoughts? Harej (talk) 00:44, 15 January 2017 (UTC)

I agree entirely. A lowercase requirement should also be added to the format as a regular expression (P1793) statement and to the property proposal template above. --Daniel Mietchen (talk) 23:19, 15 January 2017 (UTC)
In general I agree, however I want to point out that according to the DOI Handbook "All DOI names are converted to upper case upon registration, which is a common practice for making any kind of service case insensitive.". The upper case formatting is also the common way of displaying DOIs as used by DataCite and their tools (e.g. cirneco). So I think we should follow those common practices and make rule 1: "All letters in DOIs should be uppercase." and rule 3: "Tools that work with DOIs should convert the letters in DOIs into uppercase upon input and output.". I hope this helps. --Frog23 (talk) 08:24, 16 January 2017 (UTC)
+1 Snipre (talk) 08:58, 16 January 2017 (UTC)
Converting to a canonical format is something that I can support. I read "All DOI names are converted to upper case upon registration" pointed to by Frog23, section 2.4 Case sensitivity. On the other hand I see Elsevier, Wiley and Science (e.g., [1] and [2], [3]) using lowercase. So is it better to use lowercase? Copy-paste would be easier. Note that case insensitivity applies only to the ASCII characters [a-z]. Non-ASCII case distinctions should still be possible. — Finn Årup Nielsen (fnielsen) (talk) 11:11, 16 January 2017 (UTC)
That is what the DOI handbook says, Frog23, but I haven't seen many all-caps DOIs in practice; it's been either lowercase or camelcase. I am fine with all-uppercase if that's what everyone else agrees to. Harej (talk) 15:46, 16 January 2017 (UTC)
It seems that @Magnus Manske:'s sourcemd is using uppercase. — Finn Årup Nielsen (fnielsen) (talk) 20:14, 16 January 2017 (UTC)
And yet Crossref seem to normalize to lowercase. (Also, as Daniel Mietchen pointed out on Twitter, URLs in general are normalized to lowercase.) Harej (talk) 20:40, 16 January 2017 (UTC)
Can confirm that SourceMD converts to uppercase. Best that I can tell, most journal article items on Wikidata come from SourceMD. Between that, the DOI specification, and the recommendation of DataCite, I am leaning toward normalizing with uppercase letters. Harej (talk) 21:26, 16 January 2017 (UTC)
I changed SourceMD to uppercase after reading this thread, forgot to mention it here. --Magnus Manske (talk) 09:52, 17 January 2017 (UTC)

If there is no further discussion over the next few days, I will go ahead with standardizing around uppercase letters. Harej (talk) 05:26, 18 January 2017 (UTC)
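A minimal sketch of what uppercase normalization could look like, assuming (as noted above) that only ASCII letters are case-insensitive. The helper names `normalize_doi` and `group_case_variants` are hypothetical and do not describe how SourceMD or any bot actually does it.

```python
import string
from collections import defaultdict

# Map only ASCII a-z to A-Z. str.upper() would also change non-ASCII
# letters, which should stay case-sensitive.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def normalize_doi(doi: str) -> str:
    """Strip surrounding whitespace and uppercase the ASCII letters."""
    return doi.strip().translate(_ASCII_UPPER)


def group_case_variants(dois):
    """Return {normalized DOI: sorted variants} for DOIs that appear
    in more than one spelling (differing only in ASCII letter case
    or surrounding whitespace)."""
    groups = defaultdict(set)
    for doi in dois:
        groups[normalize_doi(doi)].add(doi)
    return {
        canonical: sorted(variants)
        for canonical, variants in groups.items()
        if len(variants) > 1
    }
```

Running `group_case_variants` over the values of one item would flag the kind of "single value" violation described at the top of this section, where the same DOI is stored twice with different capitalization.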

Harej, I changed my bot to make DOIs uppercase. Gstupp (talk) 01:44, 29 January 2017 (UTC)
@Harej, Gstupp: Please wait a moment. For a possible use of Wikidata items as source for {{cite journal}} (or equivalents in other languages) it would be perfect if the DOIs were not changed from the form on the publisher's page. Your effort to normalize the DOIs according to the specification is praiseworthy, but for backwards compatibility I think we should stick to the form used by the publisher, even if it's formally wrong.--Kopiersperre (talk) 17:49, 20 February 2017 (UTC)
Both and redirect searches for 10.1002/ASI.23162 from the uppercase to the lowercase version. Are there any valid examples where this redirection does not happen? LeadSongDog (talk) 18:43, 14 June 2017 (UTC)

Adding DOIs for institutions from GRID

I am going to import FundRef identifiers, stated as DOIs with the appropriate DOI prefix (10.13039) from the GRID dataset. All items that have a GRID ID (P2427) and no DOI (P356) will receive a DOI (P356) if there is a unique FundRef id for that GRID id in the latest dump. The statements will have a reference, which will be the DOI of the dataset they come from. Let me know if you have any concerns. − Pintoch (talk) 19:49, 1 February 2017 (UTC)
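The selection rule described above could be sketched roughly as follows. This is not the actual import code: the function name, the dictionary keys (`qid`, `grid_id`, `doi`) and the shape of the GRID lookup are all assumptions made for illustration, and the identifiers in the usage example are invented.

```python
FUNDREF_DOI_PREFIX = "10.13039"  # DOI prefix mentioned above


def plan_doi_statements(items, fundref_ids_by_grid):
    """Return {item id: DOI} for items that should receive a DOI.

    items: list of dicts with hypothetical keys "qid", "grid_id", "doi".
    fundref_ids_by_grid: dict mapping a GRID ID to a list of FundRef ids.
    """
    planned = {}
    for item in items:
        grid_id = item.get("grid_id")
        # Only items that have a GRID ID and no DOI yet are candidates.
        if grid_id is None or item.get("doi"):
            continue
        fundref_ids = set(fundref_ids_by_grid.get(grid_id, []))
        # Skip GRID IDs with no FundRef id or with an ambiguous mapping.
        if len(fundref_ids) != 1:
            continue
        (fundref_id,) = fundref_ids
        planned[item["qid"]] = f"{FUNDREF_DOI_PREFIX}/{fundref_id}"
    return planned


if __name__ == "__main__":
    # Invented identifiers, for illustration only.
    items = [
        {"qid": "Q1", "grid_id": "grid.1", "doi": None},
        {"qid": "Q2", "grid_id": "grid.2", "doi": "10.13039/999"},
        {"qid": "Q3", "grid_id": "grid.3", "doi": None},
    ]
    lookup = {"grid.1": ["111"], "grid.2": ["999"], "grid.3": ["333", "334"]}
    print(plan_doi_statements(items, lookup))
```

In the usage example only the first item qualifies: the second already has a DOI and the third maps to two FundRef ids.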

Pintoch, I am not sure I follow. Are these DOIs for organizations? Aren't they typically assigned to documents? Harej (talk) 00:53, 2 February 2017 (UTC)
DOIs can be assigned to many sorts of things, including institutions. Here is an example: is the FundRef DOI for Khon Kaen University (Q368329). − Pintoch (talk) 08:07, 2 February 2017 (UTC)

Fixing a DOI in many references

Is it okay to use {{Autofix}} to change the DOI in a widely used reference? − Pintoch (talk) 09:02, 26 August 2017 (UTC)

I don't think it should be done in general, but in this specific case you already replaced everything imported from one publication with that of another publication and the first publication was withdrawn. So effectively all references point to the wrong publication.
--- Jura 11:09, 26 August 2017 (UTC)
Yes, many apologies for that. I can also change the DOIs myself if that is better. − Pintoch (talk) 11:23, 26 August 2017 (UTC)


The EIDR P2704 resolver is not a general DOI resolver, e.g., fails, but in their own 10.5240 registry works. Please remove EIDR from the P356 formatter URLs. – 04:15, 18 September 2017 (UTC)

{{Edit request}} 21:56, 30 September 2017 (UTC)
It's currently not active. The regex should limit the scope.
--- Jura 17:08, 30 January 2018 (UTC)