Wikidata talk:External identifiers

From Wikidata

Proposal for expansion of properties

Hi All,

I would like to propose an update on the use of statements regarding external identifiers in order to make the future use of these IDs on Wikidata and on Wikipedia easier. Besides expanding the properties for IDs, the goal would be to start a dialogue and create guidelines for creating and describing external IDs.

Some of the proposed properties would be new; others already exist, but their use with external IDs would need to be agreed on.

Many of these could be added to the subject item of the property, but such items often do not exist, and I believe it would be easier for a future Wikipedia template to use these IDs if the ID properties themselves contain all the information.

Existing but not used with IDs

Property | Description | Example | Comments
title (P1476) | Multilingual sites have different names for each language. | (Svenskt kvinnobiografiskt lexikon vs Biographical Dictionary of Swedish Women) | Currently there are a few examples where multilingual titles are defined in Wikidata item of this property (P1629) or in source website for the property (P1896) as qualifier. Doesn't have to be used, just put Labels in more languages. title (P1476) is needed only for "formal" title statements --Vladimir Alexiev (talk) 08:53, 27 November 2019 (UTC)
main subject (P921) | main topic the database covers | USHMM Holocaust Encyclopedia ID (P3724) --> main subject (P921): The Holocaust (Q2763) |
copyright license (P275) | license under which this copyrighted work is released | World Encyclopedia of Puppetry Arts ID (P7012) --> copyright license (P275): Creative Commons Attribution-ShareAlike (Q6905942) |
online access status (P6954) | qualifier for an ID property indicating whether linked content is directly readable online | Values: | Currently has a constraint to be used for references as qualifier
last update (P5017) | date a reference was modified, revised, or updated | World Encyclopedia of Puppetry Arts ID (P7012) --> copyright license (P275): Creative Commons Attribution-ShareAlike (Q6905942) | Currently has a constraint to be used for references as qualifier

To be created or modified

Property | Description | Comments
archive URL formatter | entries archived in the Internet Archive (in case the original site of the ID goes offline, or for IDs only found archived) | This way deleted sites wouldn’t be lost (this one, for example, has many items added on Mix’n’match: [1] vs [2]). It could also encourage creating properties for online encyclopedias which only exist in archived form. third-party formatter URL (P3303) could also be used for this purpose. See also: archive URL (P1065), archive date (P2960)
update status | Whether the site is active or an old relic. Values: active (updated), abandoned (or similar), complete, offline, archived |
catalogue type | Values: encyclopedia, library authority file, virtual exhibition item, film database, digital library etc. | This could be achieved by expanding on Wikidata property for an identifier (Q19847637) subclasses like Wikidata property related to encyclopedias (Q55452870)
data subject type | Values: person, location, taxon, artwork, misc., etc. | Could be achieved with a new property or Wikidata property for an identifier (Q19847637) subclasses
content type | Values: text, data, image, film |
text type | A property to help identify sources that can serve as further reading in a Wikipedia article or as the basis for creating a new article. Usually these are encyclopedias and lexicons, but some museum or gallery sites also have detailed artist biographies. Values: informative, data |
print version | Wikidata item for the print version of an online encyclopedia | third-party formatter URL (P3303) could be used
developer (or publisher?) | Institution or person responsible for the site and its content. Example: Encyclopedia of Alabama ID (P6010) --> Alabama Humanities Foundation (Q30257855), Auburn University (Q540672). See also: maintained by (P126), operator (P137), sponsor (P859) |
developer institution type | Values: museum, university, library, private, community, company | This could be used for quality assessment, since it shows what kind of institution is responsible for the data.
external identifiers connected | What other catalogues are integrated or referenced in the data on this site. For example the beacon links in Bach Digital ([3]), the references in Deutsche Biographie ([4]) or the library authority IDs contained in VIAF. | Perhaps this could help with data mining or Mix'n'match identification? See also: VIAF component (Q26921380)

--Adam Harangozó (talk) 14:39, 14 November 2019 (UTC)
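
To make the "archive URL formatter" idea above more concrete, here is a minimal sketch of how such a value might be expanded into a link. It assumes the usual `$1` placeholder convention of formatter URLs and the Wayback Machine's `https://web.archive.org/web/<timestamp>/<url>` address pattern. The function names and the `example.org` formatter are made up for illustration and are not part of any existing Wikidata tooling.

```python
# Sketch only: the helper names and the example formatter are hypothetical.
WAYBACK_PREFIX = "https://web.archive.org/web"


def format_url(formatter: str, external_id: str) -> str:
    """Substitute the identifier for the "$1" placeholder of a formatter URL."""
    return formatter.replace("$1", external_id)


def archive_url(formatter: str, external_id: str, timestamp: str) -> str:
    """Build a Wayback Machine address for the original URL.

    `timestamp` is a YYYYMMDDhhmmss string chosen by the caller.
    """
    return f"{WAYBACK_PREFIX}/{timestamp}/{format_url(formatter, external_id)}"


# Hypothetical formatter URL for a site that has gone offline.
link = archive_url("https://example.org/entry/$1", "abc123", "20191114000000")
print(link)
```

The timestamp is passed in explicitly because a single fixed date may not fit every archived entry of a site; whether it would belong in the formatter itself or in a qualifier is left open here.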

  • @Adam Harangozó: I think most of these are props of the associated database, not the external-id. A nice exception is "archive URL formatter" but it relies on the WHOLE external site being archived on the same date, and I'm not sure archive.org can give such guarantees? In any case, you need to propose new properties one by one and they go through a discussion/vetting process --Vladimir Alexiev (talk) 08:58, 27 November 2019 (UTC)
@Vladimir Alexiev: I know, but the associated database/external-id distinction is quite a mess at the moment; that's why I would like to create a set of guidelines for external IDs before proposing any new properties. For example, most of the time there is no item for the associated database. For gallery collections, the item is set to the gallery/museum itself, but you can't add the properties listed above to those (National Gallery of Victoria artist ID (P2041)). I'm not sure it would make sense to create a separate item for the online databases of galleries. Another example, where there is a separate item for the ID and its dictionary: US Congress Bio identifier (Q20205343). This is why I think it would be better to use these statements on the properties themselves; then all the necessary information could be added and would be in one place (which would be useful for creating Wikipedia templates using Wikidata). Adam Harangozó (talk) 17:26, 28 November 2019 (UTC)
  • @Adam Harangozó: If you start adding detailed props about a database/website, you'd better make an entity for that database/website. If you just lump them together at the Identifier, it will be confusing, so people will (rightly) object to your property proposals. E.g. if you describe "print version" as applying to encyclopedias, then you cannot apply it to every identifier, because most identifiers are NOT about encyclopedias. --Vladimir Alexiev (talk) 09:05, 29 November 2019 (UTC)

Incorporation of external identifiers not yet meeting requirements

In South Africa the Education Department manages its schools with EMIS codes, but currently publishes them only as XLS files. In the foreseeable future they will have something fulfilling the formatter URL requirement. Until then it would be great to keep track of diffs and reconcile with their IDs. How would you suggest approaching this? -- YaguraStation (talk) 22:40, 15 July 2020 (UTC)
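
One possible way to "keep track of diffs" in the meantime is to compare successive snapshots of the published list, keyed by the code. The sketch below assumes each snapshot has already been loaded from the spreadsheet into a plain mapping of code to school name. Reading the XLS file is left out, and the codes and names shown are invented placeholders, not real EMIS data.

```python
from __future__ import annotations


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots that map an identifier to a name.

    Returns the identifiers that were added, removed, or whose name changed.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


# Invented placeholder data standing in for two releases of the list.
old = {"100000001": "Example Primary School", "100000002": "Sample High School"}
new = {
    "100000001": "Example Primary School",
    "100000002": "Sample Secondary School",
    "100000003": "Demo Combined School",
}

report = diff_snapshots(old, new)
print(report)
```

The "added" and "changed" lists would be the natural candidates for reconciliation against existing items; how that matching is done is outside this sketch.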

Suggestion for required values for external identifiers

I suggest that we add that external identifiers should always have either

⟨ subject ⟩ instance of (P31) ⟨ Wikidata property for an identifier that suggests notability (Q62589316) ⟩

or

⟨ subject ⟩ instance of (P31) ⟨ Wikidata property for an identifier that does not imply notability (Q62589320) ⟩

That would make it much clearer, when reviewing whether an item is notable, what it means that the item has an identifier. Perhaps this should even be in the template for property proposals for external identifiers, so that it gets added on creation and any potential discussion has already taken place. Ainali (talk) 16:41, 10 May 2022 (UTC)
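
As a rough illustration of how the rule suggested above could be checked, the sketch below classifies a property from the list of its instance of (P31) values. The two QIDs are the ones named above. The function name and result labels are invented for this example, and fetching the P31 values from Wikidata is assumed to happen elsewhere.

```python
# QIDs taken from the suggestion above; everything else is hypothetical.
SUGGESTS_NOTABILITY = "Q62589316"
DOES_NOT_IMPLY_NOTABILITY = "Q62589320"


def notability_class(instance_of: list[str]) -> str:
    """Classify an external-identifier property by its P31 values."""
    has_yes = SUGGESTS_NOTABILITY in instance_of
    has_no = DOES_NOT_IMPLY_NOTABILITY in instance_of
    if has_yes and has_no:
        return "conflicting"  # both classes present, needs review
    if has_yes:
        return "suggests notability"
    if has_no:
        return "does not imply notability"
    return "unclassified"  # would violate the suggested requirement


# Q19847637 is "Wikidata property for an identifier", mentioned earlier on this page.
print(notability_class(["Q19847637", "Q62589316"]))
print(notability_class(["Q19847637"]))
```

A report of "unclassified" or "conflicting" properties would show where the discussion still needs to take place.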

@Adam_Harangozó: thanks for the very useful page documenting external-ids!!!

You write in the intro

  • Wikidata External Identifier properties should have dedicated items to represent their values and they should link to those using class of property value (P10726).
  • But I think the prevailing current practice is to have an item that represents the database and link from it using "wikidata property".
  • This is promulgated by the prop proposal template, which calls this item "Represents"
  • Could you comment on my claim above, or edit the intro to reflect this practice?

Thanks! Vladimir Alexiev (talk) 16:01, 9 October 2023 (UTC)