Wikidata talk:Identifier migration/0

@Jura1: Can you explain your criteria for the many properties you added here? For example, I don't understand how P969 (P969) can be considered an identifier: there could be many distinct entities at a given location, and "1 Main St." would be the same value for many different entities. ArthurPSmith (talk) 21:29, 13 January 2016 (UTC)

The list is built from string properties that are not marked as Wikidata property with datatype string that is not an external identifier (Q21099935). We had added these the last time it was discussed with the devs. Uniqueness isn't a criterion for external identifiers. The property you mention shouldn't be a problem.
--- Jura 07:42, 14 January 2016 (UTC)
Yeah, identifiers don't need to be unique, but to me the example Arthur gave also sounds a bit strange as an identifier. --Lydia Pintscher (WMDE) (talk) 10:18, 14 January 2016 (UTC)
Lydia Pintscher (WMDE) - do we have a definition somewhere of what "external identifier" means, i.e. what something needs in order to qualify for this new data type? I think the characteristics would be: (1) it is assigned by some external authority, an organization or website; (2) it is stable: the identifier associated with an item should not change over time; (3) in the particular context of that external authority, perhaps with the assistance of some additional property or properties, the identifier string should uniquely identify the thing it identifies. How else is it an identifier? But maybe this has been written down somewhere else? Thanks! ArthurPSmith (talk) 22:08, 14 January 2016 (UTC)
Good question. I don't have a really good answer. It seems there are some edge cases that people have already brought up. I don't think we should make uniqueness a hard requirement. Maybe something like "something we can use to look up the same topic in other places"? --Lydia Pintscher (WMDE) (talk) 13:49, 15 January 2016 (UTC)
At some point, I was going to attempt to describe the various patterns, but the related property wasn't created in a timely manner. This is why the current solution was used. Some identifiers are unique within a given scope, e.g. mobile network code (P2259) or P969. Obviously, most identifiers could be used in sources or as qualifiers which wouldn't make them unique ("headquarters location", etc.).
--- Jura 15:08, 15 January 2016 (UTC)
I don't think Jura is saying anything different from what I did with regard to uniqueness above. There is one additional characteristic, though, that I think is necessary: uniqueness of representation of the identifier, or at least a standard representation, as a string. Something like ISNI, for instance, is always supposed to be 4 sets of 4 digits (or possibly a final X character instead of the last digit) separated by 1 space character. Usernames are unique strings. Many of the identifiers on our list are simple numbers, represented as strings of digits without a leading zero; there's only one way to write them. But something like a street address simply cannot qualify, as it can be written innumerable different ways (and in different languages!) to mean the same thing. ArthurPSmith (talk) 15:31, 15 January 2016 (UTC)
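To illustrate the "standard representation" point, here is a minimal Python sketch that reduces differently written ISNI-shaped strings to the single spaced form described above. The names `ISNI_DISPLAY` and `normalize_isni` are hypothetical, chosen only for this example, and the code checks the shape of the string only; it does not verify the check character, and it assumes spaces and hyphens are the only separators worth tolerating.

```python
import re

# Display form as described in the comment above: four groups of four
# characters separated by single spaces; only the last character may be X.
ISNI_DISPLAY = re.compile(r"[0-9]{4} [0-9]{4} [0-9]{4} [0-9]{3}[0-9X]")


def normalize_isni(raw):
    """Return the spaced display form, or None if raw is not ISNI-shaped.

    Assumption for this sketch: whitespace and hyphens are ignorable
    separators, and a lowercase final "x" is accepted as "X".
    """
    compact = re.sub(r"[\s-]+", "", raw).upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return None
    # Regroup the 16 characters into four blocks of four.
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))


if __name__ == "__main__":
    for value in ["0000000121032683", "0000-0001-2103-2683", "1 Main St."]:
        print(repr(value), "->", normalize_isni(value))
```

A string such as "1 Main St." has no comparable canonical form to normalize to, which is the contrast being drawn with street addresses.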
Normally a street address should uniquely identify a building within a place. Supposedly there are places where people can pick their street name and building number, but in many places these meet points (1), (2) and (3). I think we already noticed with ISNI that identifiers can be written in different ways.
--- Jura 15:45, 15 January 2016 (UTC)
Neither house number (P670) nor P969 (P969) seems like an identifier to me (in addition to there being a large amount of overlap between them). Tfmorris1 (talk) 16:34, 1 February 2016 (UTC)