Wikidata:Property proposal/name-suggestion-index identifier


name-suggestion-index identifier


Under discussion
Description: identifier for a brand in OpenStreetMap's name-suggestion-index
Represents: Name Suggestion Index (Q62108705)
Data type: External identifier
Domain: retail chain (Q507619)
Allowed values: \w+/\w+\|.+
Example 1: McDonald's (Q38076) → amenity/fast_food|McDonald's
Example 2: Royal Dutch Shell (Q154950) → amenity/fuel|Shell, shop/convenience|Shell, amenity/fuel|เชลล์
Example 3: United States Postal Service (Q668687) → amenity/post_office|United States Post Office, amenity/post_box|USPS
Source: [1]
External links: Use in sister projects: [ar][de][en][es][fr][he][it][ja][ko][nl][pl][pt][ru][sv][vi][zh][commons][species][wd].
Planned use: Upon approval, this property will be mentioned in name-suggestion-index's contributing guide and the brand:wikidata key's official documentation, and NSI contributors will immediately begin adding the property to some existing items that have been deleted by mistake in the past.
Number of IDs in source: 5,055 identifiers, 4,516 that have corresponding Wikidata items as of 9f3f1a70dd38869bdab54a7f29e33073f4ce1fc3
Expected completeness: always incomplete (Q21873886)
See also: OSM tag or key (P1282), OSM relation ID (P402)
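
As a rough illustration of the identifier format, the following Python sketch checks the example identifiers from the summary above against the "Allowed values" pattern and splits each one into its OSM tag key, tag value, and brand name. The helper name parse_nsi_id is hypothetical and is not part of NSI or Wikidata; the pattern and the identifiers are copied from the proposal.

```python
import re

# Pattern copied from the "Allowed values" field of the proposal.
ALLOWED = re.compile(r"\w+/\w+\|.+")

# Identifiers copied from Examples 1-3 of the proposal.
EXAMPLES = [
    "amenity/fast_food|McDonald's",
    "amenity/fuel|Shell",
    "shop/convenience|Shell",
    "amenity/fuel|เชลล์",
    "amenity/post_office|United States Post Office",
    "amenity/post_box|USPS",
]


def parse_nsi_id(identifier):
    """Split 'key/value|name' into (key, value, name).

    Hypothetical helper for illustration only.
    """
    if ALLOWED.fullmatch(identifier) is None:
        raise ValueError(f"not a valid identifier: {identifier!r}")
    # The part before the first "|" is the OSM tag; the rest is the name.
    tag, name = identifier.split("|", 1)
    key, value = tag.split("/", 1)
    return key, value, name


for identifier in EXAMPLES:
    print(parse_nsi_id(identifier))
```

Because the name is part of the identifier, one Wikidata item can carry several values, as Example 2 shows for Shell.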

Motivation

name-suggestion-index is the OpenStreetMap project's de facto authority for brand-related tagging information. Entries in NSI are presented to mappers as presets to choose from when mapping, alongside unbranded presets like "road", "lake", "restaurant", or "ATM". Most entries were created because NSI scripts flagged certain names as being common supermarket names (for instance) in the main OSM database.

Most entries are for brands that already have Wikidata items, so linking OpenStreetMap with Wikidata is just a matter of adding a brand:wikidata tag to the entry. It isn't feasible to link Wikidata to every instance of a chain store location in OSM, but it is possible to link to the brand's entry in NSI. This user script could feasibly query OSM for a chain's store locations to plot on a map only if it knows what kind of business OSM considers the chain to be.

The idea of creating an identifier property for NSI has come up a couple times in the context of undeletion discussions (another instance). I don't think the presence of this property on an item would establish notability by itself, given that OSM isn't considered an authority on store locations. But it could give administrators a little more clarity when assessing whether a business-related item is merely undeveloped or whether it's spam.

 – Minh Nguyễn 💬 02:11, 9 November 2019 (UTC)

Discussion

  • Comment: You added links to your examples, which is obviously helpful, but we should keep in mind that in its current form Wikidata will not be able to generate these links (since it only supports inserting the full statement value in a formatter URL). There are a few options:
    • Use a URL datatype and store the whole URLs as values (they can still be restricted to a particular format via a constraint)
    • Set up a proxy which accepts values in the format you suggest and translates them to your service (ArthurPSmith can help) - this should only be done if the format you are proposing is already attested somewhere else
    • Find another format for which there already exists a service which accepts these values as part of its URLs
Let me know if any of this is unclear. − Pintoch (talk) 13:45, 9 November 2019 (UTC)
Thanks for the suggestions Pintoch! NSI has been using this identifier format on its pages but not in its URLs. I've proposed a change to NSI that would allow us to use a format of https://nsi.guide/?id=$1. – Minh Nguyễn 💬 17:27, 9 November 2019 (UTC)
I think it might make sense to leave the string itself unlinked, but put the URI (either of the k&v type or the id type) as a reference. After all, NSI is an authority of what identifiers it uses :) Arlo Barnes (talk) 22:14, 10 November 2019 (UTC)
  • Support: Seems valuable to be able to access this classification from Wikidata. I would prefer the External ID datatype, especially if NSI URLs can be adapted to accept such values -- linked data is so helpful. And there is no good reason not to do this upfront, rather than through some semi-hidden reference mechanism. Would the applies to name (P5168) qualifier generally be used to indicate the particular name being identified, or would this be redundant given the structure of the identifier? Jheald (talk) 17:57, 11 November 2019 (UTC)
  • @Jheald: Yes, if I understand the qualifier correctly, it would be helpful in the case of international brands like Royal Dutch Shell (Q154950) above. – Minh Nguyễn 💬 02:36, 23 November 2019 (UTC)
  • Comment: @Mxn: Routinely having multiple different strings "identify" the same entity is not really good practice for an "external identifier", though having the same string "identify" multiple entities would be worse. I think, as Pintoch suggested above, you would probably be better off making this a URL datatype, or a string datatype, and not worrying about linking in the Wikidata UI... ArthurPSmith (talk) 00:16, 24 November 2019 (UTC)
    @ArthurPSmith: I'm afraid I don't follow... even with a URL datatype, a given entity may have multiple NSI URLs, because NSI identifiers partly consist of the local name, which can vary from country to country or from language to language (or writing system to writing system). Would that be a problem? – Minh Nguyễn 💬 01:10, 28 November 2019 (UTC)
    @Mxn: Just that there is usually no expectation for a URL datatype to be single-valued, while an external id generally is (it is the "identifier" for the entity in the database). ArthurPSmith (talk) 12:59, 30 November 2019 (UTC)
  • Comment: For your future regex, I propose: https:\/\/nsi\.guide\/index\.html\?k=\S{1,48}&v=\S{1,48}#\S{1,128} The one in the proposal doesn't work. Cordially, --Eihel (talk) 02:05, 27 November 2019 (UTC)
    Note that the regex in the proposal assumes that [2] will be merged before this proposed property comes into use. – Minh Nguyễn 💬 01:10, 28 November 2019 (UTC)
    @Mxn: An unescaped delimiter must be escaped with a backslash. So, what do you think about \w+\/\w+\|\.+ ? —Eihel (talk) 23:09, 13 December 2019 (UTC)
  • Support: I think it's a good idea. As a collaborator on the NSI project who has had some of my Wikidata entries deleted, I might be a little biased, but I think it should happen anyway. --Adamant1 (talk) 04:47, 8 December 2019 (UTC)
  • Support --Tinker Bell 21:08, 13 December 2019 (UTC)
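
Two technical points from the discussion above can be checked with a short Python sketch. The first is how a formatter URL of the form https://nsi.guide/?id=$1, as proposed in the thread, would expand for a given statement value; percent-encoding the value is an assumption here, and the function name expand_formatter is hypothetical. The second is how the pattern from the proposal and the escaped variant suggested later in the thread behave under Python's re module, which may differ from the engine Wikidata uses for constraints.

```python
import re
from urllib.parse import quote

# Formatter URL proposed in the discussion; "$1" stands for the statement value.
FORMATTER_URL = "https://nsi.guide/?id=$1"


def expand_formatter(value, formatter=FORMATTER_URL):
    """Substitute the value for $1 (hypothetical helper).

    Percent-encoding every reserved character is an assumption,
    not documented behavior of Wikidata or nsi.guide.
    """
    return formatter.replace("$1", quote(value, safe=""))


# "Allowed values" pattern from the proposal.
PROPOSAL = re.compile(r"\w+/\w+\|.+")
# Variant suggested in the discussion, with "/" and "." escaped.
ESCAPED = re.compile(r"\w+\/\w+\|\.+")

identifier = "amenity/fast_food|McDonald's"

print(expand_formatter(identifier))
print(PROPOSAL.fullmatch(identifier) is not None)  # True
print(ESCAPED.fullmatch(identifier) is not None)   # False
```

In Python, escaping the slash is harmless because "/" is not a delimiter there, but escaping the dot changes the meaning: \.+ matches only literal periods, so the escaped variant rejects ordinary brand names such as McDonald's.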