Wikidata:Property proposal/URL match pattern

URL match pattern

   Under discussion
Description: a regex pattern for URLs from which an external ID may be extracted
Data type: String
Example 1: IMDb ID (P345) → (one of multiple values) https:\/\/www\.imdb\.com\/(title|name|news)\/([a-z0-9]+)(\/.*)? <replacement value> \2
Example 2: PubMed ID (P698) → https:\/\/pubmed\.ncbi\.nlm\.nih\.gov\/(\d+)(-[^\/]*)?\/ <replacement value> \1
Example 3: ISNI (P213) → https?:\/\/www\.isni\.org\/(\d{4})(| |%20)(\d{4})(| |%20)(\d{4})(| |%20)(\d{4}) <replacement value> \1 \3 \5 \7
Example 4: ZVG number (P679) → http:\/\/gestis-en\.itrust\.de\/nxt\/gateway\.dll\/gestis_en\/0+([1-9]\d+)\.xml.* <replacement value> \1
Example 5: CricketArchive player ID (P2698) → https:\/\/cricketarchive\.com\/Archive\/Players\/\d+\/\d+\/(\d+)\.html <replacement value> \1
Example 6: Fandom article ID (P6262) → https:\/\/([a-z0-9\.-]+)\.(wikia|fandom)\.com\/wiki\/(.*) <replacement value> \1:\3
Example 7: Geni.com profile ID (P2600) → https:\/\/www\.geni\.com\/(profile|people)\/[^\/]+\/(\d+)(#.*)? <replacement value> \2
See also: formatter URL (P1630)

URL match replacement value

   Under discussion
Description: (qualifier only) see above
Data type: String
Example 1: see above
Example 2: MISSING
Example 3: MISSING


This will provide a way to extract property and ID from a given URL. A future tool or gadget may benefit from this. GZWDer (talk) 23:46, 26 February 2020 (UTC)
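As a rough illustration of what such a tool might do, here is a minimal Python sketch. It assumes the patterns and replacement values have already been fetched into a local table; the `PATTERNS` dictionary and the `identify` function are hypothetical names, and the sample URLs are made up for illustration. Anchoring the pattern to the whole URL with `re.fullmatch` is also an assumption, since the proposal does not say whether partial matches should count.

```python
import re

# Hypothetical local table: property ID -> list of (URL match pattern, replacement value).
# The patterns are copied from Examples 1, 2 and 5 of this proposal.
PATTERNS = {
    "P345": [(r"https:\/\/www\.imdb\.com\/(title|name|news)\/([a-z0-9]+)(\/.*)?", r"\2")],
    "P698": [(r"https:\/\/pubmed\.ncbi\.nlm\.nih\.gov\/(\d+)(-[^\/]*)?\/", r"\1")],
    "P2698": [(r"https:\/\/cricketarchive\.com\/Archive\/Players\/\d+\/\d+\/(\d+)\.html", r"\1")],
}


def identify(url):
    """Return (property ID, external ID) for the first matching pattern, else None."""
    for prop, rules in PATTERNS.items():
        for pattern, replacement in rules:
            # Assumption: the pattern must match the entire URL.
            match = re.fullmatch(pattern, url)
            if match:
                # Python's match.expand() understands the same \N group syntax.
                return prop, match.expand(replacement)
    return None


print(identify("https://www.imdb.com/title/tt0111161/"))
print(identify("https://pubmed.ncbi.nlm.nih.gov/12345678/"))
print(identify("https://example.org/"))
```

A property with several pattern values, as in Example 1, would simply have more than one pair in its list.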


Comment: Here's an example of how this would look on Fandom article ID (P6262):

URL match pattern: https:\/\/([a-z0-9\.-]+)\.(wikia|fandom)\.com\/wiki\/(.*)
URL match replacement value: \1:\3

If a tool wanted to automatically generate a Fandom article ID (P6262) from a URL such as https://minecraft.fandom.com/wiki/Sheep, it would match the regex specified on the property against that URL. There are three capturing groups in the regex. The first one is ([a-z0-9\.-]+) and matches "minecraft", the second one is (wikia|fandom) and matches "fandom", and the third one is (.*) and matches "Sheep". The URL match replacement value allows these capturing groups to be put together. \1:\3 turns into minecraft:Sheep, since \N is replaced with the value of the Nth capturing group. --SixTwoEight (talk) 01:52, 4 March 2020 (UTC)
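The walk-through above can be reproduced with a short Python sketch. The pattern and replacement value are taken from the example; the function name `extract_id` and the choice to require a match of the whole URL (`re.fullmatch`) are assumptions made for illustration, not part of the proposal.

```python
import re

# Pattern and replacement value from the Fandom article ID (P6262) example.
URL_MATCH_PATTERN = r"https:\/\/([a-z0-9\.-]+)\.(wikia|fandom)\.com\/wiki\/(.*)"
REPLACEMENT_VALUE = r"\1:\3"


def extract_id(url, pattern=URL_MATCH_PATTERN, replacement=REPLACEMENT_VALUE):
    """Return the external ID built from the capturing groups, or None if no match."""
    # Assumption: the pattern has to cover the whole URL.
    match = re.fullmatch(pattern, url)
    if match is None:
        return None
    # expand() substitutes \1, \2, ... with the corresponding capturing groups.
    return match.expand(replacement)


url = "https://minecraft.fandom.com/wiki/Sheep"
print(re.fullmatch(URL_MATCH_PATTERN, url).groups())  # ('minecraft', 'fandom', 'Sheep')
print(extract_id(url))                                # minecraft:Sheep
```

Note that the second group, (wikia|fandom), is captured but unused in the replacement value, which is why the template skips from \1 to \3.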