Wikidata:Property proposal/Wayback ID

Wayback ID

Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: ID for a web page archived in the Internet Archive's Wayback Machine service
Represents: Wayback Machine (Q648266)
Data type: External identifier
Domain: property
Allowed values: yyyymmddhhmmss/URL
Allowed units: numbers and letters usually used in URLs
Example 1: Facebook (Q355) → 20190716020349/https://www.facebook.com/ (for qualifiers: timestamp 20190716020349 and URL https://www.facebook.com/; please create an option for these two qualifiers)
Example 2: MISSING
Example 3: MISSING
Source: https://web.archive.org
Planned use: add manually to important websites; bot action required in future
Number of IDs in source: more than 371 billion web pages saved over time (possible values; only some will be added, as needed, by date)
Expected completeness: always incomplete
Formatter URL: https://archive.org/web/$1 or https://web.archive.org/$1
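
As an illustration of the proposed value format, here is a minimal Python sketch that splits a value of the form yyyymmddhhmmss/URL into its timestamp and URL parts and substitutes it into the second formatter URL exactly as written above. The function names and the regular expression are assumptions made for this example; the proposal does not define a validation rule.

```python
import re
from datetime import datetime

# Assumed constraint: a 14-digit timestamp, a slash, then an http(s) URL.
WAYBACK_ID = re.compile(r"^(\d{14})/(https?://\S+)$")

# Second formatter URL from the proposal, copied as written.
FORMATTER_URL = "https://web.archive.org/$1"


def parse_wayback_id(value):
    """Split a proposed Wayback ID into (capture time, archived URL)."""
    match = WAYBACK_ID.match(value)
    if match is None:
        raise ValueError(f"not in yyyymmddhhmmss/URL form: {value!r}")
    # strptime also rejects impossible dates such as month 13.
    timestamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return timestamp, match.group(2)


def format_wayback_url(value):
    """Validate the value, then substitute it for $1 in the formatter URL."""
    parse_wayback_id(value)
    return FORMATTER_URL.replace("$1", value)


example = "20190716020349/https://www.facebook.com/"
timestamp, url = parse_wayback_id(example)
print(timestamp.isoformat(), url)
print(format_wayback_url(example))
```

The two parts returned by `parse_wayback_id` correspond to the two qualifiers requested in Example 1.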

Motivation

This property is needed to link archived versions of pages that have gone down. I expect it to have a very bright future: servers cannot be maintained forever, so sites go down and can then be linked through the Wayback Machine when needed. Obsuser (talk) 03:05, 16 July 2019 (UTC)

Discussion

  • Oppose; archive URL (P1065) already exists, and full work available at (P953) could also be used. A copy of Facebook's home page is also not a sufficient substitute for the actual Facebook website, so I don't think the example would be an appropriate use of this hypothetical property. Jc86035 (talk) 10:24, 16 July 2019 (UTC)
    archive URL (P1065) and full work available at (P953) could hypothetically be used to link Discogs too, for example, yet we have separate properties for it. It's not a copy; it's the page source of the original with no alteration (it's the archived original). --Obsuser (talk) 10:44, 16 July 2019 (UTC)
    @Obsuser: It's sort of pointless to make this an identifier, though, since anyone can save a page that's currently online to the Wayback Machine, and there could be dozens to thousands of duplicates for each work. Given that we would normally be obligated to add as many valid identifiers as possible, you could keep adding valid identifiers indefinitely just by saving news articles every hour. I don't think this would be useful data.
    Maybe you could argue for splitting archive URL (P1065) into a few different properties for each archival service, but it would break anything that already uses the existing property for what would essentially be a cosmetic change, and you could use a few lines of Lua code to separate archive URL (P1065) values by the archival service used if that were ever needed in a Wikipedia template. Jc86035 (talk) 13:55, 21 July 2019 (UTC)
  • Oppose per Jc86035. Mahir256 (talk) 17:20, 16 July 2019 (UTC)
  • Oppose, as long as archive URL (P1065) seems sufficient. Nomen ad hoc (talk) 18:23, 16 July 2019 (UTC).
  • Oppose per Jc86035 - Premeditated (talk) 09:13, 18 July 2019 (UTC)
  • Oppose per Nomen ad hoc and above --DannyS712 (talk) 01:07, 23 July 2019 (UTC)
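
Jc86035's point above, that archive URL (P1065) values could be separated by archival service with a few lines of code, can be sketched as follows. This is written in Python rather than Lua and is only an illustration: the hostname table, the function name and the sample URLs are assumptions for this example, not an existing Wikidata or Wikipedia module.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical hostname-to-service table; a real template would need a fuller list.
ARCHIVE_SERVICES = {
    "web.archive.org": "Wayback Machine",
    "archive.org": "Wayback Machine",
    "archive.today": "archive.today",
    "archive.is": "archive.today",
    "webcitation.org": "WebCite",
}


def group_by_service(urls):
    """Group archive URLs by the service that hosts them, keyed by hostname."""
    groups = defaultdict(list)
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Anything not in the table is collected under "other".
        groups[ARCHIVE_SERVICES.get(host, "other")].append(url)
    return dict(groups)


# Illustrative values only; the last two URLs are made up for the example.
sample = [
    "https://web.archive.org/web/20190716020349/https://www.facebook.com/",
    "https://archive.today/abcde",
    "https://example.org/snapshot",
]
for service, members in group_by_service(sample).items():
    print(service, len(members))
```

Because the grouping depends only on the hostname of each stored URL, it works on existing archive URL (P1065) values without splitting the property.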