Wikidata:Bot requests

If you have a bot request, add a new section using the button and explain exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the relevant WikiProject's talk page. Please refer to previous discussions justifying the task in your request.

For botflag requests, see Wikidata:Requests for permissions.

Tools available to all users which can be used to accomplish the work without the need for a bot:

  1. PetScan for creating items from Wikimedia pages and/or adding the same statements to items
  2. QuickStatements for creating items and/or adding different statements to items
  3. Harvest Templates for importing statements from Wikimedia projects
  4. Descriptioner for adding descriptions to many items
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/09.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 2 days.


You may find these related resources helpful:

Dataset Imports · Why import data into Wikidata · Learn how to import data · Bot requests · Ask a data import question

Resolving Wikinews categories from "wikimedia category"

Request date: 12 August 2017, by: Billinghurst (talkcontribslogs)

Link to discussions justifying the request
Task description

A number of Wikinews items that are categories have been labelled as instance of (P31) -> Wikimedia category (Q4167836), which I am told is incorrect. It would be worthwhile running a query for where this exists and removing these statements, and the corresponding labels where they exist in interwikilink isolation on an item, and generating a list of where they exist among a bundle of sister interwikilinks. This will enable an easier merge of those independent items into their corresponding existing items, and a tidy-up of items that are confused.
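
A sketch of the discovery side, run against the public query service; restricting it to the English Wikinews and to single-sitelink items ("interwikilink isolation") is an illustrative assumption, not part of the original request:

import requests

# Items marked P31=Q4167836 whose only sitelink is an English Wikinews page.
QUERY = """
SELECT ?item ?page WHERE {
  ?item wdt:P31 wd:Q4167836 ;
        wikibase:sitelinks 1 .
  ?page schema:about ?item ;
        schema:isPartOf <https://en.wikinews.org/> .
}
"""

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "wikinews-category-report/0.1 (sketch)"})
r.raise_for_status()
for row in r.json()["results"]["bindings"]:
    print(row["item"]["value"], row["page"]["value"])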

Discussion

Lymantria (talkcontribslogs) and I were at cross-purposes, as items with that statement were being merged into items. We had a confluence of rule issues: the merging of Wikimedia category items into items versus the merging of Wikinews categories into items. This is now better understood.

Request process

Redirects after archival

Request date: 11 September 2017, by: Jsamwrites (talkcontribslogs)

Link to discussions justifying the request
Task description

Retain the links to the original discussion section on the discussion pages, even after archival by allowing redirection.

Licence of data to import (if relevant)
Discussion


Request process

Copy taxon common name to alias

We need a bot, please, to copy values from taxon common name (P1843) to aliases in the appropriate languages, like this edit. A periodic update would be good, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:00, 15 November 2017 (UTC)

At least two steps should happen before this:
  1. copy to labels where missing
  2. update labels/aliases with different capitalization (upper to lowercase)
At the moment, there are around 300,000 (!) aliases to be added. Matěj Suchánek (talk) 18:14, 14 December 2017 (UTC)
This is not a good idea at all. Please don't. - Brya (talk) 17:35, 27 December 2017 (UTC)
Yet another vague, negative comment, with no justification offered. Meanwhile, here's a quote from Help:Aliases (emboldening added): "The label on a Wikidata entry is the most common name that the entity would be known by to readers. All of the other common names that an entry might go by, including alternative names; acronyms and abbreviations; and alternative translations and transliterations, should be recorded as aliases". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:13, 27 December 2017 (UTC)
This Help:Aliases was written a very long time ago, at a time when Wikidata was envisioned as just a depository of sitelinks. - Brya (talk) 18:23, 27 December 2017 (UTC)
Indeed it was written a long time ago - and it remains current. But Wikidata was never "envisioned as just a depository of sitelinks". Still no justification offered. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:21, 27 December 2017 (UTC)
Plenty of users still envision Wikidata as just a depository of sitelinks, even today. And clearly Help:Aliases was written from that perspective. - Brya (talk) 05:08, 28 December 2017 (UTC)
"Plenty of users..." While you appear to be making things up, or at the very least making claims without substantiating them, there can be no useful dialogue with you. And please stop double-indenting your comments; as explained to you previously, it breaks the underlying HTML markup. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:44, 28 December 2017 (UTC)
To anybody sane, HTML is a tool to be used towards a purpose. Regarding HTML as a purpose unto itself goes beyond mere idiosyncrasy. - Brya (talk) 17:53, 28 December 2017 (UTC)
Sounds like a good idea to me. I think it's important to link the common names to the proper species name. I am not sure about the specific implementation. For example, I don't have a strong preference to have the taxon name or the common name as label, but the other should be an alias for sure. That makes it a lot easier to recognize the proper entry. --Egon Willighagen (talk) 14:34, 29 December 2017 (UTC)
Just to cite a current problem involving the "interpretation" of a common name: Wikidata:Project_chat#Xantus's_Murrelet. As far as I know we had a major import from Wikispecies without any references, and some referenced additions via IOC, IUCN or Wörterbuch der Säugetiernamen - Dictionary of Mammal Names (Q27310853) done by my bot. I object to adding any unreferenced common names as aliases. I have no problem with the latter ones, if there is an agreement here to do so. --Succu (talk) 21:33, 29 December 2017 (UTC)
Yes, there are at least two problem sets of common names:
  • those moved from Wikispecies (by Andy Mabbett). After this had been done, Wikispecies did not want common names from Wikidata imported into Wikispecies because of the dubiousness of this content.
  • those from ARKive, proposed by Andy Mabbett as a property on the grounds of it being a "repository of high-quality, high-value media", but used by some user to "import common names" from. Many of these are not just dubious, but outright bogus. - Brya (talk) 05:39, 30 December 2017 (UTC)
Your indentation fixed, again. The idea that Wikispecies did not want to use content from, er, Wikispecies is, of course, laughable. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:39, 30 December 2017 (UTC)
Laughable? I doubt you can show us a discussion where people at Wikispecies are eager to drop their content related to taxon common name (P1843) and reimport it from here. --Succu (talk) 20:15, 30 December 2017 (UTC)

No cogent reason why taxon common names are not aliases has been given; can someone action this, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:26, 25 February 2018 (UTC)

Two prerequisites are given by Matěj. Do you agree with them? --Succu (talk) 22:32, 25 February 2018 (UTC)
The former, certainly; the latter could be done, but otherwise we already have a bot which does that on a regular basis. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:13, 1 March 2018 (UTC)

The taxonomy and what common names define are not necessarily the same. Your assumption is that you can retrofit common names onto what science thinks up. Do you have proof of that? Or does it take a single example to undo this folly? Thanks, GerardM (talk) 07:23, 2 March 2018 (UTC)

Oh go on: please show us an example of an item with a valid entry for taxon common name (P1843) that is not valid as an alias for that item. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:40, 5 April 2018 (UTC)
No example? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:16, 9 April 2018 (UTC)

Can someone action this request, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:26, 16 June 2018 (UTC)
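
For reference, the mechanical part of the request is small. A pywikibot sketch of the per-item step (no approval implied; Q146 is only an illustrative test item, and the two prerequisites above would still apply):

import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def copy_common_names(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    new_aliases = {}
    for claim in item.claims.get("P1843", []):   # taxon common name
        value = claim.getTarget()                # a WbMonolingualText
        lang, text = value.language, value.text
        existing = {item.labels.get(lang, "")} | set(item.aliases.get(lang, []))
        if text and text not in existing:
            new_aliases.setdefault(lang, list(item.aliases.get(lang, []))).append(text)
    if new_aliases:
        item.editAliases(new_aliases, summary="copy taxon common name (P1843) values to aliases")

copy_common_names("Q146")  # illustrative only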

Fetch coordinates from OSM

We have many items with an OSM relation ID (P402), and no coordinates:

SELECT ?item ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P402 ?OpenStreetMap_Relation_identifier.
  MINUS { ?item wdt:P625 ?coordinate_location. }
}
LIMIT 1000

Try it!

Can someone please fetch the coordinates from OSM? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:59, 27 November 2017 (UTC)

An OSM relation ID sometimes denotes a rather big area (a municipality, a nature reserve, etc.). Which coordinates should be taken? --Herzi Pinki (talk) 19:19, 27 November 2017 (UTC)
The centroid. If that's not possible, the centre of the bounding box. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:39, 27 November 2017 (UTC)
@Pigsonthewing: It's a bit late to comment and I'm sure you're aware of this, but this won't work for public transport and route relations because they're either lines or made up of other relations, and won't work for objects with multiple locations like Toys "R" Us (Q696334). It's probably not a good idea to add coordinates to those with an automated process unless the resulting coordinates accurately represent a point that is actually on the line, or represent all points of a multiple-node relation (and neither of these may be desirable). Jc86035 (talk) 13:06, 27 December 2017 (UTC)
Sets of objects with multiple locations should not be in an OSM relation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:38, 27 December 2017 (UTC)
What about the ODbL license of the OSM data? Mateusz Konieczny (talk) 14:45, 27 December 2017 (UTC)
From what I checked, importing from OSM would be a violation of its license; therefore I propose to reject this proposal and remind importers to check the copyright status of imported data. Mateusz Konieczny (talk) 09:34, 10 May 2018 (UTC)
You assume that OSM is correct to assert copyright ownership over individual facts, or that its database rights in the EU apply to Wikidata as a US-hosted initiative; it is far from clear that this is the case for either scenario. See also Wikidata:Project chat#‎OpenStreetMap. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 10 May 2018 (UTC)
It appears that the Wikimedia legal team disagrees with you: "For EU databases, bots or other automated ways of extracting data should also be avoided because of the Directive's prohibition on 'repeated and systematic extraction' of even insubstantial amounts of data", from https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights#Conclusion Mateusz Konieczny (talk) 08:18, 11 May 2018 (UTC)
I suggest you read the many disclaimers at the start of that page; and always refer to them when citing it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:32, 16 June 2018 (UTC)
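
Setting the licensing question aside (it is unresolved above, so the following is a technical illustration, not an endorsement of a mass import), the centre lookup itself is a one-call job. A sketch against the Overpass API, whose "out center;" mode returns the centre of the relation's bounding box, i.e. the fallback suggested above:

import requests

def relation_center(rel_id):
    query = f"[out:json]; relation({rel_id}); out center;"
    r = requests.get("https://overpass-api.de/api/interpreter",
                     params={"data": query},
                     headers={"User-Agent": "osm-centre-lookup/0.1 (sketch)"})
    r.raise_for_status()
    elements = r.json().get("elements", [])
    if not elements or "center" not in elements[0]:
        return None  # e.g. route relations without usable geometry
    center = elements[0]["center"]
    return center["lat"], center["lon"]

print(relation_center(62422))  # 62422 is Berlin's boundary relation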

Rank P150 (sub-admin regions) as preferred

Request date: 31 January 2018, by: Yurik (talkcontribslogs)

Link to discussions justifying the request
  • This was mentioned to me by @Jura1:. I'm not sure if this is controversial enough to warrant a big discussion.
Task description

Set rank=preferred on all contains administrative territorial entity (P150) statements that do not have an end time (P582) qualifier, if some of the entity's P150 statements have an end time and some do not. Without this, querying for wdt:P150 returns both the old and the current results together. In some cases, P150 statements with a P582 qualifier were incorrectly set as deprecated, leaving the current ones at "normal" rank. These should also be fixed. CC: @Laboramus:.
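
A rough pywikibot sketch of the rule, assuming only the mixed case (some P150 statements ended, some current) should be touched; Q64 is just an illustrative item:

import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def prefer_current_parts(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    claims = item.claims.get("P150", [])
    ended = [c for c in claims if "P582" in c.qualifiers]
    current = [c for c in claims if "P582" not in c.qualifiers]
    if not ended or not current:
        return  # nothing ambiguous to rank
    for claim in current:
        if claim.rank == "normal":
            claim.changeRank("preferred")

prefer_current_parts("Q64")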

Discussion
Request process

Semi-automated import of information from Commons categories containing a "Category definition: Object" template

Request date: 5 February 2018, by: Rama (talkcontribslogs)

Link to discussions justifying the request
Task description

Commons categories about one specific object (such as a work of art, archaeological item, etc.) can be described with a "Category definition: Object" template [1]. This information is essentially a duplicate of what is or should be on Wikidata.

To prove this point, I have drafted a "User:Rama/Catdef" template that uses Lua to import all relevant information from Wikidata and reproduces all the features of "Category definition: Object", while requiring only the Q-Number as parameter (see Category:The_Seated_Scribe for instance). This template has the advantage of requesting Wikidata labels to render the information, and is thus much more multi-lingual than the hand-labeled version (try fr, de, ja, etc.).

I am now proposing to deploy another script to do the same thing the other way round: import data from the Commons templates into the relevant fields of Wikidata. Given the variety of ways a human can label or mislabel information in a template such as "Category definition: Object", I think that the script should be a helper tool to import data: it is to be run on one category at a time, with a human checking the result, and correcting and completing the Wikidata entry as required. For now, I have been testing and refining my script on subcategories of [2] Category:Ship models in the Musée national de la Marine. You can see the result in the first 25 categories or so, and the corresponding Wikidata entries.

The tool is presently in the form of a Python script with a simple command-line interface:

./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 reads the information from Commons, parses it, renders the various fields in the console for debugging purposes, and creates the required Wikibase objects (e.g: text field for inventory numbers, Q-Items for artists and collections, WbQuantity for dimensions, WbTime for dates, etc.)
./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 --commit does all of the above, creates a new Q-Item on Wikidata, and commits all the information in relevant fields.

Ideally, when all the desired features will be implemented and tested, this script might be useful as a tool where one could enter the

Licence of data to import (if relevant)

The information is already on Wikimedia Commons and is common public knowledge.

Discussion


Request process

Remove statement with Gregorian date earlier than 1584 (Q26961029)

SELECT ?item ?property (YEAR(?year) as ?yr)
{
    hint:Query hint:optimizer "None".
    ?a pq:P31 wd:Q26961029 .
    ?item ?p ?a .
    ?a ?psv ?x .
    ?x wikibase:timeValue ?year .
    ?x wikibase:timePrecision 7 .        
    ?x wikibase:timeCalendarModel wd:Q1985727 .        
    ?property wikibase:statementValue ?psv    .   
    ?property wikibase:claim ?p     .  
}
LIMIT 15704

Try it!

The above dates have year precision and Proleptic Gregorian calendar (Q1985727) as calendar model. I think they could be converted to Julian and the qualifier statement with Gregorian date earlier than 1584 (Q26961029) removed.
--- Jura 09:19, 24 February 2018 (UTC)
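
A hedged pywikibot sketch of the conversion for a single item and property; Q1985786 is the proleptic Julian calendar, and the precision-7 filter mirrors the query above:

import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

JULIAN = "http://www.wikidata.org/entity/Q1985786"
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"

def julianize(qid, prop):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claim in item.claims.get(prop, []):
        flags = [q for q in claim.qualifiers.get("P31", [])
                 if q.getTarget() and q.getTarget().getID() == "Q26961029"]
        target = claim.getTarget()
        if not flags or target is None or target.precision != 7:
            continue
        if target.calendarmodel == GREGORIAN:
            target.calendarmodel = JULIAN
            claim.changeTarget(target)
        for q in flags:
            claim.removeQualifier(q)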

Support --Marsupium (talk) 23:01, 28 April 2018 (UTC)
Presumably some such statements are actually intended to be Gregorian year, no? --Yair rand (talk) 02:56, 5 September 2018 (UTC)

Item documentation

For every item which has something on its talk page, a bot could usefully prepend {{Item documentation}}, as in this edit.

It would be good if one of the active maintenance bots could then do this as new talk pages are created. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:54, 4 March 2018 (UTC)
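
The per-page action is tiny; a pywikibot sketch, assuming the template is simply prepended when absent:

import pywikibot

site = pywikibot.Site("wikidata", "wikidata")

def tag_talk_page(qid):
    talk = pywikibot.Page(site, f"Talk:{qid}")
    if not talk.exists() or "{{Item documentation" in talk.text:
        return
    talk.text = "{{Item documentation}}\n" + talk.text
    talk.save(summary="prepend {{Item documentation}}")

tag_talk_page("Q42")  # a real run would iterate over newly created talk pages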

Remove "Wikimedia list article"@en from items that don't have P31=Q13406463[edit]

SELECT *
WHERE
{
    ?item schema:description "Wikimedia list article"@en 
    MINUS { ?item wdt:P31 wd:Q13406463 }
}

Try it!

Currently 12514 items. Sample items: Q99676, Q115702, Q117073, Q117208, Q127301.

The above items either got a description before P31 was changed, or got one prematurely. It would be good if maintenance was done on these. The same could probably be done for other languages considered important.

This is similar to Wikidata:Bot_requests/Archive/2018/02#Nations at games.
--- Jura 14:41, 7 March 2018 (UTC)

I have a couple of suggestions for that query:
1) Some items are instances of a subclass of Q13406463, but not of Q13406463 itself (e.g. List of people with surname Carey (Q23044947) has P31:list of persons (Q19692233)). It might not be wise to remove the description from these.
2) If the English label starts with the word 'List', even if the item doesn't have P31/P279*=Q13406463, it is most likely a list (e.g. List of Trinidadian football transfers 2013–14 (Q16258717)). I browsed through some of these, and it seems that they are likely to have problems, but those are best solved manually.
If we take the two above mentioned points into account, we have the following query:
SELECT ?item ?itemLabel
WHERE
{
    ?item schema:description "Wikimedia list article"@en .
    MINUS { ?item wdt:P31/wdt:P279* wd:Q13406463 }
    FILTER NOT EXISTS {
      ?item rdfs:label ?enLabel.
      FILTER(LANG(?enLabel) = 'en') 
      FILTER(STRSTARTS(lcase(STR(?enLabel)), 'list'))
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Try it!
The query currently returns 4254 items. --Shinnin (talk) 19:12, 25 March 2018 (UTC)
Good point .. I hadn't recalled lists having subclasses. So, yes, the 4000 "lists of persons" don't need the description removed. At some point, it seems that subclasses get carried away and include things like "Belgium in the Eurovision Song Contest" (60 instances), "aspect of history", etc. P279 at article about events in a specific year or time period (Q18340514) seems to be the source of that. Looking at one of the samples from your query [3] reminded me where some of the incorrect P31 came from. It seems that cleanup hasn't gone all the way through. I will try to sort out the subclasses. I think the query should work (for English). Sorting out the subclasses will add more.
--- Jura 20:06, 25 March 2018 (UTC)

add freq=10 to lists

An update of the lists at Special:PrefixIndex/Wikidata:WikiProject_Medicine/Hospitals_by_country/ is currently attempted daily. I think every 10 days would be sufficient.

Please add |freq=10 to Template:Wikidata list. @Tobias1984:, you created them some time ago. Is this ok with you?
--- Jura 00:56, 8 March 2018 (UTC)

Oppose. Looking at some examples, there are few changes. So, there are not that many updates. But if there is a change, then it should be carried out soon. A vandal could act shortly before the update and the vandalism would then be visible for 10 days. Why accept that? Daily is a good compromise. 77.179.94.217 17:15, 14 April 2018 (UTC)
Where is the compromise? It's a daily run for 250 lists, most with not much content. Maybe the whole set should go.
--- Jura 17:18, 14 April 2018 (UTC)

Supposedly one shouldn't follow up on T's comments. Maybe Tobias1984 wants to add something.
--- Jura 17:26, 14 April 2018 (UTC)
Who is T and where is their comment? Please try to write in a less encrypted way. --Pasleim (talk) 15:46, 15 April 2018 (UTC)
Tama .. what's his name? I think. Nothing to do with Tobias1984.
--- Jura 15:48, 15 April 2018 (UTC)
Oppose: neither a motivation nor a beneficial outcome provided --Pasleim (talk) 16:49, 15 April 2018 (UTC)
It saves 100s of queries every day. There isn't really a benefit in running this on a daily basis. Many reference lists run monthly (e.g. those of Sum of All Paintings) or weekly. Obviously, if any maintenance was done on the lists, whatever other frequency is preferred should do. Unfortunately, I might have been the only one having done it recently.
--- Jura 16:58, 15 April 2018 (UTC)
Given the ~5 million daily query requests, saving 200 requests seems marginal to me. --Pasleim (talk) 17:28, 15 April 2018 (UTC)
It's just one group .. we can't fix everything at once. The feature was introduced for that very reason.
--- Jura 11:19, 18 April 2018 (UTC)

Crossref Journals

Request date: 27 March 2018, by: Mahdimoqri (talkcontribslogs)

Link to discussions justifying the request
Task description
  • Add missing journals from Crossref
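
The request is terse, so here is only a sketch of the harvesting side, assuming the public Crossref REST API; matching against existing items and creating the missing ones would be separate steps needing discussion and approval:

import requests

def crossref_journals(rows=20, offset=0):
    # Page through Crossref's /journals endpoint; each record carries a
    # title and the ISSNs needed for matching against existing items.
    r = requests.get("https://api.crossref.org/journals",
                     params={"rows": rows, "offset": offset},
                     headers={"User-Agent": "journal-import-sketch/0.1"})
    r.raise_for_status()
    for journal in r.json()["message"]["items"]:
        yield journal.get("title"), journal.get("ISSN", [])

for title, issns in crossref_journals():
    print(title, issns)
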
Licence of data to import (if relevant)
Discussion


Request process

"place on Earth" descriptions[edit]

Can someone remove the descriptions added by the batches mentioned on AN? The users shouldn't have edited, and the descriptions are fairly useless. Please delete all in a single edit.
--- Jura 08:38, 15 May 2018 (UTC)

"AN" link is dead. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 29 May 2018 (UTC)
Not anymore. Matěj Suchánek (talk) 08:45, 22 July 2018 (UTC)
Dead, as of just now Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:12, 1 September 2018 (UTC)

URLs in descriptions cleanup

Some imports have URLs in descriptions, e.g. Q19652983. I think they should be removed or added as statements. @Magnus Manske:
--- Jura 09:00, 21 May 2018 (UTC)

In the given example the URL slug should be moved to a dedicated identifier property. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:40, 29 May 2018 (UTC)

Migrating all LeMonde.fr urls to https

Request date: 27 May 2018, by: Teolemon (talkcontribslogs)


Task description

The major French newspaper just migrated to HTTPS by default a couple of days ago. Can a bot operator process all the http://www.lemonde.fr URLs to turn them into https://www.lemonde.fr? Teolemon (talk) 11:06, 27 May 2018 (UTC)
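
A sketch of the two halves, discovery via the query service plus the rewrite as a plain prefix swap; actually editing the references would still need a bot framework:

import requests

QUERY = """
SELECT ?ref ?url WHERE {
  ?ref pr:P854 ?url .
  FILTER STRSTARTS(STR(?url), "http://www.lemonde.fr")
}
LIMIT 100
"""

def to_https(url):
    return url.replace("http://www.lemonde.fr", "https://www.lemonde.fr", 1)

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "lemonde-https-report/0.1"})
r.raise_for_status()
for row in r.json()["results"]["bindings"]:
    print(row["url"]["value"], "->", to_https(row["url"]["value"]))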


Discussion


Request process

Populating Commons maps category (P3722)

Request date: 30 May 2018, by: Thierry Caro (talkcontribslogs)

Link to discussions justifying the request
  • None.
Task description

Take all instances of subclasses of geographical object (Q618123). Look for those that have a Commons category (P373) statement and visit the corresponding Commons category. If it includes another category named "Maps of" followed by its own name, import this value as Commons maps category (P3722) to the item, as sketched below. This would be useful to the French Wikipedia, where we now have no label (Q54473574) automatically populated through Template:Geographical links (Q28528875).
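
A pywikibot sketch of the per-item check, assuming the "Maps of <category name>" naming pattern; writing P3722 back to the item is omitted:

import pywikibot

commons = pywikibot.Site("commons", "commons")
repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def maps_category_for(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    claims = item.claims.get("P373", [])   # Commons category
    if not claims:
        return None
    name = claims[0].getTarget()           # a plain string
    candidate = pywikibot.Category(commons, f"Category:Maps of {name}")
    return candidate.title(with_ns=False) if candidate.exists() else None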

Licence of data to import (if relevant)
Discussion
Request process

Labels for Wiktionary categories

I am running into a multitude of Wiktionary categories which have a sitelink in language A but no label in A. Can anyone please add labels automatically? And please don't omit expressions in parentheses! @ValterVB:? --Infovarius (talk) 13:22, 30 May 2018 (UTC)

Even AutoEdit doesn't work for Wiktionary sitelinks... --Infovarius (talk) 13:31, 30 May 2018 (UTC)
And labelless items are created by PetScan (can that be changed?), so their number is constantly increasing and there is a need for regular work. --Infovarius (talk) 13:35, 30 May 2018 (UTC)
@Infovarius: The dumps have become too big (too much time to download and unzip), so I stopped downloading them to update labels and descriptions, and stopped other weekly tasks that use the dump. If differential dumps become available, I will probably restart my periodical tasks. Alternatively, I can do it with a SPARQL query (if I find a suitable one) --ValterVB (talk) 17:25, 30 May 2018 (UTC)
@ValterVB: Can you please run it over ru-pages without a ru-label? 3750 now:
SELECT ?item ?sitelink ?itemLabel ?title WHERE {
  ?sitelink schema:isPartOf <https://ru.wiktionary.org/>;
     schema:about ?item; schema:name ?title 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,[AUTO_LANGUAGE],en" } .
   FILTER(NOT EXISTS {
   ?item rdfs:label ?lang_label.
   FILTER(LANG(?lang_label) = "ru") #with missing Russian label
 })
} ORDER BY ?itemLabel

Try it!

 – The preceding unsigned comment was added by Infovarius (talk • contribs).

  • @Infovarius: Interesting discovery. I started adding some of the missing labels and inserted "schema:name" in your query.
    --- Jura 16:02, 1 June 2018 (UTC)

All ✓ Done except for 4 items with a label+description conflict:

--ValterVB (talk) 09:39, 3 June 2018 (UTC)

Thanks for the work, Jura and Valter! May I ask for a harder task: to add aliases from Wiktionary page titles if they are different from the item label? --Infovarius (talk) 21:07, 3 June 2018 (UTC)

category's main topic (P301) for languages

The following lack P301: http://petscan.wmflabs.org/?psid=4625759 . Most have an English Wiktionary category page with a description that includes a link to an English Wikipedia article, a language code or even the QID to use. This is mostly added through a series of templates there.
--- Jura 11:08, 3 June 2018 (UTC)


CAPS cleanup

There is a series of entries like Q55227955, created by bot, that have the names inverted and the family names in CAPS. In the meantime, these labels have been copied to some other languages.
--- Jura 09:12, 5 July 2018 (UTC)

  • Here is a list: [4] (currently 295) and some that already got merged [5] (currently 8).
    --- Jura 08:49, 7 July 2018 (UTC)
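
Purely illustrative, a helper for the simple pattern (family name fully in caps, given names in mixed case); hyphenated and particle names would need manual review:

def fix_caps_label(label):
    # "DUPONT Jean" (hypothetical) -> "Jean Dupont"; anything ambiguous
    # is returned unchanged.
    words = label.split()
    family = [w for w in words if w.isupper() and len(w) > 1]
    given = [w for w in words if not (w.isupper() and len(w) > 1)]
    if not family or not given:
        return label
    return " ".join(given + [w.capitalize() for w in family])

assert fix_caps_label("DUPONT Jean") == "Jean Dupont"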

Migrate to P1480

Request process

Request date: 12 July 2018, by: Swpb (talkcontribslogs)

Link to discussions justifying the request
Task description

Migrate all statements with nature of statement (P5102) = disputed (Q18912752) to sourcing circumstances (P1480) = disputed (Q18912752) (see query). Swpb (talk) 17:47, 12 July 2018 (UTC)
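
A per-item sketch of the migration, assuming pywikibot and that P5102 occurs as a qualifier, as in the linked query:

import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

def migrate_disputed(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claims in item.claims.values():
        for claim in claims:
            for old in claim.qualifiers.get("P5102", []):
                target = old.getTarget()
                if target is None or target.getID() != "Q18912752":
                    continue
                new = pywikibot.Claim(repo, "P1480", is_qualifier=True)
                new.setTarget(target)
                claim.addQualifier(new)
                claim.removeQualifier(old)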

Discussion
  • Support --Jarekt (talk) 14:05, 17 September 2018 (UTC)
Request process

Normalize references

Request date: 13 July 2018, by: Marsupium (talkcontribslogs)

Link to discussions justifying the request
Task description

Often one source website or database is indicated inconsistently, in various manners. To improve this situation, some queries and follow-up edits on references could be made. This is a task that would best be done continuously and gradually adapted to more cases. Thus, perhaps this task fits well into the work field of DeltaBot, User:Pasleim?

  1. Add the ID property (and if feasible also stated in (P248)) to references with a corresponding reference URL (P854) where missing.
  2. Add stated in (P248) to references with a corresponding ID property where missing.
  3. (For later: Merge references where a source website or database is used twice (accidentally).)

The issue exists for ULAN ID (P245), RKDartists ID (P650) and probably many more source websites or databases. I have examined ULAN ID (P245), and these would be the queries and edits to be done:

  1. SELECT ?entity ?prop ?ref ?id WHERE {
      ?entity ?prop ?statement.
      ?statement prov:wasDerivedFrom ?ref.
      ?ref pr:P854 ?refURL.
      MINUS { ?ref pr:P245 []. }
      FILTER REGEX(STR(?refURL), "^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)")
      BIND(REPLACE(STR(?refURL),"^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)","") AS ?id)
    }
    LIMIT 500
    
    Try it! → add ULAN ID (P245) = ?id to the reference (now >1k cases)
  2. SELECT ?entity ?prop ?ref WHERE {
      ?entity ?prop [ prov:wasDerivedFrom ?ref ].
      ?ref pr:P245 [].
      MINUS { ?ref pr:P248 wd:Q2494649. }
    }
    
    Try it! → add stated in (P248) = Union List of Artist Names (Q2494649) to the reference (now ca. 70 cases)

Thanks for any comments!

Discussion


Request process

heavy coord job

Request date: 14 July 2018, by: Conny (talkcontribslogs)

Link to discussions justifying the request
Task description

Wikidata items with pictures from Commons, where both projects have the same coordinates, should display in Wikidata whether the coordinates are the camera position or the object position.

Discussion

@Conny: It's unclear what you want. Please explain it in more detail and provide an example edit. On Property talk:P625#camera/object_position I voiced my doubt if this should be on Wikidata. Multichill (talk) 14:06, 17 July 2018 (UTC)

@Multichill: We can exchange here again :) . My sample is on the linked talk page. Thank you, Conny (talk) 04:17, 18 July 2018 (UTC).
Request process

Normalize dates with low precision

Request date: 17 July 2018, by: Jc86035 (talkcontribslogs)

Link to discussions justifying the request
Task description

Currently all times between 1 January 1901 and 31 December 2000 with a precision of century are displayed as "20. century". If one enters "20. century" into a date field the date is stored as the start of 2000 (+2000-00-00T00:00:00/7), which is internally interpreted as 2000–2099 even though the obvious intent of the user interface is to indicate 1901–2000. Since the documentation conflicts with the user interface, there is no correct way to interpret this information, and some external tools which interpret the data "correctly" do not reflect editors' intent.

This is not ideal, and since it looks like this isn't going to be fixed in Wikibase (I have posted at Wikidata:Contact the development team but haven't got a response yet, and the Phabricator bugs have been open for years), I think a bot should convert every date value with a precision of decade, century or millennium to the midpoint of the period indicated by the label in English, so that there is less room for misinterpretation. For example, something that says "20. century" should ideally be converted to +1951-00-00T00:00:00/7 (or alternatively to +1950-00-00T00:00:00/7), so that it is read as 1901–2000 by humans looking at the item, as 1901–2000 (or 1900–1999) by some external tools, and as 1900–1999 by other external tools.

Classes of dates this would apply to:

  • Decades (maybe) – e.g. dates within 1950–1959 to +1954-00-00T00:00:00/8
  • Centuries – e.g. dates within 1901–2000 to +1951-00-00T00:00:00/7 or +1950-00-00T00:00:00/7 (depending on what everyone prefers)
  • Millennia – e.g. dates within 1001–2000 to +1501-00-00T00:00:00/6 or +1500-00-00T00:00:00/6

For everything less accurate (and over five digits), the value is displayed as "X years" (e.g. −10000-00-00T00:00:00Z/5 is displayed "10000 years BCE"). Incorrect precisions for years under five digits could be otherwise fixed, but it looks like the user interface just doesn't bother parsing them because people don't name groups of a myriad years.

While this is obviously not perfect and not the best solution, it is better than waiting an indefinite time for the WMF to get around to it; and if the user interface is corrected then most of the data will have to be modified anyway. Values which have ambiguous meaning (i.e. those which can be identified as not having been added with the wikidata.org user interface) should be checked before normalization by means of communication with the user who added them. Jc86035 (talk) 11:33, 17 July 2018 (UTC) (edited 14:13, 17 July 2018 (UTC) and 16:16, 17 July 2018 (UTC))
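
For centuries, the proposed rewrite is a plain arithmetic shift of the stored year. A sketch using pywikibot's WbTime and the 1951 variant discussed above:

import pywikibot

def century_midpoint(value):
    # +2000-00-00/7 is rendered as "20. century" (1901-2000) by the UI;
    # moving the stored year to the midpoint makes every reading agree.
    if value.precision != 7:
        return value
    return pywikibot.WbTime(year=value.year - 49, precision=7,
                            calendarmodel=value.calendarmodel)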

Discussion

I think the reasoning applies to centuries and millennia. I'd have to think about it a bit longer for decades. While I'm thinking, perhaps Jc86035 would clarify the request by explicitly stating every precision the request applies to. Also, when discussing precision, I find it inadvisable to use the terms "lower" or "greater". Terms such as "better" and "worse" or "looser" and "tighter" seem better to me.

I suppose this bot would have to be run on a regular schedule until the problem is fixed by the developers (or the Wikidata project is shuttered, whichever comes first). Jc3s5h (talk) 12:19, 17 July 2018 (UTC)

@Jc3s5h: I think running it at least once a day would be good. I've edited the proposal so that it only applies to decades, centuries and millennia, because Wikibase handles less precise dates differently (in general the time handling seems very nonstandard to me, probably because most people don't need to represent 13.7 billion years ago in the same time format as yesterday). Jc86035 (talk) 14:48, 17 July 2018 (UTC)
  • Support (edit conflict) I think it is a good idea. I also found it confusing on occasion. If there are templates and other tools that interpret Wikidata dates incorrectly, that is their bug, and it is beyond our control to debug each tool that uses Wikidata. However, I think it is a good idea to default to the mid-point of the time period for some of the confusing ones. (I would not do it for years, but for decades, centuries and millennia it would be fine.) --Jarekt (talk) 12:28, 17 July 2018 (UTC)
  • Oppose: looks like a workaround instead of solving the real problem. Multichill (talk) 14:02, 17 July 2018 (UTC)
    @Multichill: Yes, obviously this isn't the best solution, but the Phabricator bug is three years old now so it's not like Wikibase's date handling is suddenly going to be improved after years of nothing, so we may as well deal with it regardless of the pace of software development. The longer the issue sits around, the more data everyone has to revalidate after the issue is fixed. (They don't have enough staff to deal with comparatively minor things like this. Dozens of bugs e.g. in Kartographer have been closed as wontfix just because "there's no product manager".) Jc86035 (talk) 14:23, 17 July 2018 (UTC)
    Furthermore, currently ISO 8601 necessitates us using things like earliest date (P1319) and latest date (P1326) if there's any sort of non-trivial uncertainty range, yet Wikibase stores the user's initial value anyway. Wikibase does a lot of odd things like the aforementioned non-standard century handling and allowing "0 BCE" as a valid date. I don't think they have the resources to fix stuff like this. Jc86035 (talk) 14:43, 17 July 2018 (UTC)
  • Question: As an example, if the bot finds the datestamp +1900-01-01T00:00:00Z with precision 7, should it convert it to +1950-01-01T00:00:00Z or +1850-01-01T00:00:00Z? I do not believe a bot can trace back to when an entry was last changed and see if it was changed interactively or through the API. If through the API, I think the year should be 1950. If interactively, I think it should be 1850. Perhaps we could somehow examine the history of contributions, and see how many dates are entered interactively vs. the API. If one is vastly more frequent than the other, we could go with whatever predominates.
Whatever we do, we should stay away from any year before AD 101. As Jc86035 points out, the issues are compounded for BCE, and there are also some tricky points before 101. Jc3s5h (talk) 21:39, 17 July 2018 (UTC)
  • @Jc3s5h: Could you not find it through differing edit summaries (e.g. lots of additions by one user with #quickstatements)? I think it would be difficult but it would be possible with something like the WikiBlame tool. Jc86035 (talk) 06:06, 18 July 2018 (UTC)
    I would like to see a substantial sample, which would have to be gathered automatically. For example, all the date edits make on Wikidata for an entire day on each of 100 randomly chosen days. Jc3s5h (talk) 11:56, 18 July 2018 (UTC)
Request process

Request date: 21 July 2018, by: Микола Василечко (talkcontribslogs)

Link to discussions justifying the request
Task description
Licence of data to import (if relevant)
Discussion


Request process

Replace

In Ukrainian: сторінка значень в проекті Вікімедіа → сторінка значень проекту Вікімедіа (i.e. "disambiguation page in the Wikimedia project" → "disambiguation page of the Wikimedia project") --Микола Василечко (talk) 20:05, 21 July 2018 (UTC)

That's 1,000,000+ replacements. Was there any discussion justifying this task? Or why should it be done? Matěj Suchánek (talk) 08:21, 22 July 2018 (UTC)
I'm thinking... Wouldn't such massive changes be better done in DB directly... Say "update ... set ukdesc='сторінка значень проекту Вікімедіа' where ukdesc='сторінка значень в проекті Вікімедіа'". --Edgars2007 (talk) 12:15, 22 July 2018 (UTC)
Maybe but I doubt this would be allowed. Don't forget you need to have a way to update JSON of items, propagate this to WDQS etc... Matěj Suchánek (talk) 12:56, 22 July 2018 (UTC)

Fix labels for P31=Q5

A series of items about people was created recently with labels that include the enwiki disambiguator. Sample: Q55647666. This should be removed from the label.
--- Jura 15:17, 24 July 2018 (UTC)

Harvesting data from MusicBrainz

I got a list of 143,503 artists that have an entry at MusicBrainz. A bot could harvest data (links and miscellaneous info like birth dates) from the MusicBrainz database and add it to Wikidata. An example of such harvesting can be found here - adding an AllMusic ID found in the MusicBrainz database. My list is in CSV format and it looks like this:

  • http://www.wikidata.org/entity/Q71616,Max von Schenkendorf,d120de82-a486-4ec0-ab64-0c2c3a5b46f8
  • http://www.wikidata.org/entity/Q71626,DJ Dean,b768f66f-3260-41ec-941d-e7c82b1cb87b

and it is actually the result of this script - initial discussion. -- OneMusicDream (talk) 23:48, 30 July 2018 (UTC)
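
Assuming the three-column layout shown above, the loop for just the MusicBrainz artist ID (P434) could look like this sketch; birth dates and the other external links would each need their own, properly sourced handling:

import csv
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

with open("artists.csv", newline="", encoding="utf-8") as f:
    for entity_url, name, mbid in csv.reader(f):
        qid = entity_url.rsplit("/", 1)[-1]
        item = pywikibot.ItemPage(repo, qid)
        item.get()
        if "P434" in item.claims:      # MusicBrainz artist ID already set
            continue
        claim = pywikibot.Claim(repo, "P434")
        claim.setTarget(mbid)
        item.addClaim(claim, summary="add MusicBrainz artist ID from list")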

@OneMusicDream: Because MusicBrainz is a wiki (a user editable website), with no sourcing, you may hit WD:BLP issues importing personal data like birth dates. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:21, 3 August 2018 (UTC)
@Pigsonthewing: Thanks for your reply. Now I was thinking about this: If the MusicBrainz birth dates have a rate of errors like X%, and X is acceptably low, then maybe it's worth having a huge amount of data instead of not having it. Such errors will be corrected in time. If it's not worth collecting it, then I think that collecting at least the external links (or just a part of them, like only Discogs and Allmusic links) about those artists is probably worth doing. Partial collecting. And then maybe in the future doing more partial collecting. And so on. OneMusicDream (talk) 16:41, 5 August 2018 (UTC)

Cleanup bot description

Can someone fix the following descriptions:

SELECT *
{
    ?item schema:description "Non-profit organisation in the USA"@en 
}

Try it! Currently some 1,300
--- Jura 22:09, 3 August 2018 (UTC)

 Doing… [7] Matěj Suchánek (talk) 07:45, 4 August 2018 (UTC)
  • Thanks for fixing the caps. As it's in the US, maybe "organization" should be used too.
    --- Jura 07:58, 4 August 2018 (UTC)

Database of Classical Scholars

Request date: 9 August 2018, by: Jonathan Groß (talkcontribslogs)

Task description

The Database of Classical Scholars has shifted their database to a new home, and in the process changed all IDs. While this might be regrettable in itself, the editors of the database are very forthcoming in trying to alleviate the problem. They sent me a chart (link to GoogleDrive) which, among other things, contains the OLD and NEW IDs in columns G and H.

The main task would be to fetch all items with Database of Classical Scholars ID (P1935) and replace old IDs with new ones. The links would also have to be adapted to the new format: Contrary to before, links to entries in the database are no longer just IDs but also require a combination of surname and given name(s). The chart has these values in columns B to D. The new IDs consist of ID[four arabic digits]-SURNAME[in caps]-Firstname-Lastname.

In addition, one could check the pages corresponding to these items on dewiki and enwiki which use Template:DBCS (Q26006668): in case one of those templates has an old ID, replace it with the new one.

Many thanks in advance! Jonathan Groß (talk) 14:18, 9 August 2018 (UTC)

Discussion


Request process

Accepted by (Magnus Manske (talk) 14:31, 9 August 2018 (UTC)) and under process

Actually, this might not work as expected. For the new ID 8494, the URL is https://dbcs.rutgers.edu/all-scholars/8494-ABBOTT-Kenneth-Morgan but how can we link there with just the ID? https://dbcs.rutgers.edu/all-scholars/8494 doesn't work, and neither does https://dbcs.rutgers.edu/index.php?page=person&id=8494 . Not touching this until we clear this up. --Magnus Manske (talk) 14:38, 9 August 2018 (UTC)
Yes, that's a problem. I suggested to the editors that they install some sort of redirects, so that just-ID-links work as well. Let's see what they say. I'll get back to you then. Jonathan Groß (talk) 14:43, 9 August 2018 (UTC)
Given that the old URLs are broken anyway, I made a new Mix'n'match catalog, matched the entries according to the spreadsheet, and am now changing the IDs to the new ones. I preserved the old catalog (deactivated), in case we need it again. --Magnus Manske (talk) 15:10, 9 August 2018 (UTC)
Thanks a lot! Jonathan Groß (talk) 07:17, 10 August 2018 (UTC)

Hungarian citizenship

Request date: 12 August 2018, by: Bencemac (talkcontribslogs)

Task description

I would like to ask for the following changes where instance of (P31) is human (Q5) and country of citizenship (P27) is Hungary (Q28). I tried to write the queries myself but was not able to; please forgive me, I am bad at Wikidata Query (yet). If this is possible to do, I would have similar requests in the future (for before 1946). Bencemac (talk) 17:59, 12 August 2018 (UTC)

1. Born after 1989-10-23 (optional)

  • the person is still alive:
  • the person is dead

2. Born after 1949-08-20

  • the person died before 1989-10-23:
  • the person died after 1989-10-23:
  • the person is still alive:

3. Born after 1946-02-01

  • the person died before 1949-08-20:
  • the person died before 1989-10-23:
  • the person died after 1989-10-23:
  • the person is still alive:
Discussion
SELECT ?person ?personLabel ?birthdate ?deathdate 
WHERE {
  ?person wdt:P31 wd:Q5;
          wdt:P27 wd:Q28;
          wdt:P569 ?birthdate;
          wdt:P570 ?deathdate.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

SELECT ?person ?personLabel ?birthdate ?deathdate 
WHERE {
  ?person wdt:P31 wd:Q5;
          wdt:P27 wd:Q16410;
          wdt:P569 ?birthdate;
          wdt:P570 ?deathdate.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

@Bencemac: You can use the above queries to extract data from WD and export it into Excel, where you can perform your modifications and later import the changes into WD. But a query can't modify the data. You need another tool for that. Snipre (talk) 20:17, 28 August 2018 (UTC)

@Snipre: Thanks, but I do not want to do it myself. I have never done this before and I am afraid of doing something bad, so it would need an experienced user. Bencemac (talk) 14:39, 29 August 2018 (UTC)
Request process

Clean up multiple coordinates in wikiitems

Request date: 17 August 2018, by: Bouzinac (talkcontribslogs)

Link to discussions justifying the request
Topic:Uizegviq5kuoe2xx
Task description

For every wiki item that has multiple coordinate location (P625) statements at the same normal rank and none at preferred rank, mark only one coordinate location (P625) as "preferred". Order of preference: to be determined by what the bot is capable of doing.

Licence of data to import (if relevant)
Discussion
IMO the proper solution is to decide which one is correct (or import it from elsewhere). It makes no sense to me to have "more and less preferred" coordinates. Matěj Suchánek (talk) 11:06, 17 August 2018 (UTC)
  • From the discussion on French project chat, the request was limited to airports and the idea is to not set preferred ranks to coordinates of some wikis. Given that cebwiki coordinates are known to be rounded, I think it is reasonable to do that for these. Obviously, if there is a large difference between the values, one should probably check them manually.
    --- Jura 11:16, 17 August 2018 (UTC)
Request process

Incorrectly imported text fields

Request date: 17 August 2018, by: Jc86035 (talkcontribslogs)

Task description

Special:Search/all: u00 reveals that there are about 1,700 items with auto-generated labels/descriptions/other text fields which contain escape codes for Unicode characters. Special:Search/all: &amp; insource:/&amp;/ has some others, and there are probably more searches which would turn up errors like these. (Some of the descriptions, like the one for Tito Pérez (Q29521467), are also clearly for a different language to the one specified – for that item, the "en" description is clearly not in English.) I think these should be fixed, preferably by looking at the batch of edits in which the bad data was imported and fixing each batch of items. Jc86035 (talk) 18:13, 17 August 2018 (UTC)

Discussion
Yay for better indexing. I will probably restore Wikidata:Requests for permissions/Bot/MatSuBot 6. Matěj Suchánek (talk) 10:47, 18 August 2018 (UTC)
Request process

Importing data from epcrugby.com (Property P3666)

Request date: 27 August 2018, by: Blackcat (talkcontribslogs)

Hello, both the URL formatter and the ID values for EPCR player ID (P3666) have changed. The latter is the more important, because the former can be easily changed. Basically, we now have a different ID for each player (e.g. for Martin Castrogiovanni (Q1039026) the old one was www.epcrugby.fr/info/archives_jouers.php?player=143&includeref=dynamic and the new one is https://www.epcrugby.com/player?PlayGuid=MC322193), which currently makes the property unusable.

Task description

In this sandbox I dumped all the items that use the said property. Is there a way to acquire the new values from the EPCR web site?

Licence of data to import (if relevant)
Discussion

@Blackcat: What about the licence of the data on epcrugby.com? Snipre (talk) 19:52, 28 August 2018 (UTC)

Honestly I don't know @Snipre:, anyway we don't have to import any data, only acquire the new ID for each player who played in the European rugby cups; we already have a property for that. -- Blackcat (talk) 20:19, 28 August 2018 (UTC)
When I looked at the first page, I found © 2018 Content European Professional Club Rugby, Statistical Data © European Professional Club Rugby. Even extracting the IDs can be a problem if there is systematic extraction by a bot. The best option is to ask the website to explicitly free the IDs. Snipre (talk) 20:26, 28 August 2018 (UTC)
Request process

Bot to share PRELIB data

Request date: 29 August 2018, by: Hopala! (talkcontribslogs)

Link to discussions justifying the request
Task description

Hello, I am the database administrator of PRELIB, a database about literature in Breton edited by the CRBC, a research library in Brest (France) specializing in Brittany and the Celtic countries. We already share data with IdRef (whose identifiers are in use on Wikidata) and try to align our data with Wikidata (see for instance the page about Pierre-Jakez Hélias, which references its equivalent in Wikidata). However, many people referenced in our database are not yet on Wikidata. We would be pleased to have the ability to create new Wikidata items from our data with pywikibot. Also, would it be possible to have a bot for test.wikidata.org (on which we already have a user account, Bihanrobot)? Thanks,

Licence of data to import (if relevant)

The data is from books or publications by our researchers.

Discussion

@Hopala!: You need to create a bot and then ask for a bot flag. Or, if you provide the data set in Excel or CSV, you can ask another bot operator to perform the data import for you. But first, it is more important to map your data model to that of WD, in order to be sure that your data import will fit the data structure of WD. Snipre (talk) 08:36, 30 August 2018 (UTC)

Thanks @Snipre:. Bot created and permission requested. I would rather perform the import myself at the moment. I have started to map my data to WD items, but is it possible to test Pywikibot scripts on test.wikidata.org? --Hopala! (talk) 13:49, 30 August 2018 (UTC)
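
Pywikibot can target test.wikidata.org directly; a minimal item-creation sketch, assuming the Bihanrobot credentials are configured in user-config.py:

import pywikibot

site = pywikibot.Site("test", "wikidata")
repo = site.data_repository()

item = pywikibot.ItemPage(repo)   # unsaved; it gets a fresh Q-number on save
item.editLabels({"fr": "essai PRELIB"}, summary="test item creation")
print(item.getID())
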
Request process

elevation above sea level (P2044) values imported from ceb-Wiki

Request date: 6 September 2018, by: Ahoerstemeier (talkcontribslogs)

Link to discussions justifying the request
  • Many items have their elevation imported from the Cebuano Wikipedia. However, the way the bot created the values is very faulty; especially due to inaccurate coordinates, the value can differ by up to 500 m! Thus most of the values are utter nonsense, and some are a rough approximation, but certainly not good data. To make things worse, the imported from Wikimedia project (P143) qualifier often wasn't added. For an extreme example see Knittelkar Spitze (Q1777201).
Task description

Firstly, a bot has to add all the missing imported from Wikimedia project (P143) statements omitted in the original infobox harvesting. Secondly, especially for mountains and hills, the value has to be set to deprecated rank, to prevent it from poisoning our good data.

Licence of data to import (if relevant)
Discussion


Request process

Fixing URLs in Sources

Request date: 12 September 2018, by: MichaelSchoenitzer (talkcontribslogs)

Task description

After an update of the software hosting the site, all the links to tags in the GNOME GitLab don't work anymore. They are used heavily in sources for version numbers. Can someone update the 240 links with a bot/script? From

https://git.gnome.org/browse/([^/]*)/tag/?h=(.*)

to

https://gitlab.gnome.org/GNOME/$1/tags/$2

Here's a query searching for the cases:

select ?item ?st ?url where {
  ?item p:P348 ?st.
  ?st prov:wasDerivedFrom ?src.
  ?src pr:P854 ?url.
  FILTER CONTAINS(STR(?url), "https://git.gnome.org/browse")
  }

Try it!
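
The rewrite itself is a one-line substitution; note that the literal "?" in the old URL pattern must be escaped in the regex (gedit is used as a hypothetical example project):

import re

OLD = re.compile(r"https://git\.gnome\.org/browse/([^/]*)/tag/\?h=(.*)")

def migrate(url):
    return OLD.sub(r"https://gitlab.gnome.org/GNOME/\1/tags/\2", url)

assert (migrate("https://git.gnome.org/browse/gedit/tag/?h=3.28.1")
        == "https://gitlab.gnome.org/GNOME/gedit/tags/3.28.1")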

Discussion
Request process

Import and maintain nominal GDP for countries from the World Bank Data API

Request date: 17 September 2018, by: WDBot (talkcontribslogs)

Link to discussions justifying the request
Task description

A bot to load nominal GDP from the World Bank API and write it to Wikidata country items (property https://www.wikidata.org/wiki/Property:P2131).

  1. load the country information (retrieved from query.wikidata.org and copy-pasted into the script)
  2. iterate over each country
  3. check if data (nominal GDP in US dollars) is available on the World Bank; if not, go to the next country (see the sketch below)
  4. load the first value of the World Bank data
    1. check all existing nominal GDP statements for the value
    2. skip if the value is already present
    3. write it if the value is not present
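
A sketch of the data side of steps 3 and 4, assuming the World Bank v2 API and the NY.GDP.MKTP.CD indicator (nominal GDP, current US$); the API returns the most recent values first:

import requests

def nominal_gdp(iso3):
    url = f"https://api.worldbank.org/v2/country/{iso3}/indicator/NY.GDP.MKTP.CD"
    r = requests.get(url, params={"format": "json", "per_page": 100})
    r.raise_for_status()
    _meta, rows = r.json()
    return [(row["date"], row["value"]) for row in rows
            if row["value"] is not None]

print(nominal_gdp("DEU")[:3])  # newest German values first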

Code: link to code [8]

You can find test edits here:

  • Bulgaria (example when there is no data): [9]
  • Germany (example with only one missing value for the year 2000): [10]
Licence of data to import (if relevant)

CC BY-4.0 - see here https://datacatalog.worldbank.org/public-licenses#cc-by

Discussion
I think the references should also have a property pointing to either the World Bank or its database, rather than only pointing to the URL. Maybe publisher (P123) or published in (P1433). --Yair rand (talk) 21:07, 17 September 2018 (UTC)
Hi Yair rand and thank you for your feedback. I have adjusted the script to write "publisher" too. Here you can see the example for France and USA on test.wikidata.org. You can see the new script here. --WDBot (talk) 20:24, 18 September 2018 (UTC)
If you need the approval for the bot, you're looking for Wikidata:Requests for permissions/Bot — regards, Revi 06:27, 19 September 2018 (UTC)
Request process