Wikidata:Bot requests

If you have a bot request, add a new section using the button and describe exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the relevant WikiProject's talk page. Please refer to previous discussions justifying the task in your request.

For botflag requests, see Wikidata:Requests for permissions.

Tools available to all users which can be used to accomplish the work without the need for a bot:

  1. PetScan for creating items from Wikimedia pages and/or adding the same statement to many items
  2. QuickStatements for creating items and/or adding different statements to items
  3. Harvest Templates for importing statements from Wikimedia projects
  4. Descriptioner for adding descriptions to many items
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/07.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 2 days.

You may find these related resources helpful:

Dataset Imports · Why import data into Wikidata · Learn how to import data · Bot requests · Ask a data import question

items for segments

For many anthology films (Q336144), it can be worth creating an item for each segment (sample: Q16672466#P527). Such items can include details on the director, cast, etc. as applicable (sample: Q26156116).

The list of anthology films includes already existing items.

This task is similar to #duos_without_parts above.
--- Jura 10:14, 22 April 2017 (UTC)

What source could the bot use? Matěj Suchánek (talk) 07:25, 25 August 2017 (UTC)
Good question. I checked the first 12 films on the list above and about half had a WP article detailing the episodes.
For automated import, the structure of these articles might not be sufficiently standard:
  • section header per episode (ru), (pl)
  • table with names of episodes (de),
  • section with more details for each episode (es), (ru), (ca)
Maybe a user would need to gather the list for each film and a tool would create segment items. I guess I could try that on a spreadsheet.
--- Jura 08:49, 25 August 2017 (UTC)
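If a user gathers the segment list for each film in a spreadsheet, a small script could turn each row into QuickStatements commands that create the segment items and link them to the parent film. A minimal sketch, assuming a hypothetical CSV with columns film_qid, segment_label and director_qid; the exact statement model (for instance which P31 value to use for segments) would follow whatever is agreed:

import csv

# Hypothetical input: one row per segment, holding the parent film's QID,
# an English label for the segment and (optionally) the director's QID.
with open('segments.csv', newline='', encoding='utf-8') as f:
    for row in csv.DictReader(f):
        print('CREATE')
        print('LAST\tLen\t"{}"'.format(row['segment_label']))
        print('LAST\tP361\t{}'.format(row['film_qid']))         # part of: the anthology film
        if row.get('director_qid'):
            print('LAST\tP57\t{}'.format(row['director_qid']))  # director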
  • I think it's still worth doing.
    --- Jura 10:43, 1 March 2018 (UTC)

Import lighthouses from enwiki

Per discussion at Wikidata_talk:WikiProject_Lighthouses#enwiki_bot_import.3F, please import the remaining lighthouses at

http://petscan.wmflabs.org/?psid=1187483 (from w:Category:Pages using infobox Lighthouse needing Wikidata item)

by creating new items and adding the new qid to the enwiki template. There is a mapping of properties at Wikidata:WikiProject_Lighthouses/tools#Mapping_of_infobox_properties_for_lighthouses. Some fields may not be suitable for bot import and could be skipped.

All these lighthouses are covered in more general articles about the region/island/place; such articles may include another infobox about that.
--- Jura 06:17, 30 July 2017 (UTC)

  • I think it's still worth doing.
    --- Jura 10:43, 1 March 2018 (UTC)

Resolving Wikinews categories from "wikimedia category"

Request date: 12 August 2017, by: Billinghurst (talkcontribslogs)

Link to discussions justifying the request
Task description

A number of Wikinews items which are categories have been labelled as instance of (P31) -> Wikimedia category (Q4167836), which I am told is incorrect. It would be worthwhile running a query for where this exists and removing these statements, and the corresponding labels where the Wikinews link is the only sitelink on an item, and generating a list of those which exist among a bundle of sister sitelinks. This would enable an easier merge of those independent items into their corresponding existing items, and a tidy-up of items that are confused.

Discussion

Lymantria (talkcontribslogs) and I were at cross purposes, as items with that statement were being merged into other items. We had a confluence of rule issues: the merging of Wikimedia category items into items versus the merging of Wikinews categories into items. It is now better understood.

Request process

Redirects after archival

Request date: 11 September 2017, by: Jsamwrites (talkcontribslogs)

Link to discussions justifying the request
Task description

Retain links to the original discussion sections on discussion pages, even after archival, by allowing redirection.

Licence of data to import (if relevant)
Discussion


Request process

Copy taxon common name to alias

We need a bot, please, to copy values from taxon common name (P1843) to aliases in the appropriate languages, like this edit. A periodic update would be good, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:00, 15 November 2017 (UTC)
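A minimal Pywikibot sketch of such a copy, assuming the taxon common name (P1843) values are simply appended to the aliases of the matching language when they are neither the label nor an existing alias; the label and capitalization handling discussed below would still have to be added:

import pywikibot
from pywikibot import pagegenerators

QUERY = 'SELECT DISTINCT ?item WHERE { ?item wdt:P1843 [] } LIMIT 100'

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
    item.get()
    new_aliases = {}
    for claim in item.claims.get('P1843', []):
        name = claim.getTarget()                    # a WbMonolingualText
        lang, text = name.language, name.text
        existing = {item.labels.get(lang, '')} | set(item.aliases.get(lang, []))
        if text not in existing and text not in new_aliases.get(lang, []):
            new_aliases.setdefault(lang, list(item.aliases.get(lang, []))).append(text)
    if new_aliases:
        item.editAliases(aliases=new_aliases,
                         summary='Add taxon common names (P1843) as aliases')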

At least two steps should happen before this:
  1. copy to labels where missing
  2. update labels/aliases with different capitalization (upper to lowercase)
At the moment, there are around 300,000 (!) aliases to be added. Matěj Suchánek (talk) 18:14, 14 December 2017 (UTC)
This is not a good idea at all. Please don't. - Brya (talk) 17:35, 27 December 2017 (UTC)
Yet another vague, negative comment, with no justification offered. Meanwhile, here's a quote from Help:Aliases (emboldening added): "The label on a Wikidata entry is the most common name that the entity would be known by to readers. All of the other common names that an entry might go by, including alternative names; acronyms and abbreviations; and alternative translations and transliterations, should be recorded as aliases". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:13, 27 December 2017 (UTC)
This Help:Aliases was written a very long time ago, at a time when Wikidata was envisioned as just a depository of sitelinks. - Brya (talk) 18:23, 27 December 2017 (UTC)
Indeed it was written a long time ago - and it remains current. But Wikidata was never "envisioned as just a depository of sitelinks". Still no justification offered. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:21, 27 December 2017 (UTC)
Plenty of users still envision Wikidata as just a depository of sitelinks, even today. And clearly Help:Aliases was written from that perspective. - Brya (talk) 05:08, 28 December 2017 (UTC)
"Plenty of users..." While you appear to be making things up, or at the very least making claims without substantiating them, there can be no useful dialogue with you. And please stop double-indenting your comments; as explained to you previously, it breaks the underlying HTML markup. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:44, 28 December 2017 (UTC)
To anybody sane, HTML is a tool to be used towards a purpose. Regarding HTML as a purpose unto itself goes beyond mere idiosyncrasy. - Brya (talk) 17:53, 28 December 2017 (UTC)
Sounds like a good idea to me. I think it's important to link the common names to the proper species name. I am not sure about the specific implementation. For example, I don't have a strong preference to have the taxon name or the common name as label, but the other should be an alias for sure. That makes it a lot easier to recognize the proper entry. --Egon Willighagen (talk) 14:34, 29 December 2017 (UTC)
Just to cite a current problem involving the "interpretation" of a common name: Wikidata:Project_chat#Xantus's_Murrelet. As far as I know we had a major import from Wikispecies without any references, and some referenced additions via IOC, IUCN or Wörterbuch der Säugetiernamen - Dictionary of Mammal Names (Q27310853) done by my bot. I object to adding any unreferenced common names as aliases. I have no problem with the latter ones, if there is an agreement here to do so. --Succu (talk) 21:33, 29 December 2017 (UTC)
Yes, there are at least two problem sets of common names:
  • those moved from Wikispecies (by Andy Mabbett). After this had been done, Wikispecies did not want common names from Wikidata imported into Wikispecies because of the dubiousness of this content.
  • those from ARKive, proposed by Andy Mabbett as a property for being a "repository of high-quality, high-value media" but used by some user to "import common names" from. Many of these are not just dubious, but outright bogus. - Brya (talk) 05:39, 30 December 2017 (UTC)
Your indentation fixed, again. The idea that Wikispecies did not want to use content from, er, Wikispecies is, of course, laughable. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:39, 30 December 2017 (UTC)
Laughable? I doubt you can show us a discussion where people at Wikispecies are eager to drop their content related to taxon common name (P1843) and reimport it from here. --Succu (talk) 20:15, 30 December 2017 (UTC)

No cogent reason why taxon common names are not aliases has been given; can someone action this, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:26, 25 February 2018 (UTC)

Two prerequisites are given by Matěj. Do you agree with them? --Succu (talk) 22:32, 25 February 2018 (UTC)
The former, certainly; the latter could be done, but otherwise we already have a bot which does that on a regular basis. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:13, 1 March 2018 (UTC)

The taxonomy and what common names define are not necessarily the same. Your assumption is that you can retrofit common names onto what science thinks up. Do you have proof of that? Or does it take a single example to undo this folly? Thanks, GerardM (talk) 07:23, 2 March 2018 (UTC)

Oh go on: please show us an example of an item with a valid entry for taxon common name (P1843) that is not valid as an alias for that item. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:40, 5 April 2018 (UTC)
No example? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:16, 9 April 2018 (UTC)

Can someone action this request, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:26, 16 June 2018 (UTC)

Importing IRFU player numbers into P3848 property

Request date: 19 November 2017, by: Blackcat (talkcontribslogs)

Link to discussions justifying the request
Task description

Irish Rugby Football Union men's player ID (P3848) has recently been created. I created here a table with the data collected on irishrugby.ie, containing player names and player numbers. I wanted to know whether a bot could import the player IDs into the respective Wikidata items.

Licence of data to import (if relevant)
Discussion

Hi Blackcat (talkcontribslogs),

I think this task can be achieved using a bot. There are several questions in my mind:

--Tozibb (talk) 09:51, 22 November 2017 (UTC)

Hello Tozibb, the source is here. .. Blackcat (talk) 13:31, 22 November 2017 (UTC)
@Blackcat: Do you have a mapping of player names on irishrugby.ie to Q-IDs on Wikidata? --Pasleim (talk) 08:27, 17 December 2017 (UTC)
Hello @Pasleim:, I am not sure I have understood your question; could you please be so kind as to show me an example? -- Blackcat (talk) 13:13, 17 December 2017 (UTC)
On the page you provided we get, for each player name, his irishrugby ID. The problem is that there could be multiple persons with the same name, so to add these irishrugby IDs to Wikidata we need to know in addition which ID they have here on Wikidata. --Pasleim (talk) 15:00, 18 December 2017 (UTC)
It sounds like this might be better done in Mix'n'Match. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:33, 30 December 2017 (UTC)
@Blackcat: Does M'n'M meet your needs? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:05, 2 February 2018 (UTC)
@Pigsonthewing: Hello, sorry for the delay, I had some tasks in real life that kept me busy :-) I haven't had a look at it yet; let me see what it's about. -- Blackcat (talk) 12:14, 2 February 2018 (UTC)

@Blackcat: Are we done here? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:29, 16 June 2018 (UTC)

@Pigsonthewing:, I had a look at it but I am not very skilled at it. I raised the question here just because I thought there might be someone who could solve this issue... :) -- Blackcat (talk) 20:04, 16 June 2018 (UTC)
@Gerwoman:, who can hopefully help with scraping the website. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:19, 16 June 2018 (UTC)
@Blackcat, Pigsonthewing: Catalog created. 1327 Enjoy! --Gerwoman (talk) 09:21, 17 June 2018 (UTC)
@Gerwoman:, thanks and congratulations! -- Blackcat (talk) 09:49, 17 June 2018 (UTC)

Fetch coordinates from OSM

We have many items with an OSM relation ID (P402), and no coordinates:

SELECT ?item ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P402 ?OpenStreetMap_Relation_identifier.
  MINUS { ?item wdt:P625 ?coordinate_location. }
}
LIMIT 1000

Try it!

Can someone please fetch the coordinates from OSM? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:59, 27 November 2017 (UTC)

An OSM relation ID sometimes denotes a rather big area (a municipality, a nature reserve, etc.). Which coordinates should be taken? --Herzi Pinki (talk) 19:19, 27 November 2017 (UTC)
The centroid. If that's not possible, the centre of the bounding box. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:39, 27 November 2017 (UTC)
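A hedged sketch of how a bounding-box centre could be fetched for an OSM relation, using the public Overpass API; the endpoint, the timeout and the choice of the bounding-box centre (rather than a true centroid) are assumptions, and the licence question raised below would need to be settled first:

import requests

OVERPASS = 'https://overpass-api.de/api/interpreter'  # assumed public endpoint

def relation_bbox_centre(rel_id):
    """Return (lat, lon) of the bounding-box centre of an OSM relation."""
    query = '[out:json];relation({});out bb;'.format(rel_id)
    data = requests.get(OVERPASS, params={'data': query}, timeout=60).json()
    bounds = data['elements'][0]['bounds']
    return ((bounds['minlat'] + bounds['maxlat']) / 2,
            (bounds['minlon'] + bounds['maxlon']) / 2)

print(relation_bbox_centre(62422))  # example relation ID, purely illustrative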
@Pigsonthewing: It's a bit late to comment and I'm sure you're aware of this, but this won't work for public transport and route relations because they're either lines or made up of other relations, and won't work for objects with multiple locations like Toys "R" Us (Q696334). It's probably not a good idea to add coordinates to those with an automated process unless the resulting coordinates accurately represent a point that is actually on the line, or represent all points of a multiple-node relation (and neither of these may be desirable). Jc86035 (talk) 13:06, 27 December 2017 (UTC)
Sets of objects with multiple locations should not be in an OSM relation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:38, 27 December 2017 (UTC)
What about ODBL license of OSM data? Mateusz Konieczny (talk) 14:45, 27 December 2017 (UTC)
From what I checked, importing from OSM would be a violation of its license; therefore I propose to reject this proposal and to remind importers to check the copyright status of imported data. Mateusz Konieczny (talk) 09:34, 10 May 2018 (UTC)
You assume that OSM is correct to assert copyright ownership over individual facts, or that its database rights in the EU apply to Wikidata as a US-hosted initiative; it is far from clear that this is the case for either scenario. See also Wikidata:Project chat#‎OpenStreetMap. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 10 May 2018 (UTC)
It appears that the Wikimedia legal team disagrees with you: "For EU databases, bots or other automated ways of extracting data should also be avoided because of the Directive's prohibition on 'repeated and systematic extraction' of even insubstantial amounts of data", from https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights#Conclusion. Mateusz Konieczny (talk) 08:18, 11 May 2018 (UTC)
I suggest you read the many disclaimers at the start of that page; and always refer to them when citing it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:32, 16 June 2018 (UTC)

Automatic creation of labels in Arabic dialects for Wikidata entities from existing labels in Modern Standard Arabic for these items

Request date: 29 December 2017, by: Csisc (talkcontribslogs)

Link to discussions justifying the request
Task description

There is no label in most of the Arabic dialects for entities in Wikidata. However, there are several simple rules that can be easily applied using Pywikibot to add labels in Arabic dialects to entities in Wikidata:

  • The label of a proper entity (Person, Place, Trademark...) in Modern Standard Arabic is the same as the one of such an entity in the following Arabic dialects: South Levantine Arabic (ajp), Gulf Arabic (afb), Hejazi Arabic (acw), Najdi Arabic (ars), Hadhrami Arabic (ayh), Sanaani Arabic (ayn), Ta'izzi-Adeni Arabic (acq), and Mesopotamian Arabic (acm).
  • Labels of places and people (if they do not hold another citizenship) from Palestine, Jordan, Syria, Iraq, Kuwait, Yemen, Oman, Bahrain, Qatar, UAE, Saudi Arabia, Sudan, Djibouti, Comoros, Somalia, and Mauritania are the same in all Arabic dialects as in Modern Standard Arabic (a minimal script sketch follows this list).
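A minimal Pywikibot sketch of the copying step under those rules, assuming the Modern Standard Arabic ('ar') label is copied verbatim to each dialect code where no label exists yet; the selection of items that actually satisfy the conditions above would still have to come from a query:

import pywikibot

# Dialect codes taken from the rules above.
DIALECTS = ['ajp', 'afb', 'acw', 'ars', 'ayh', 'ayn', 'acq', 'acm']

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def copy_arabic_label(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    msa = item.labels.get('ar')
    if not msa:
        return
    new_labels = {code: msa for code in DIALECTS if code not in item.labels}
    if new_labels:
        item.editLabels(labels=new_labels,
                        summary='Copy Modern Standard Arabic label to Arabic dialect languages')

copy_arabic_label('Q90')  # example item, purely illustrative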
Discussion

Problem solved. This can be done using Wikidata query service and QuickStatements. --Csisc (talk) 00:21, 20 February 2018 (UTC)

Request process

Rank P150 (sub-admin regions) as preferred

Request date: 31 January 2018, by: Yurik (talkcontribslogs)

Link to discussions justifying the request
  • This was mentioned to me by @Jura1:. I'm not sure if this is controversial enough to warrant a big discussion.
Task description

Set rank=preferred on all contains administrative territorial entity (P150) statements that do not have an end time (P582) qualifier, if some of the entity's P150 statements have an end time and some do not. Without this, querying for wdt:P150 returns both the old and the current results together. In some cases, P150 statements with a P582 qualifier were incorrectly set as deprecated, leaving the current ones at normal rank. These should also be fixed. CC: @Laboramus:.
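A hedged Pywikibot sketch of the ranking step, assuming the items to process come from a query; statements are only promoted when the item mixes P150 statements with and without an end time (P582) qualifier, and incorrectly deprecated ended statements are reset to normal rank:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def rank_current_parts(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    claims = item.claims.get('P150', [])
    ended = [c for c in claims if 'P582' in c.qualifiers]
    current = [c for c in claims if 'P582' not in c.qualifiers]
    if not ended or not current:
        return  # nothing to disambiguate
    for claim in current:
        if claim.rank == 'normal':
            claim.changeRank('preferred')
    for claim in ended:
        if claim.rank == 'deprecated':
            claim.changeRank('normal')

rank_current_parts('Q64')  # example item, purely illustrative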

Discussion
Request process

Semi-automated import of information from Commons categories containing a "Category definition: Object" template

Request date: 5 February 2018, by: Rama (talkcontribslogs)

Link to discussions justifying the request
Task description

Commons categories about one specific object (such as a work of art, archaeological item, etc.) can be described with a "Category definition: Object" template [1]. This information is essentially a duplicate of what is or should be on Wikidata.

To prove this point, I have drafted a "User:Rama/Catdef" template that uses Lua to import all relevant information from Wikidata and reproduces all the features of "Category definition: Object", while requiring only the Q-Number as parameter (see Category:The_Seated_Scribe for instance). This template has the advantage of requesting Wikidata labels to render the information, and is thus much more multi-lingual than the hand-labeled version (try fr, de, ja, etc.).

I am now proposing to deploy another script to do the same thing the other way round: import data from the Commons templates into the relevant fields of Wikidata. Given the variety of ways a human can label or mislabel information in a template such as "Category definition: Object", I think that the script should be a helper tool for importing data: it is to be run on one category at a time, with a human checking the result, and correcting and completing the Wikidata entry as required. For now, I have been testing and refining my script on subcategories of [2] Category:Ship models in the Musée national de la Marine. You can see the result in the first 25 categories or so, and the corresponding Wikidata entries.

The tool is presently in the form of a Python script with a simple command-line interface:

  • ./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 reads the information from Commons, parses it, renders the various fields in the console for debugging purposes, and creates the required Wikibase objects (e.g. text fields for inventory numbers, Q-items for artists and collections, WbQuantity for dimensions, WbTime for dates, etc.).
  • ./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 --commit does all of the above, creates a new Q-item on Wikidata, and commits all the information to the relevant fields.

Ideally, when all the desired features have been implemented and tested, this script might be useful as a tool where one could enter the

Licence of data to import (if relevant)

The information is already on Wikimedia Commons and is common public knowledge.

Discussion


Request process

Repair "en-ca"-labels for categories[edit]

#Some samples (incomplete)
SELECT DISTINCT ?item ?itemLabel ?l 
{
	?item wdt:P31 wd:Q4167836 ; wdt:P301 [] ; rdfs:label ?l .
	FILTER( LANG(?l) = "en-ca" && !STRSTARTS( STR(?l), "Category:" ) )
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
LIMIT 10

Try it!

There seem to be a couple of items with "en-ca" labels that lack the "Category:" prefix. This didn't matter much when internal search wasn't as good as it is now, but currently these pages get listed even when one searches in English, e.g. "Raphinae" lists Category:Raphinae (Q8634300).

This can be fixed by adding the missing "Category:". Sample edit: [3]

Some of the en-gb ones seem to have been fixed already.
--- Jura 19:29, 9 February 2018 (UTC)
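A minimal Pywikibot sketch of the fix, taking the items from the query above; only the en-ca case is handled, en-gb would be analogous:

import pywikibot
from pywikibot import pagegenerators

QUERY = '''
SELECT DISTINCT ?item WHERE {
  ?item wdt:P31 wd:Q4167836 ; wdt:P301 [] ; rdfs:label ?l .
  FILTER( LANG(?l) = "en-ca" && !STRSTARTS( STR(?l), "Category:" ) )
}
'''

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
    item.get()
    label = item.labels.get('en-ca')
    if label and not label.startswith('Category:'):
        item.editLabels(labels={'en-ca': 'Category:' + label},
                        summary='Add missing "Category:" prefix to en-ca label')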

Re-import P569 from dewiki

Checking P569 for some people, I found that many with a date of birth (P569) precision of "century" have more precise dates available. Several of these seem to be imports from dewiki.

Maybe attempting to re-import such dates could help: sample query, sample edit.
--- Jura 08:35, 16 February 2018 (UTC)

Remove statement with Gregorian date earlier than 1584 (Q26961029)

SELECT ?item ?property (YEAR(?year) as ?yr)
{
    hint:Query hint:optimizer "None".
    ?a pq:P31 wd:Q26961029 .
    ?item ?p ?a .
    ?a ?psv ?x .
    ?x wikibase:timeValue ?year .
    ?x wikibase:timePrecision 7 .        
    ?x wikibase:timeCalendarModel wd:Q1985727 .        
    ?property wikibase:statementValue ?psv    .   
    ?property wikibase:claim ?p     .  
}
LIMIT 15704

Try it!

The above dates have year precision and Proleptic Gregorian calendar (Q1985727) as the calendar model. I think they could be converted to Julian and the statement with Gregorian date earlier than 1584 (Q26961029) qualifier removed.
--- Jura 09:19, 24 February 2018 (UTC)
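A hedged Pywikibot sketch of the conversion for a single statement, keyed on the presence of the Q26961029 qualifier; Q1985786 (proleptic Julian calendar) is used as the new calendar model, and only the year is carried over, which is enough for year-or-lower precision:

import pywikibot

JULIAN = 'http://www.wikidata.org/entity/Q1985786'

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def to_julian(qid, prop):
    """Switch flagged claims to the Julian calendar model and drop the qualifier."""
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claim in item.claims.get(prop, []):
        flagged = [q for q in list(claim.qualifiers.get('P31', []))
                   if q.getTarget() and q.getTarget().id == 'Q26961029']
        if not flagged:
            continue
        old = claim.getTarget()                     # a WbTime
        new = pywikibot.WbTime(year=old.year, precision=old.precision,
                               calendarmodel=JULIAN)
        claim.changeTarget(new, summary='Convert calendar model to Julian')
        for qual in flagged:
            claim.removeQualifier(qual, summary='Remove Q26961029 qualifier')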

Support --Marsupium (talk) 23:01, 28 April 2018 (UTC)

Item documentation

For every item which has something on its talk page, a bot could usefully prepend {{Item documentation}}, as in this edit.

It would be good if one of the active maintenance bots could then do this as new talk pages are created. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:54, 4 March 2018 (UTC)
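A minimal Pywikibot sketch of the maintenance step for a single talk page; a recent-changes or new-pages generator would supply the pages in a real run:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')

def prepend_item_documentation(talk_title):
    page = pywikibot.Page(site, talk_title)
    if not page.exists() or '{{Item documentation}}' in page.text:
        return
    page.text = '{{Item documentation}}\n' + page.text
    page.save(summary='Prepend {{Item documentation}}')

prepend_item_documentation('Talk:Q42')  # example talk page, purely illustrative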

Remove "Wikimedia list article"@en from items that don't have P31=Q13406463[edit]

SELECT *
WHERE
{
    ?item schema:description "Wikimedia list article"@en 
    MINUS { ?item wdt:P31 wd:Q13406463 }
}

Try it!

Currently 12514 items. Sample items: Q99676, Q115702, Q117073, Q117208, Q127301.

The items above either got a description before P31 was changed, or got one prematurely. It would be good if maintenance were done on these. The same could probably be done for other languages considered important.

This is similar to Wikidata:Bot_requests/Archive/2018/02#Nations at games.
--- Jura 14:41, 7 March 2018 (UTC)

I have a couple of suggestions for that query:
1) Some items are instances of a subclass of Q13406463, but not of Q13406463 itself (e.g. List of people with surname Carey (Q23044947) has P31: list of persons (Q19692233)). It might not be wise to remove the description from these.
2) If the English label starts with the word 'List', even if the item doesn't have P31/P279*=Q13406463, it is most likely a list (e.g. List of Trinidadian football transfers 2013–14 (Q16258717)). I browsed through some of these, and it seems that they are likely to have problems, but those are best solved manually.
If we take the two above mentioned points into account, we have the following query:
SELECT ?item ?itemLabel
WHERE
{
    ?item schema:description "Wikimedia list article"@en .
    MINUS { ?item wdt:P31/wdt:P279* wd:Q13406463 }
    FILTER NOT EXISTS {
      ?item rdfs:label ?enLabel.
      FILTER(LANG(?enLabel) = 'en') 
      FILTER(STRSTARTS(lcase(STR(?enLabel)), 'list'))
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Try it!
The query currently returns 4254 items. --Shinnin (talk) 19:12, 25 March 2018 (UTC)
Good point ... I hadn't recalled lists having subclasses. So, yes, the 4000 "lists of persons" don't need the description removed. At some point, it seems that the subclasses got carried away and came to include things like "Belgium in the Eurovision Song Contest" (60 instances), "aspect of history", etc. P279 at article about events in a specific year or time period (Q18340514) seems to be the source of that. Looking at one of the samples from your query [4] reminded me where some of the incorrect P31 came from. It seems that cleanup hasn't gone all the way through. I will try to sort out the subclasses. I think the query should work (for English); sorting out the subclasses will add more.
--- Jura 20:06, 25 March 2018 (UTC)

add freq=10 to lists

An update of the lists at Special:PrefixIndex/Wikidata:WikiProject_Medicine/Hospitals_by_country/ is currently attempted daily. I think every 10 days would be sufficient.

Please add |freq=10 to Template:Wikidata list. @Tobias1984:, you created them some time ago. Is this OK with you?
--- Jura 00:56, 8 March 2018 (UTC)

Oppose. Looking at some examples, there are few changes, so there are not that many updates. But if there is a change, then it should be carried out soon. A vandal could act shortly before the update and the vandalism would then be visible for 10 days. Why risk that? Daily is a good compromise. 77.179.94.217 17:15, 14 April 2018 (UTC)
Where is the compromise? It's a daily run for 250 lists, most with not much content. Maybe the whole set should go.
--- Jura 17:18, 14 April 2018 (UTC)

Supposedly one shouldn't follow up on T's comments. Maybe Tobias1984 wants to add something.
--- Jura 17:26, 14 April 2018 (UTC)
Who is T and where is their comment? Please try to write in a less cryptic way. --Pasleim (talk) 15:46, 15 April 2018 (UTC)
Tama .. what's his name? I think. Nothing to do with Tobias1984.
--- Jura 15:48, 15 April 2018 (UTC)
Oppose: neither a motivation nor a beneficial outcome has been provided --Pasleim (talk) 16:49, 15 April 2018 (UTC)
It saves hundreds of queries every day. There isn't really a benefit in running this on a daily basis. Many reference lists run monthly (e.g. those of Sum of All Paintings) or weekly. Obviously, if any maintenance were done on the lists, whatever other frequency is preferred would do. Unfortunately, I might have been the only one to have done any recently.
--- Jura 16:58, 15 April 2018 (UTC)
Given the ~5 million daily query requests, saving 200 requests seems marginal to me. --Pasleim (talk) 17:28, 15 April 2018 (UTC)
It's just one group .. we can't fix everything at once. The feature was introduced for that very reason.
--- Jura 11:19, 18 April 2018 (UTC)

Crossref Journals

Request date: 27 March 2018, by: Mahdimoqri (talkcontribslogs)

Link to discussions justifying the request
Task description
  • Add missing journals from Crossref
Licence of data to import (if relevant)
Discussion


Request process

"place on Earth" descriptions[edit]

Can someone remove the descriptions added by the batches mentioned on AN? The users shouldn't have made the edits and the descriptions are fairly useless. Please delete them all in a single edit.
--- Jura 08:38, 15 May 2018 (UTC)

"AN" link is dead. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 29 May 2018 (UTC)

URLs in descriptions cleanup

Some imports have URLs in descriptions, e.g. Q19652983. I think they should be removed or added as statements. @Magnus Manske:
--- Jura 09:00, 21 May 2018 (UTC)

In the given example the URL slug should be moved to a dedicated identifier property. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:40, 29 May 2018 (UTC)

Import English labels from Commons categories

Commons category names are normally in English by convention, and there are quite a few cases where we have Commons categories (sitelinked or through Commons category (P373)) but no English labels yet. So quite a few useful labels could be added in the same way that bots here already add them based on Wikipedia article titles. Would anyone who has existing bot code for label imports be interested in having a look into this? Thanks. Mike Peel (talk) 23:06, 23 May 2018 (UTC)
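A hedged Pywikibot sketch for the P373 case, assuming the P373 value (which is stored without the "Category:" prefix) is used directly as the English label; the sitelinked-category case would strip the prefix from the sitelink title instead:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def label_from_commons(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    if 'en' in item.labels or 'P373' not in item.claims:
        return
    name = item.claims['P373'][0].getTarget()   # e.g. "Douglas Adams"
    if name:
        item.editLabels(labels={'en': name},
                        summary='Add English label from Commons category (P373)')

label_from_commons('Q42')  # example item, purely illustrative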

I ended up coding this myself; see the request for approval at Wikidata:Requests for permissions/Bot/Pi bot 7. Thanks. Mike Peel (talk) 11:03, 18 July 2018 (UTC)

Migrating all LeMonde.fr URLs to https

Request date: 27 May 2018, by: Teolemon (talkcontribslogs)


Task description

The major French newspaper just migrated to HTTPS by default a couple of days ago. Can a bot operator get their bot to process all the http://www.lemonde.fr URLs and turn them into https://www.lemonde.fr? Teolemon (talk) 11:06, 27 May 2018 (UTC)
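A hedged Pywikibot sketch for the reference case only (reference URL, P854); other URL-valued properties would be handled analogously, and since reference snaks cannot be edited in place, the whole reference block is rebuilt with the https URL:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def upgrade_lemonde_refs(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claims in item.claims.values():
        for claim in claims:
            for source in list(claim.sources):
                old_urls = [r for r in source.get('P854', [])
                            if str(r.getTarget()).startswith('http://www.lemonde.fr')]
                if not old_urls:
                    continue
                new_block = []
                for prop, refs in source.items():
                    for ref in refs:
                        new_ref = pywikibot.Claim(repo, prop, is_reference=True)
                        value = ref.getTarget()
                        if ref in old_urls:
                            value = value.replace('http://', 'https://', 1)
                        new_ref.setTarget(value)
                        new_block.append(new_ref)
                claim.removeSources([r for refs in source.values() for r in refs])
                claim.addSources(new_block, summary='Migrate lemonde.fr URL to https')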


Discussion


Request process

Populating Commons maps category (P3722)

Request date: 30 May 2018, by: Thierry Caro (talkcontribslogs)

Link to discussions justifying the request
  • None.
Task description

Take all instances of subclasses of geographical object (Q618123). Look for those that have a Commons category (P373) statement and visit the corresponding Commons category. If it includes another category whose name is "Maps of" followed by its own name, import this value as Commons maps category (P3722) on the item. This would be useful to the French Wikipedia, where we now have no label (Q54473574) automatically populated through Template:Geographical links (Q28528875).
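A hedged Pywikibot sketch of the per-item step, assuming the Commons category name comes from P373 and that the value is only added when a "Maps of <name>" category actually exists on Commons; the stricter check from the request, that it is listed inside the item's own category, could be added on top:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()
commons = pywikibot.Site('commons', 'commons')

def add_maps_category(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    if 'P3722' in item.claims or 'P373' not in item.claims:
        return
    name = item.claims['P373'][0].getTarget()        # e.g. "Corsica"
    if not pywikibot.Category(commons, 'Category:Maps of ' + name).exists():
        return
    claim = pywikibot.Claim(repo, 'P3722')
    claim.setTarget('Maps of ' + name)               # stored without the "Category:" prefix
    item.addClaim(claim, summary='Add Commons maps category (P3722)')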

Licence of data to import (if relevant)
Discussion
Request process

Labels for Wiktionary categories

I am running into a multitude of Wiktionary categories which have a sitelink in language A but no label in A. Can anyone please add labels automatically? And please don't omit expressions in parentheses! @ValterVB:? --Infovarius (talk) 13:22, 30 May 2018 (UTC)

Even AutoEdit doesn't work for Wiktionary sitelinks... --Infovarius (talk) 13:31, 30 May 2018 (UTC)
And label-less items are created by PetScan (can that be changed?), so their number is constantly increasing and regular work is needed. --Infovarius (talk) 13:35, 30 May 2018 (UTC)
@Infovarius: The dumps have become too big (too much time to download and unzip), so I have stopped downloading them to update labels and descriptions, or to do the other weekly tasks that use the dump. If a differential dump becomes available, I will probably restart my periodic tasks. Alternatively, I can do it with a SPARQL query (if I find a suitable one) --ValterVB (talk) 17:25, 30 May 2018 (UTC)
@ValterVB: Can you please run over the ru-pages without a ru-label? 3750 now:
SELECT ?item ?sitelink ?itemLabel ?title WHERE {
  ?sitelink schema:isPartOf <https://ru.wiktionary.org/>;
     schema:about ?item; schema:name ?title 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,[AUTO_LANGUAGE],en" } .
   FILTER(NOT EXISTS {
   ?item rdfs:label ?lang_label.
   FILTER(LANG(?lang_label) = "ru") #with missing Russian label
 })
} ORDER BY ?itemLabel

Try it!

 – The preceding unsigned comment was added by Infovarius (talk • contribs).

  • @Infovarius: Interesting discovery. I started adding some of the missing labels and inserted "schema:name" in your query.
    --- Jura 16:02, 1 June 2018 (UTC)

All ✓ Done except for 4 items with a label+description conflict:

--ValterVB (talk) 09:39, 3 June 2018 (UTC)

Thanks for the work, Jura and Valter! May I ask for a harder task: to add aliases from Wiktionary page titles when they differ from the item label? --Infovarius (talk) 21:07, 3 June 2018 (UTC)

category's main topic (P301) for languages

The following lack P301 http://petscan.wmflabs.org/?psid=4625759 . Most have an English Wiktionary category page with a description that includes a link to an English Wikipedia article, a language code or even the QID to use. This is mostly added through a series of templates there.
--- Jura 11:08, 3 June 2018 (UTC)


Linking categories in Spanish Wikisource

Request date: 8 June 2018, by: Ninovolador (talkcontribslogs) (spanish WS admin)

Link to discussions justifying the request
Task description

I would like to systematically link some categories on Spanish Wikisource. Most subcategories under s:es:Category:Años are in the format Category:NXXXX, Category:FXXXX or Category:PXXXX, where XXXX is the number of the year; N means "births", F "deaths" and P "published". I want to systematically link every such subcategory to its equivalent.

So, basically, my request is to link (just an example; these categories are already linked):

And so on for every subcategory under s:es:Category:Años.
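A hedged Pywikibot sketch of one linking step, following the approach of mapping against enwiki categories; the enwiki naming patterns (especially the one for "published") are assumptions and would need checking:

import pywikibot

eswikisource = pywikibot.Site('es', 'wikisource')
enwiki = pywikibot.Site('en', 'wikipedia')

# Assumed mapping from the es.wikisource prefix to an enwiki category pattern.
PATTERNS = {'N': 'Category:{} births',
            'F': 'Category:{} deaths',
            'P': 'Category:Works published in {}'}

def link_category(prefix, year):
    es_cat = pywikibot.Page(eswikisource, 'Category:{}{}'.format(prefix, year))
    en_cat = pywikibot.Page(enwiki, PATTERNS[prefix].format(year))
    if not es_cat.exists() or not en_cat.exists():
        return
    try:
        item = pywikibot.ItemPage.fromPage(en_cat)   # item of the enwiki category
    except pywikibot.exceptions.NoPageError:
        return
    item.get()
    if 'eswikisource' not in item.sitelinks:
        item.setSitelink(es_cat, summary='Link es.wikisource year category')

link_category('N', 1900)  # births in 1900, purely illustrative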

Licence of data to import (if relevant)
Discussion


Request process

@Ninovolador: done this. There are some leftovers, however (fewer than the approx. 600 initial categories :) ). I did the mapping against enwiki (not -source), FYI. --Edgars2007 (talk) 14:55, 15 July 2018 (UTC)

@Edgars2007: Thank you very much!! -- Ninovolador (talk) 03:00, 16 July 2018 (UTC)

Some LCCN statements to be deprecated

Request date: 16 June 2018, by: Jheald (talkcontribslogs)

Link to discussions justifying the request
Task description

I would be grateful if somebody could change the rank to "deprecated" for 2714 statements for property LCOC LCCN (bibliographic) (P1144).

These are LCCN values that are given on the Biodiversity Heritage Library (Q172266), but which on the LoC website turn out either to be not recognised, to refer to a different book altogether, or to be redirects pointing to a preferred new value.

I have added the statements with QuickStatements, with an appropriate reason for deprecation (P2241), but QS is unable to change their rank to 'deprecated', so I would be grateful if somebody could do this for me and finish the job.

Note that some of the statements aren't being returned reliably by WDQS (thread), so I have uploaded the full list to this Wikimedia pastebin: https://tools.wmflabs.org/paste/view/d4beab96

Thanks in advance! Best regards, Jheald (talk) 17:30, 16 June 2018 (UTC)
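A minimal Pywikibot sketch of the rank change, assuming the (item, LCCN value) pairs come from the pastebin list as tab-separated lines; only P1144 statements that already carry a reason for deprecation (P2241) qualifier are touched:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def deprecate_lccn(qid, lccn):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claim in item.claims.get('P1144', []):
        if (claim.getTarget() == lccn
                and 'P2241' in claim.qualifiers
                and claim.rank != 'deprecated'):
            claim.changeRank('deprecated')

# Assumed input: one "Qxxx<TAB>lccn-value" pair per line.
with open('lccn_to_deprecate.tsv', encoding='utf-8') as f:
    for line in f:
        qid, lccn = line.strip().split('\t')
        deprecate_lccn(qid, lccn)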

Licence of data to import (if relevant)
Discussion


Request process

@Jheald: should be done. If you can re-check (via query), it wouldn't hurt. --Edgars2007 (talk) 15:05, 15 July 2018 (UTC)

@Edgars2007: Thanks! WDQS found two more, so I have done those by hand. It's possible there may be a handful more -- WDQS only finds 2705 statements with reason-for-deprecation at the moment tinyurl.com/ybdsnbtx, and the same number with deprecated rank tinyurl.com/yd8yy9tm whereas there were 2714 in the pastebin; but it has been a bit flaky all the way through on this set of edits. I'll see if I can identify the remaining nine, and check them manually. Thanks again! Jheald (talk) 15:51, 15 July 2018 (UTC)
There were some already-removed LCOC LCCN (bibliographic) (P1144) statements, so it is probably OK. --Edgars2007 (talk) 15:56, 15 July 2018 (UTC)

CAPS cleanup

There is a series of entries like Q55227955, created by a bot, that have the names inverted and the family names in CAPS. In the meantime, these labels have been copied to some other languages.
--- Jura 09:12, 5 July 2018 (UTC)

  • Here is a list: [5] (currently 295) and some that already got merged [6] (currently 8).
    --- Jura 08:49, 7 July 2018 (UTC)

Import area codes P473 from CSV file

Request date: 9 July 2018, by: Krauss (talkcontribslogs)

Link to discussions justifying the request
Task description

Get the data from the city table at github.com/datasets-br/city-codes/data/br-city-codes.csv
(see columns wdId for the Wikidata item and ddd for the city area code).

Example: Itu/SP has wdId=Q957653 and ddd=11 in the br-city-codes table, so I set statement Q957653#P473 by hand. The correct approach is to do this by bot.
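A hedged Pywikibot sketch of the import, reading the two columns named in the request (wdId and ddd) and skipping items that already have a P473 statement; adding a reference pointing to the source file would be a sensible extension:

import csv
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

with open('br-city-codes.csv', newline='', encoding='utf-8') as f:
    for row in csv.DictReader(f):
        qid, ddd = row.get('wdId'), row.get('ddd')
        if not qid or not ddd:
            continue
        item = pywikibot.ItemPage(repo, qid)
        item.get()
        if 'P473' in item.claims:
            continue                     # already has a dialling code
        claim = pywikibot.Claim(repo, 'P473')
        claim.setTarget(ddd)             # P473 takes a string value
        item.addClaim(claim, summary='Import area code from br-city-codes.csv')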

Licence of data to import (if relevant)

CC0

Discussion

@Krauss: Sounds like this can be done with Help:QuickStatements. --Marsupium (talk) 11:21, 13 July 2018 (UTC)

@Marsupium: Thanks, I need to do my homework (can you help?), but YES (!) it is a good solution. --Krauss (talk) 01:29, 15 July 2018 (UTC)
@Marsupium: and others, I edited (!) a simple qid,P1585 /* comment */ CSV file... but when submitting, there are errors:
  • Q818261 is a "wrong" entity! The correct "Alagoinhas/BA" is Q22050101! But (as Q22050101 > Q818261) how do I delete or redirect it? Should the older item be merged into the newer one, or the newer into the older?
  • Q975677 is a "wrong" entity! The correct "Antônio Carlos/SC" is Q22063985! But (as Q22063985 > Q975677) how do I delete or redirect it? Should the older item be merged into the newer one, or the newer into the older?
  • ... many similar errors
@Marsupium: first, the Wikipedia articles have to be merged, and only AFTER that can you merge these Wikidata items. Please don't change labels/descriptions in the way you did. --Edgars2007 (talk) 13:12, 15 July 2018 (UTC)
Sorry, Marsupium, I'm a little bit too tired to do things right :D The ping was meant for @Krauss:. --Edgars2007 (talk) 13:13, 15 July 2018 (UTC)
Thanks @Edgars2007:. I will not move or delete ~50 Wikidata items: please answer the question about how to fix the Wikidata bug. I am asserting that Q22050101 and Q818261 are duplicates and that Q22050101 is correct, so we can fix Q818261 --Krauss (talk) 14:38, 15 July 2018 (UTC)

PS: the discussion about duplicates is a side effect of my "want to import area codes P473" initiative, because I can't insert statements into wrong (e.g. non-city) entities. There are ~50 errors in the ~5500 cities of Brazil. --Krauss (talk) 14:43, 15 July 2018 (UTC)

I already answered that question. The Wikipedia articles from the "wrong" Wikidata item have to be merged into the Wikipedia articles which are in the "right" Wikidata item. But for now they have to be marked as duplicates. You can use this gadget for that. --Edgars2007 (talk) 14:51, 15 July 2018 (UTC)

Thanks for the discussion, @Edgars2007:. Now I am back to the main task; I was using https://tools.wmflabs.org/quickstatements/#/batch/3277... But there are 26 errors in 26 items, and no description of the error. What is wrong with it?

qid,P473
Q304652,62 /*Abadia de Goiás/GO */
Q582223,34 /*Abadia dos Dourados/MG */
Q304716,62 /*Abadiânia/GO */
Q1615444,37 /*Abaeté/MG */
Q1615298,91 /*Abaetetuba/PA */
Q1796441,88 /*Abaiara/CE */

When I do "by hand" there are no errors. See eg. Q942803.

@Krauss: In both QuickStatements formats the strings have to be in double quotes (see Help:QuickStatements#Add simple statement), like this. In CSV format (the special meaning of double quotes in CSV seems to interfere here, but I don't understand how; still, it works), for some reason it seems to work with four double quotes before and one after. And in CSV format the comments need their own "#" column, so like this:
qid,P473,#
Q304652,""""62",Abadia de Goiás/GO
Q582223,""""34",Abadia dos Dourados/MG
BTW: Would probably be good to add a source to the statements …
Cheers, --Marsupium (talk) 20:19, 15 July 2018 (UTC), edited 20:53, 15 July 2018 (UTC)
@Marsupium: Thanks! It is a strange CSV syntax (what would make sense is "string, and with the CSV quoting """string...). The extra quoting is not standard CSV; see the W3C recommendation of 2016 or RFC 4180 of 2005... OK, let's dance to the music.
See my batches: #3291 br-ddd-b01a2, #3293 br-ddd-b01, #3294 br-ddd-b02, #3295 br-ddd-b03, #3296 br-ddd-b04, #3297 br-ddd-b05, #3298 br-ddd-b06. It is working, with ~5500 items!

... Only ~120 of the ~5500 items had errors after a lot of work; QuickStatements works fine (!!), a good result. But let's understand the little errors; I am not understanding:

Hi @Marsupium: again a problem with QuickStatements, see batch/3312, where the input syntax was perfect and 2 items were done... But all the others marked with "error" have no error (!); I did them by hand and it is perfect, see a complete list with links. --Krauss (talk) 21:52, 16 July 2018 (UTC)

@Krauss: I'm sorry about that, but I'm afraid I don't know anything more than I've already written. I have to admit that unfortunately QuickStatements is often quite idiosyncratic. Sorry! --Marsupium (talk) 21:00, 17 July 2018 (UTC)
@Marsupium: and @Edgars2007: Thanks for all the help and discussion!
The "Import area codes P473 from CSV file" task is completed.

END OF REQUEST.

Migrate to P1480

Request date: 12 July 2018, by: Swpb (talkcontribslogs)

Link to discussions justifying the request
Task description

Migrate all statements nature of statement (P5102) = disputed (Q18912752) to sourcing circumstances (P1480) = disputed (Q18912752) (see query). Swpb (talk) 17:47, 12 July 2018 (UTC)
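A hedged Pywikibot sketch, assuming P5102 = disputed occurs as a qualifier on statements (as the linked query suggests) and that the affected items are supplied by that query; each matching qualifier is replaced by a P1480 qualifier with the same value:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()
DISPUTED = pywikibot.ItemPage(repo, 'Q18912752')

def migrate_disputed(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claims in item.claims.values():
        for claim in claims:
            for qual in list(claim.qualifiers.get('P5102', [])):
                target = qual.getTarget()
                if target and target.id == 'Q18912752':
                    new_qual = pywikibot.Claim(repo, 'P1480', is_qualifier=True)
                    new_qual.setTarget(DISPUTED)
                    claim.addQualifier(new_qual, summary='Migrate P5102 to P1480')
                    claim.removeQualifier(qual, summary='Migrate P5102 to P1480')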

Discussion


Request process

Normalize references

Request date: 13 July 2018, by: Marsupium (talkcontribslogs)

Link to discussions justifying the request
Task description

Often one source website or database is indicated inconsistently, in various manners. To improve this situation, some queries and follow-up edits on references could be made. This is a task that would best be done continuously and gradually adapted to more cases. Thus, perhaps this task fits well into the work field of DeltaBot, User:Pasleim?

  1. Add ID property (and if feasible also stated in (P248)) to references with according reference URL (P854) where missing.
  2. Add stated in (P248) to references with according ID property where missing.
  3. (For later: Merge references where a source website or database is used twice (accidentally).)

The issue exists for ULAN ID (P245), RKDartists ID (P650) and probably many more source websites or databases. I have examined ULAN ID (P245), and these would be the queries and edits to be done:

  1. SELECT ?entity ?prop ?ref ?id WHERE {
      ?entity ?prop ?statement.
      ?statement prov:wasDerivedFrom ?ref.
      ?ref pr:P854 ?refURL.
      MINUS { ?ref pr:P245 []. }
      FILTER REGEX(STR(?refURL), "^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)")
      BIND(REPLACE(STR(?refURL),"^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)","") AS ?id)
    }
    LIMIT 500
    
    Try it! → add ULAN ID (P245) = ?id to the reference (now >1k cases)
  2. SELECT ?entity ?prop ?ref WHERE {
      ?entity ?prop [ prov:wasDerivedFrom ?ref ].
      ?ref pr:P245 [].
      MINUS { ?ref pr:P248 wd:Q2494649. }
    }
    
    Try it! → add stated in (P248) = Union List of Artist Names (Q2494649) to the reference (now ca. 70 cases)

Thanks for any comments!

Discussion


Request process

heavy coord job

Request date: 14 July 2018, by: Conny (talkcontribslogs)

Link to discussions justifying the request
Task description

Wikidata items with pictures from Commons that have the same coordinates on both projects should display in Wikidata whether those coordinates are the camera position or the object position.

Discussion

@Conny: It's unclear what you want. Please explain it in more detail and provide an example edit. On Property talk:P625#camera/object_position I voiced my doubt if this should be on Wikidata. Multichill (talk) 14:06, 17 July 2018 (UTC)

@Multichill: We can exchange here again :) . My sample is on linked talkpage. Thank you, Conny (talk) 04:17, 18 July 2018 (UTC).
Request process

Normalize dates with low precision

Request date: 17 July 2018, by: Jc86035 (talkcontribslogs)

Link to discussions justifying the request
Task description

Currently all times between 1 January 1901 and 31 December 2000 with a precision of century are displayed as "20. century". If one enters "20. century" into a date field the date is stored as the start of 2000 (+2000-00-00T00:00:00/7), which is internally interpreted as 2000–2099 even though the obvious intent of the user interface is to indicate 1901–2000. Since the documentation conflicts with the user interface, there is no correct way to interpret this information, and some external tools which interpret the data "correctly" do not reflect editors' intent.

This is not ideal, and since it looks like this isn't going to be fixed in Wikibase (I have posted at Wikidata:Contact the development team but haven't got a response yet, and the Phabricator bugs have been open for years), I think a bot should convert every date value with a precision of decade, century or millennium to the midpoint of the period indicated by the label in English, so that there is less room for misinterpretation. For example, something that says "20. century" should ideally be converted to +1951-00-00T00:00:00/7 (or alternatively to +1950-00-00T00:00:00/7), so that it is read as 1901–2000 by humans looking at the item, as 1901–2000 (or 1900–1999) by some external tools, and as 1900–1999 by other external tools.

Classes of dates this would apply to:

  • Decades (maybe) – e.g. dates within 1950–1959 to +1954-00-00T00:00:00/8
  • Centuries – e.g. dates within 1901–2000 to +1951-00-00T00:00:00/7 or +1950-00-00T00:00:00/7 (depending on what everyone prefers)
  • Millennia – e.g. dates within 1001–2000 to +1501-00-00T00:00:00/6 or +1500-00-00T00:00:00/6

For everything less precise (and over five digits), the value is displayed as "X years" (e.g. −10000-00-00T00:00:00Z/5 is displayed as "10000 years BCE"). Incorrect precisions for years under five digits could otherwise be fixed, but it looks like the user interface just doesn't bother parsing them because people don't name groups of a myriad years.

While this is obviously not perfect and not the best solution, it is better than waiting an indefinite time for the WMF to get around to it; and if the user interface is corrected then most of the data will have to be modified anyway. Values which have ambiguous meaning (i.e. those which can be identified as not having been added with the wikidata.org user interface) should be checked before normalization by means of communication with the user who added them. Jc86035 (talk) 11:33, 17 July 2018 (UTC) (edited 14:13, 17 July 2018 (UTC) and 16:16, 17 July 2018 (UTC))
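A hedged Pywikibot sketch for the century case (precision 7), moving the stored year to the midpoint of the century that the user-interface label implies; decades and millennia would be handled analogously, the 1951-vs-1950 choice follows the request above, and ambiguous API-entered values (such as +1900) would need the per-edit checking discussed below:

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def normalise_century(qid, prop):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claim in item.claims.get(prop, []):
        old = claim.getTarget()
        if not isinstance(old, pywikibot.WbTime) or old.precision != 7:
            continue
        if old.year < 101:
            continue                           # skip early and BCE dates
        century = (old.year + 99) // 100       # e.g. 1901-2000 -> 20
        midpoint = (century - 1) * 100 + 51    # e.g. 20th century -> 1951
        if old.year == midpoint:
            continue
        new = pywikibot.WbTime(year=midpoint, precision=7,
                               calendarmodel=old.calendarmodel)
        claim.changeTarget(new, summary='Normalise century-precision date to midpoint')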

Discussion

I think the reasoning applies to centuries and millennia. I'd have to think about it a bit longer for decades. While I'm thinking, perhaps Jc86035 could clarify the request by explicitly stating every precision the request applies to. Also, when discussing precision, I find it inadvisable to use the terms "lower" or "greater"; terms such as "better" and "worse" or "looser" and "tighter" seem better to me.

I suppose this bot would have to be run on a regular schedule until the problem is fixed by the developers (or the Wikidata project is shuttered, whichever comes first). Jc3s5h (talk) 12:19, 17 July 2018 (UTC)

@Jc3s5h: I think running it at least once a day would be good. I've edited the proposal so that it only applies to decades, centuries and millennia, because Wikibase handles less precise dates differently (in general the time handling seems very nonstandard to me, probably because most people don't need to represent 13.7 billion years ago in the same time format as yesterday). Jc86035 (talk) 14:48, 17 July 2018 (UTC)
  • Support (edit conflict) I think it is a good idea. I have also found it confusing on occasion. If there are templates and other tools that interpret Wikidata dates incorrectly, that is their bug, and it is beyond our control to debug each tool that uses Wikidata. However, I think it is a good idea to default to the mid-point of the time period for some of the confusing ones. (I would not do it for years, but for decades, centuries and millennia that would be fine.) --Jarekt (talk) 12:28, 17 July 2018 (UTC)
  • Oppose: looks like a workaround instead of solving the real problem. Multichill (talk) 14:02, 17 July 2018 (UTC)
    @Multichill: Yes, obviously this isn't the best solution, but the Phabricator bug is three years old now so it's not like Wikibase's date handling is suddenly going to be improved after years of nothing, so we may as well deal with it regardless of the pace of software development. The longer the issue sits around, the more data everyone has to revalidate after the issue is fixed. (They don't have enough staff to deal with comparatively minor things like this. Dozens of bugs e.g. in Kartographer have been closed as wontfix just because "there's no product manager".) Jc86035 (talk) 14:23, 17 July 2018 (UTC)
    Furthermore, currently ISO 8601 necessitates us using things like earliest date (P1319) and latest date (P1326) if there's any sort of non-trivial uncertainty range, yet Wikibase stores the user's initial value anyway. Wikibase does a lot of odd things like the aforementioned non-standard century handling and allowing "0 BCE" as a valid date. I don't think they have the resources to fix stuff like this. Jc86035 (talk) 14:43, 17 July 2018 (UTC)
  • Question: As an example, if the bot finds the datestamp +1900-01-01T00:00:00Z with precision 7, should it convert it to +1950-01-01T00:00:00Z or +1850-01-01T00:00:00Z? I do not believe a bot can trace back to when an entry was last changed and see whether it was changed interactively or through the API. If through the API, I think the year should be 1950. If interactively, I think it should be 1850. Perhaps we could somehow examine the history of contributions and see how many dates are entered interactively vs. through the API. If one is vastly more frequent than the other, we could go with whichever predominates.
Whatever we do, we should stay away from any year before AD 101. As Jc86035 points out, the issues are compounded for BCE, and there are also some tricky points before 101. Jc3s5h (talk) 21:39, 17 July 2018 (UTC)
  • @Jc3s5h: Could you not find it through differing edit summaries (e.g. lots of additions by one user with #quickstatements)? I think it would be difficult but it would be possible with something like the WikiBlame tool. Jc86035 (talk) 06:06, 18 July 2018 (UTC)
    I would like to see a substantial sample, which would have to be gathered automatically. For example, all the date edits made on Wikidata for an entire day on each of 100 randomly chosen days. Jc3s5h (talk) 11:56, 18 July 2018 (UTC)
Request process