Wikidata:Bot requests


Bot requests
If you have a bot request, create a new section here and state exactly what you want. You should discuss your request first and wait for the decision of the community. Please refer to previous discussion. If you want to request sitelink moves, see list of delinkers.

For botflag requests, see Wikidata:Requests for permissions.

Tools available to all users which can be used to accomplish the work without the need for a bot:

  1. PetScan for creating items from Wikimedia pages and/or adding same statements to items
  2. QuickStatements for creating items and/or adding different statements to items
  3. Harvest Templates for importing statements from Wikimedia projects
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2016/08.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 2 days.


Import date of birth (P569)/date of death (P570) from Wikipedia

Lang    2007
→ ja    334
[en]    326 (~5%)
⇒ ru    124
⇒ uk    121
→ zh    116
pt      115
es      103
→ ar    100
fr      91
hu      83
tr      79
→ ko    56
id      52
et      49
fi      44
→ el    40
→ th    34

Wikidata:Database reports/Deaths at Wikipedia lists items with dates of death at Wikipedia (10-15% of all). Some dates in articles of most languages are likely to be suitable for import by bot. For other dates, the only formatted part may be the year of death category. --- Jura 08:06, 2 August 2015 (UTC)

@Multichill: didn't you have a script for this? It only works when there is a strict format, yes. Sjoerd de Bruin (talk) 18:24, 3 August 2015 (UTC)
Strong oppose to importing the same data from the same Wikipedia a second time. Any kind of automatic, repeatable Wikipedia->Wikidata copying makes all other Wikipedias vulnerable to mistakes (and vandalism) in a single one. -- Vlsergey (talk) 19:55, 3 August 2015 (UTC)
None of these pages currently have P570 defined, thus it's not a matter of re-import. Many articles may only exist in 1 language. --- Jura 21:11, 3 August 2015 (UTC)
1. "Reimport" is not about statements, but about project+property. Once P570 has been imported from a wiki, it should not be reimported from that wiki, especially not on a scheduled/automated basis, for the reasons given above. 2. I'm okay with a one-time import of P569/P570 from those projects. -- Vlsergey (talk) 15:03, 4 August 2015 (UTC)
I agree that it shouldn't be done for the current year on an automated basis. If you look at "date added" column on the lists, you will notice that most entries are fairly old. --- Jura 08:30, 5 August 2015 (UTC)
Looking at en:Patri J. Pugliese it seems that the formatted version is fairly recent (2014), en:Victoria Arellano has persondata since 2010, pt:Joaquim Raimundo Ferreira Chaves since 2011. en:Mark Abramson since February 2013, but only the DOB got imported. tr:Yasemin Esmergül has the dates in the article lead. In any case, we can validate the year for P570. Maybe someone can assess ja,zh,uk, etc. To the right, the most frequent ones on the list for 2007. --- Jura 21:11, 3 August 2015 (UTC)
Persondata in en.WP is deprecated and should not be relied on. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:06, 4 August 2015 (UTC)
Can you provide references for your claims? Thanks. --- Jura 10:26, 4 August 2015 (UTC)
Discussion of persondata: RfC: Should Persondata template be deprecated and methodically removed from articles? Jc3s5h (talk) 11:33, 4 August 2015 (UTC)
The conclusion mentioned in the link only supports Pigsonthewing's first claim. How about the second? --- Jura 11:37, 4 August 2015 (UTC)
Q8078 refers. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:41, 4 August 2015 (UTC)
Funny. Wasn't it deprecated because Wikidata could hold the data rather than for data quality reasons? --- Jura 08:30, 5 August 2015 (UTC)
No. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:43, 6 August 2015 (UTC)
Any progress on the missing reference? --- Jura 04:34, 8 August 2015 (UTC)
In reply to @Sjoerddebruin: yes, I imported dates of birth and death in the past. I was certainly not the only one. I'm quite confident the persondata template on the English Wikipedia got scraped to Wikidata quite some time ago. I don't think there is much data left to scrape from that corner. My focus was on items about humans with a link to the Dutch Wikipedia, but without a date of birth. I used a regular expression to extract the date of birth from the introduction of the article. You could do that for other languages too. You just need to start conservatively and expand a bit in each iteration. I was able to import thousands of birth dates this way. Multichill (talk) 17:17, 4 August 2015 (UTC)
Thanks for your helpful feedback. enwiki might indeed be mostly done. For the sample year 2007 in the table above, it's just 5%. BTW nl is not on the reports as there are no nl categories for persons by year of death. --- Jura 08:30, 5 August 2015 (UTC)
Actually of the 326 for enwiki, 300 do have persondata. --- Jura 08:36, 5 August 2015 (UTC)
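
The conservative regular-expression approach described above could look roughly like the following Python sketch for English introductions. It is an illustration only, not the script that was actually used: the two date patterns, the function name `extract_life_dates` and the sample sentence are assumptions. The idea is to begin with strict patterns and to add further ones only after reviewing what was missed.

```python
import re
from datetime import date

MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}
_MONTH = "|".join(MONTHS)

# Start with strict patterns; extend this list one pattern per iteration.
PATTERNS = [
    # "(12 March 1931 – 4 July 2007)"
    re.compile(
        rf"\((?P<bd>\d{{1,2}}) (?P<bm>{_MONTH}) (?P<by>\d{{4}}) [–-] "
        rf"(?P<dd>\d{{1,2}}) (?P<dm>{_MONTH}) (?P<dy>\d{{4}})\)"),
    # "(March 12, 1931 – July 4, 2007)"
    re.compile(
        rf"\((?P<bm>{_MONTH}) (?P<bd>\d{{1,2}}), (?P<by>\d{{4}}) [–-] "
        rf"(?P<dm>{_MONTH}) (?P<dd>\d{{1,2}}), (?P<dy>\d{{4}})\)"),
]


def extract_life_dates(intro):
    """Return (birth, death) as date objects, or None if no strict pattern matches."""
    for pattern in PATTERNS:
        m = pattern.search(intro)
        if not m:
            continue
        try:
            birth = date(int(m["by"]), MONTHS[m["bm"]], int(m["bd"]))
            death = date(int(m["dy"]), MONTHS[m["dm"]], int(m["dd"]))
        except ValueError:  # impossible calendar date such as "31 February"
            continue
        if birth <= death:
            return birth, death
    return None


# Hypothetical sample sentence, not taken from a real article.
print(extract_life_dates("John Doe (12 March 1931 – 4 July 2007) was a painter."))
```

Anything that does not match exactly is left untouched, so ambiguous introductions end up on a list for manual review instead of being guessed at.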

I imported today birth and death dates of people deceased in 2000 by parsing the introduction phrase of the English article. If the edits [1] are okay, I could continue with other years and other languages. I pay attention not to import dates before 1924 and I will not run the script twice on the same article. --Pasleim (talk) 18:42, 12 August 2015 (UTC)

Thanks! I checked 10 and they were all fine. All but 2 or 3 had the same dates in infobox and/or persondata too.
I noticed that many trwiki articles have a person infobox, maybe this could be imported as well. --- Jura 11:04, 15 August 2015 (UTC)
That was quick. Good work! It did reduce the numbers a bit. It might be worth applying the same method to some of the templates mentioned for enwiki.
The infobox in trwiki doesn't seem that frequent, but for ptwiki, I found that many use pt:Template:dni/pt:Template:Nascimento and pt:Template:Morte or pt:Template:morte e idade/pt:Template:Falecimento e idade. This is done in infoboxes or the article text. --- Jura 07:11, 16 August 2015 (UTC)
I did some from pt:Template:Morte. --- Jura 07:21, 17 August 2015 (UTC)
pt:Template:morte e idade/pt:Template:Falecimento e idade done as well. --- Jura 09:21, 17 August 2015 (UTC)
  • For jawiki (4370 missing), I had a look at ja:Template:死亡年月日と没年齢, but that would give only about 160, most with just the month of death. eswiki has a couple of templates that could be parsed, but there is no single one. --- Jura 04:13, 19 August 2015 (UTC)
  • I had a look at 2009: Most frequent languages are: ar 125, uk 116, en 114, es 109, ru 99, hu 86
For ukwiki, of 10 articles, 6 had an infobox (5 different ones: the uk ones from Template:Infobox ice hockey player (Q5650114), Template:Infobox scientist (Q5624818), Template:Infobox architect (Q10973090), Template:Infobox person (Q6249834), Template:Infobox artist (Q5914426)), normally in the format; the other 4 had the dates at the beginning of the text in Cyrillic. --- Jura 10:37, 31 August 2015 (UTC)
For ukwiki, I just imported the dates from uk:Template:Особа. --- Jura 13:33, 7 September 2015 (UTC)

Given that we might have exhausted the bot approach, I made a request at Topic:Spgr35wayo8zy15y. --- Jura 06:20, 24 September 2015 (UTC)

Is there any way to import date of birth (P569) and date of death (P570) from Slovenian Wikipedia? We are at the halfway point in updating our infoboxes with Wikidata. We have 2 tracking categories that include articles with birth and death dates that are not yet recorded in Wikidata (birth: sl:Category:Lokalnega datuma rojstva še ni v Wikipodatkih and death: sl:Category:Lokalnega datuma smrti še ni v Wikipodatkih). Our biographical articles have a special introduction phrase (examples: * (?), ....., † 29. marec 1770, ... or * 7. junij 1707, ...., † 2. januar 1770, or just the year, † 1770, or unknown, † ?). We first want to transfer the dates to Wikidata and then continue with cleaning our infoboxes. Afterwards we will update the other half of our infoboxes, and a subsequent data import will then be needed. --Pinky sl (talk) 11:11, 17 March 2016 (UTC)

You could try to do part of it with Harvesttemplates, e.g. [2]
--- Jura 11:29, 17 March 2016 (UTC)
You mention dates during which Europe was transitioning from the Julian to the Gregorian calendars, but you don't mention the data having any calendar indication. Thus I would suggest you not import any dates before 1924. Jc3s5h (talk) 12:08, 17 March 2016 (UTC)
Ok, thanks, will see what we can do. --Pinky sl (talk) 16:20, 18 March 2016 (UTC)
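
As an illustration of how the Slovenian introduction phrase could be parsed, here is a minimal Python sketch. It assumes the markers `*` and `†` and lowercase Slovenian month names as in the examples given in the request; the function name `parse_marked_date`, the returned tuple shape and the sample strings are hypothetical. Following the advice in this thread, dates before 1924 are skipped because the text carries no calendar indication.

```python
import re

SL_MONTHS = {"januar": 1, "februar": 2, "marec": 3, "april": 4, "maj": 5,
             "junij": 6, "julij": 7, "avgust": 8, "september": 9,
             "oktober": 10, "november": 11, "december": 12}

CUTOFF_YEAR = 1924  # earlier dates are skipped (Julian/Gregorian ambiguity)

# Optional "day. month " followed by a 3- or 4-digit year.
_DATE = r"(?:(?P<day>\d{1,2})\.\s+(?P<month>[a-z]+)\s+)?(?P<year>\d{3,4})"


def parse_marked_date(intro, marker):
    """Parse the date that follows '*' (birth) or '†' (death).

    Returns (year, month, day), with None for parts that are not given,
    or None when the date is unknown ('?'), not parseable, or too early.
    """
    m = re.search(re.escape(marker) + r"\s*\(?\s*" + _DATE, intro)
    if not m:
        return None
    year = int(m["year"])
    if year < CUTOFF_YEAR:
        return None
    if m["month"] is None:
        return (year, None, None)  # year precision only
    month = SL_MONTHS.get(m["month"])
    if month is None:
        return None  # unrecognised month name: leave for manual review
    return (year, month, int(m["day"]))


# Hypothetical sample phrase in the format described in the request.
print(parse_marked_date("* 3. maj 1930, † 12. marec 1999", "†"))
```

Returning the precision implicitly (missing month and day) keeps year-only entries such as `† 1770` distinguishable from full dates when the values are later written to Wikidata.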

ALL CAPS

A series of items for people have labels in capital letters .. if these could be converted into a more standard format .. --- Jura 08:00, 28 August 2015 (UTC)

To start off with it, I created a quarry list: Most of these labels can be converted but there are also some exceptions, e.g. RUBEN XYZ (Q7277963) --Pasleim (talk) 16:19, 16 September 2015 (UTC)
I think the Japanese ones in that list could be left out. Latin isn't the usual script for Japanese, so a Latin script name is likely to be deliberately written the way it is. The ones I checked all had a jawiki sitelink which is all caps like the label. - Nikki (talk) 18:06, 16 September 2015 (UTC)
I just did the en ones for items with P31:Q5. I don't think there was any CamelCasing in it. Thanks for the list! --- Jura 09:49, 30 September 2015 (UTC)
It seems that some didn't get included, sample: Q20734549 (has two spaces). --- Jura 09:27, 1 October 2015 (UTC)
@Jura1: There are a few items using the property Dictionary of Welsh Biography ID (P1648) for people with three- or four-part names which haven't been converted, e.g. Gomer Morgan Roberts (Q20733078) and Griffith Richard Maethlu Lloyd (Q20821426). (See the list of all DWB entries here.) Would you be able to fix these? Ham II (talk) 20:43, 17 March 2016 (UTC)
I could, but as there are a few other things on my todo list, I'd rather leave this to others. This is probably a recurring task, so maybe someone wants to build a set of lists to handle it. Once one has the item and the label to correct, corrected ones can be added with QuickStatements (Q20084080)
--- Jura 13:56, 18 March 2016 (UTC)
Things like this should be converted as well.
--- Jura 14:07, 15 April 2016 (UTC)
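
The conversion discussed in this section could be sketched in Python as follows. This is only an illustration under assumptions: the function name `normalize_caps_label` and the sample labels are hypothetical, and the skip list contains just the one exception mentioned above. Whitespace is collapsed first so that labels with two spaces are not missed.

```python
import re

# Items whose all-caps label is deliberate (exception mentioned in this thread).
EXCEPTIONS = {"Q7277963"}


def normalize_caps_label(qid, label):
    """Return a title-cased label, or None if the label should be left alone."""
    if qid in EXCEPTIONS:
        return None
    collapsed = re.sub(r"\s+", " ", label).strip()  # also fixes double spaces
    letters = [c for c in collapsed if c.isalpha()]
    if not letters or not all(c.isupper() for c in letters):
        return None  # not an all-caps label, nothing to do
    return collapsed.title()


# Hypothetical item ID and label, for illustration only.
print(normalize_caps_label("Q1", "JANE  MARY DOE"))
```

`str.title()` handles hyphens and apostrophes, but it would also turn a name such as "MCDONALD" into "Mcdonald", so the output is better reviewed as a list before it is fed to QuickStatements. Filtering by language (for example leaving out the Japanese labels) is a separate step not shown here.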

Commons Category <> Sitelinks?

Hello, Is it possible/feasible for a bot to sync the Commons Category property with the sisterlinks? There were some cases where I added a sisterlink but forgot to add the property, e.g. Q6226954. A bot could easily read/write these back and forth. I'm not sure if this has been proposed before or not. I'd do it myself, but admittedly, I don't understand enough about how Wikidata works. Avicennasis (talk) 05:03, 18 September 2015 (UTC)

I think this would be an interesting task. I added hundreds of commons category properties using QuickStatements but as far as I know there is no equivalent tool to add sitelinks. Using commons category property to add sitelinks would be great.--Pere prlpz (talk) 22:28, 9 October 2015 (UTC)
Oppose. Items which do not represent categories (without instance of (P31)  Wikimedia category (Q4167836)) should not have sitelinks to Commons categories. We have Commons category (P373), which is apparently also used by the OtherProjectsExtension for these cases. I would propose the following: For all items which do not represent categories (without instance of (P31)  Wikimedia category (Q4167836)):
This would clean up the mess we have when it comes to commons sitelinks. -- T.seppelt (talk) 05:27, 6 April 2016 (UTC)
I don't think it's fair to remove sitelinks while they're still needed for Commons itself. There are issues that need resolving before something like that could be a workable solution.
  • Interwiki links from Commons don't work if the Commons category is on another item. Moving the sitelink is nice for consistency in Wikidata, but horrible for Commons who suddenly lose all their interwiki links (... unless they ignore Wikidata and stick to old-style interwiki links).
  • The "Add link" function in the sidebar for adding interwiki links puts the Commons category on the main item.
  • Worst of all: Our own notability policy is unclear. Some people believe it says Commons categories shouldn't be added to non-category items (i.e. create a separate item for them), other people, including some admins, believe it says Commons categories are not notable items (i.e. delete the separate items).
Commons wants interwiki links to articles. People on Commons will keep both accidentally and deliberately linking categories to non-category items. We can either continue as we are (with lots and lots of those) until the issues are fixed and we can model things in a way that works for Commons too, or we can try to force Commons categories to be separate items more aggressively and create even more friction between us and Commons (potentially to the point of Commons giving up on Wikidata because we would be actively and deliberately obstructing them). - Nikki (talk) 10:07, 6 April 2016 (UTC)

Cyrillic merges

This included pairs of items with articles at ruwiki and ukwiki each (Sample: Q15061198 / Q12171178). Maybe it's possible to find similar items merely based on labels in these languages and merge them. --- Jura 03:33, 19 September 2015 (UTC)

I cannot find any ru-uk pairs. Are they all done? --Infovarius (talk) 16:27, 3 November 2015 (UTC)
The ones on that list are identified based on dates of birth/death and we regularly go through them. The occasional findings there (also with ru/be) suggest that there are more (without dates). A query would need to be done to find them. --- Jura 16:33, 3 November 2015 (UTC)
Today the list includes quite a few, thanks to new dates of birth/death being added. --- Jura 16:43, 2 December 2015 (UTC)
A step could involve reviewing suggestions for missing labels in one language based on labels in other languages with Add Names as labels (Q21640602): sample be/ru. --- Jura 11:44, 6 December 2015 (UTC)
I came across a few items that had interwikis in ukwiki to ruwiki, but as they were on separate items, these weren't used to link the articles to existing items (sample, merged since). --- Jura 10:17, 15 December 2015 (UTC)
SELECT DISTINCT ?item ?ruLabel ?item2 ?ukLabel
WHERE {
  VALUES ?item { wd:Q19909894 }
  ?item wdt:P31 wd:Q5 .

  VALUES ?item2 { wd:Q16704775 }
  ?item2 wdt:P31 wd:Q5 .

  ?item rdfs:label ?ruLabel . FILTER(lang(?ruLabel) = "ru")
  # Strip commas from the Russian label before comparing
  BIND(REPLACE(?ruLabel, ",", "") AS ?ruLabel2)

  ?item2 rdfs:label ?ukLabel . FILTER(lang(?ukLabel) = "uk")

  # Labels must match once commas are removed, but differ as written
  FILTER(str(?ruLabel2) = str(?ukLabel))
  FILTER(str(?ruLabel) != str(?ukLabel))
}
# added by Jura1

The above currently finds one pair. It times out when not limited to specific items ;) Maybe there is a better way to find these.
--- Jura 14:19, 3 April 2016 (UTC)

In the meantime the two items were merged, so it doesn't work anymore.
--- Jura 16:54, 4 April 2016 (UTC)
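
Since the query times out when it is not limited to specific items, one possible alternative is to do the same comparison offline on label lists extracted from a dump. The Python sketch below mirrors the query's logic (Russian label with commas removed equals the Ukrainian label, while the labels differ as written). The function names, the input format (dictionaries mapping item IDs to labels) and the sample data are assumptions for illustration.

```python
def strip_commas(label):
    """Same normalisation as REPLACE(?label, ",", "") in the query."""
    return label.replace(",", "")


def candidate_pairs(ru_labels, uk_labels):
    """Return sorted (ru_item, uk_item) pairs whose labels differ only by commas.

    ru_labels and uk_labels map item IDs to the label in that language.
    """
    by_uk_label = {}
    for qid, label in uk_labels.items():
        by_uk_label.setdefault(label, []).append(qid)

    pairs = []
    for qid, label in ru_labels.items():
        for other in by_uk_label.get(strip_commas(label), []):
            if other != qid and label != uk_labels[other]:
                pairs.append((qid, other))
    return sorted(pairs)


# Hypothetical item IDs and labels, for illustration only.
ru = {"Q1": "Петров, Олег", "Q4": "Сидоров Иван"}
uk = {"Q2": "Петров Олег", "Q3": "Петренко Петро"}
print(candidate_pairs(ru, uk))
```

Indexing one side by label avoids comparing every Russian label with every Ukrainian one, which is presumably what makes the unrestricted query too slow. The resulting pairs are only merge candidates and still need checking.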

Import names in Latin script from kowiki

There are a few items for persons that link only to kowiki and don't have labels in English. Samples:

The articles for these in kowiki have names in Latin script (or other scripts) defined in the introduction.

This could be imported to Wikidata as label or alias.

The two samples were already merged as they appeared on the report for identical birth and death dates. --- Jura 10:54, 14 November 2015 (UTC)

All samples have been merged now. Maybe these items were all redundant? --Pyfisch (talk) 19:41, 27 December 2015 (UTC)
Does it matter? I'd assume there are other items without labels and articles that include names in Latin script at kowiki. No need for a bot for 3 items ;) --- Jura 10:37, 28 December 2015 (UTC)

Maybe the following could work for this:

  • Generate a list of items for people that don't have labels in a series of languages (including en), but (e.g.) kowiki
  • Maybe exclude items that already meet some other criteria
  • Scan these articles for names in Latin script at predefined places
  • Present the result in a browser like the ones for dates of birth/death to confirm by a user.
    --- Jura 09:06, 16 March 2016 (UTC)
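
The scanning step in the list above could be sketched as follows in Python. It assumes that the "predefined place" is the first parenthesis of the introduction, which is where the Latin-script name is expected to appear; the function name `latin_names_in_intro`, the character ranges treated as Latin and the sample sentences are assumptions. The output is meant as a suggestion to be confirmed by a user, as proposed above.

```python
import re

# A run of Latin letters (basic and extended), allowing spaces, dots,
# apostrophes and hyphens inside the name.
LATIN_NAME = re.compile(
    r"[A-Za-z\u00C0-\u024F][A-Za-z\u00C0-\u024F.'\- ]*[A-Za-z\u00C0-\u024F.]")

# First parenthesis of the introduction (ASCII or full-width brackets).
FIRST_PAREN = re.compile(r"[(（]([^)）]*)[)）]")


def latin_names_in_intro(intro):
    """Return Latin-script name candidates found in the first parenthesis."""
    m = FIRST_PAREN.search(intro)
    if not m:
        return []
    return [name.strip() for name in LATIN_NAME.findall(m.group(1))]


# Hypothetical introduction sentence, not taken from a real article.
print(latin_names_in_intro("홍길동(Hong Gil-dong, 1900년 ~ 1950년)은 대한민국의 작가이다."))
```

Whether a candidate becomes a label or an alias is left to the reviewer.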

HTML entities in monolingual strings

Working with addlingue, it seems that there are a few strings that have HTML entities instead of their Unicode equivalent character. I think the substitution could safely be made by a bot.

If possible (and if that exists please point me to one) maybe a constraint report on the relevant properties should be set up. author  TomT0m / talk page 15:24, 10 January 2016 (UTC)

@TomT0m: SPARQL found only 4 titles with HTML entities: Matěj Suchánek (talk) 13:53, 16 April 2016 (UTC)
This also applies to other properties with string datatype and labels, as well. --Edgars2007 (talk) 09:05, 17 April 2016 (UTC)
Don't forget that HTML entities can also be written as decimal or hexadecimal numbers. This query includes those as well as HTML tags. Still not very many left for this property though. - Nikki (talk) 12:11, 17 April 2016 (UTC)
Another query, this time including some wiki markup too. There's also Template:Complex constraint which could be used to add constraints to the relevant properties. - Nikki (talk) 12:03, 24 April 2016 (UTC)
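
For the substitution itself, Python's standard library already covers named, decimal and hexadecimal entities through `html.unescape`. The sketch below is a minimal illustration; the wrapper name `unescape_entities` and the sample strings are assumptions, and HTML tags or wiki markup are deliberately not handled.

```python
import html


def unescape_entities(value):
    """Return the string with HTML entities replaced, or None if unchanged."""
    cleaned = html.unescape(value)
    return cleaned if cleaned != value else None


# Named, decimal and hexadecimal forms of the same character.
for sample in ["Caf&eacute;", "Caf&#233;", "Caf&#xE9;", "AT&amp;T", "plain text"]:
    print(repr(sample), "->", repr(unescape_entities(sample)))
```

Note that `html.unescape` follows HTML5 rules and also converts some legacy entities written without a trailing semicolon, so a bot would do better to log every change for review than to save blindly.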

Canadian lakes

@Laurianna2, VIGNERON: As mentioned on WD:Bistro#Liens interlangues, user:Lsj has created more than 50k items about Canadian lakes in svwiki and they need to be linked to Wikidata. Relevant category: sv:Kategori:Insjöar i Kanada. Sample article: sv:Étang de Hart.

We should

  • link articles:
  1. create a new item when the article does not correspond to an existing label (parentheses excluded)
  2. when the title matches an item, either list them to check it by hand, or given that there will be hundreds of them, devise an algorithm to determine if it refers to the same lake based on coordinates.
  • add data
  1. P31: lake (Q23397) and P17:Q16
  2. P131 from the "region" param and GeoNames ID (P1566) from the geonames param of sv:Template:Geobox.
  3. add labels at least in English and French that should be equal to the Wikipedia article. When the title starts with "Lac .. ", "Baie ", "Bassin " or "étang" the first letter should be lower-cased, at least in French.
  4. elevation above sea level (P2044) and coordinate location (P625) from the infobox, GeoNames or wherever.

Most of that can be done with creator.html, autolist, and harvesttemplates, but I think adding labels requires a real bot. If someone can do the whole thing in one go, that would probably be best. -Zolo (talk) 09:34, 30 January 2016 (UTC)
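
Two of the steps listed above lend themselves to a short Python sketch: deciding by coordinates whether an article and an existing item with the same name refer to the same lake, and deriving the French label from the article title. The 1 km threshold, the function names and the coordinates used below are assumptions chosen for illustration, not values agreed in this discussion.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def same_lake(article_coord, item_coord, max_km=1.0):
    """Treat two same-named lakes as identical if they are within max_km."""
    return distance_km(article_coord, item_coord) <= max_km


FR_GENERIC = ("lac ", "baie ", "bassin ", "étang ")


def french_label(title):
    """Drop a trailing parenthesis and lower-case a leading generic term."""
    base = re.sub(r"\s*\([^)]*\)$", "", title)
    if base.lower().startswith(FR_GENERIC):
        return base[0].lower() + base[1:]
    return base


print(same_lake((46.0, -75.0), (46.0005, -75.0005)))  # hypothetical coordinates
print(french_label("Étang de Hart"))
```

Pairs that share a name but fail the distance test would go on the list for manual checking, and a larger threshold may be needed for big lakes whose coordinates can be placed far apart.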

Add Names as labels (Q21640602) can work for labels. It seems that svwiki prefers that we wait a month or so after they created such stubs. Apparently it could happen that they delete entire bot created sets.
--- Jura 09:39, 30 January 2016 (UTC)
Even if the articles end up being deleted in svwiki, I think it makes sense to have the items.
I didn't know about Add Names as labels (Q21640602); that could do the job. So, I guess I can do it with the standard tools, but that will require more than 10 edits per item, so a flood of more than 500k edits in all. Maybe it is better to wait for a bot that does it in fewer edits? --Zolo (talk) 09:57, 30 January 2016 (UTC)
Well, if they delete it, I don't think we want to have it either. It's something Innocent bystander mentioned on some of the other series. As for the number of edits, I'm not sure if it matters.
--- Jura 10:31, 30 January 2016 (UTC)
Yes, I thought about using robots for this project. Where did you see that svwiki wanted to delete this robot's stubs? Btw, I've noticed a lot of mistakes in GeoNames, and it seems that the site has not been updated since December.--Laurianna2 (talk) 19:41, 4 February 2016 (UTC)
If pages like these are deleted on svwiki, it is most likely because mistakes have been found in the database or the quality of the data is poor. Pages are also deleted when Lsj finds mistakes in the bot code. It is then sometimes easier to delete the pages and restart the bot. That is why I recommend that you wait a month. One problem we have detected in Canada is that there are often duplicate items in GeoNames: one item with the English name and one item with the French name. -- Innocent bystander (talk) 07:15, 2 March 2016 (UTC)

Storing ICD9 and 10 codes from EN medical templates


I propose that the ICD9 and ICD10 codes are located on medical templates in the English Wikipedia and stored here.


Most templates associated with WikiProject Medicine on the English Wikipedia have associated ICD9 and ICD10 codes stored in their titles, e.g. [3]

This is an attempt to store related data that is better stored here, on Wikidata. This benefits readers by allowing data to be stored in a more appropriate location, and benefits data handlers by giving them more data to play with and analyse at some future date :)


A previous bot took similar data (Gray's Anatomy and Terminologia Anatomica data) from anatomical templates and stored them here. The bot request for that is here: [4]


Ping to @ValterVB, who was so helpful last time :). --LT910001 (talk) 22:13, 10 February 2016 (UTC)

Just for the record: Wikidata:Bot requests/Archive/2015/02#Move all template ICD9 and ICD10 references to wikidata. --Edgars2007 (talk) 22:31, 10 February 2016 (UTC)

(Voice) actors

I propose to move all cast members for subclasses of animated film (Q202866) from cast member (P161) to voice actor (P725). --Infovarius (talk) 16:53, 14 February 2016 (UTC)

Not all animated films are completely animated. This couldn't be accurately done without some level of manual supervision. --Yair rand (talk) 17:02, 14 February 2016 (UTC)
How many cases do we have of this? --Izno (talk) 12:37, 15 February 2016 (UTC)
On en.wikipedia 270 (more or less) --ValterVB (talk) 19:12, 15 February 2016 (UTC)
Excellent! So we have a small excluding set for manual work, and other huge set for automated work. --Infovarius (talk) 16:41, 20 February 2016 (UTC)
So, @ValterVB:, can you help? Meanwhile some statements with the wrong property are simply being deleted (without being moved to the right property). --Infovarius (talk) 10:59, 18 March 2016 (UTC)
Can anyone make the move for The Little Prince (Q16386722)? It also needs some refinement (the English and the original French voice actors are mixed). --Infovarius (talk) 22:09, 25 April 2016 (UTC)
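
A move of this kind could be prepared as a batch of QuickStatements commands, with the films that need manual supervision kept out of the batch. The Python sketch below is an illustration under assumptions: the function name `move_commands` and the item IDs in the example are hypothetical, and the `-` prefix for removing a statement is the syntax of the later QuickStatements version, so it should be checked against the tool actually used. Qualifiers and references on the old statements are not carried over by this approach.

```python
def move_commands(film_qid, actor_qids, manual_review=frozenset()):
    """Build tab-separated QuickStatements lines that move cast member (P161)
    values of one film to voice actor (P725).

    Films listed in manual_review (e.g. partly live-action films) are skipped.
    """
    if film_qid in manual_review:
        return []
    lines = []
    for actor in actor_qids:
        lines.append(f"{film_qid}\tP725\t{actor}")   # add the voice actor statement
        lines.append(f"-{film_qid}\tP161\t{actor}")  # remove the cast member statement
    return lines


# Hypothetical film and actor IDs, for illustration only.
for line in move_commands("Q1", ["Q2", "Q3"]):
    print(line)
```

Adding the new statement before removing the old one means that an interrupted batch leaves a duplicate rather than a loss of data.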

Sorting flags by level of government

Hello. I'm trying to control constraint violations for applies to jurisdiction (P1001). Could someone please:

  1. For items in w:en:Category:National flags, change instance of flag (Q14660) to national flag (Q186516).
  2. For items in w:en:Category:United States state flags, change instance of flag (Q14660) or national flag (Q186516) to flag of a country subdivision (Q22807280).
  3. For items in subcategories of w:en:Category:Flags of cities by country, change instance of flag (Q14660) to flag of a municipality (Q22807298).

Thank you! --Arctic.gnome (talk) 20:59, 15 February 2016 (UTC)

@Arctic.gnome: Sorry for the delay, I'm ready to do this. Could you please, just in case, provide a rationale for why it is okay to do this task? Matěj Suchánek (talk) 13:32, 16 April 2016 (UTC)

Import number of state representatives in Congress

Please import the total number of seats in the US House of Representatives, as listed in the wikipedia:List_of_U.S._states_and_territories_by_population#States_and_territories table, into each US state. I think Property:P1410 is a perfect candidate for that, as it requires a qualifier stating that this relates to the US House of Representatives. Also, it would be amazing to do the same for other similar legislatures, like the European Parliament. And lastly, historical data is always amazing, if one could find it (this data is connected to the US census). Having this data would allow interesting political visualizations like these demos. --Yurik (talk) 05:05, 17 February 2016 (UTC)

Property:P1410 does not seem suitable to me in this context because the most straightforward interpretation is the number of seats in the state legislature, not the US House of Representatives. It could also be interpreted as the number of seats the state has in the US Senate and US House of Representatives combined. Further confusion results because each state has its own name for its legislature and the houses that make up the legislature. Jc3s5h (talk) 13:07, 6 March 2016 (UTC)
Jc3s5h, I think that's why that property has a mandatory Property:P194 qualifier. So for my request, you can make 3 values in each state: US Congress, US Senate (2 each), and US House of representatives. --Yurik (talk) 09:59, 13 March 2016 (UTC)
I'm not familiar with mandatory qualifiers. Will the UI or the API prevent the storage of entries that lack the mandatory qualifier? Jc3s5h (talk) 13:05, 13 March 2016 (UTC)
Jc3s5h, seems that even though it is "required", the only enforcement comes from the bots at this point. I added a sample entry Q99 - seems to look good. Would be great to automate the import, plus it would be amazing if the historical numbers are also added (they kept changing throughout the history based on the population) --Yurik (talk) 21:39, 13 March 2016 (UTC)

Possible paintings report

Articles about paintings are created all the time, but unfortunately not all of them have an updated item. Every once in a while I use autolist to generate a list of items that are in the category tree under Category:Paintings (Q6009893) and don't have instance of (P31) or location (P276) (for example possible paintings on the English Wikipedia). This only works on a per wiki basis and takes quite a while to load. Does someone feel like building an onwiki report for this?

My approach would probably be:

Anyone up for this? Multichill (talk) 12:07, 21 February 2016 (UTC)

Sounds like something for WikiProject Sum of all paintings.
--- Jura 08:35, 8 April 2016 (UTC)

Taxon labels

For items where instance of (P31)=taxon (Q16521), and where there is already a label in one or more languages which is the same as the value of taxon name (P225), the label should be copied to all other empty, western-alphabet labels. For example, this edit. Please can someone attend to this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:11, 10 March 2016 (UTC)

Do you mean label or alias? I would support the latter where there is already a label and that label is not already the taxon name. --Izno (talk) 17:03, 10 March 2016 (UTC)
No, I mean label; as per the example edit I gave. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:16, 10 March 2016 (UTC)
See your last request: Wikidata:Bot_requests/Archive/2015/08#Taxon_names. --Succu (talk) 18:57, 10 March 2016 (UTC)
Which was archived unresolved. We still have many thousands of missing labels. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:16, 10 March 2016 (UTC)
Nope. There is no consensus doing this. Reach one. --Succu (talk) 20:22, 10 March 2016 (UTC)
You saying "there is no consensus" does not mean that there is none. Do you have a reasoned objection to the proposal? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:56, 10 March 2016 (UTC)
Go back and read the linked discussions. In the nursery of wikidata some communities had strong objections. If they changed their mind my bot can easily execute this job. --Succu (talk) 21:19, 10 March 2016 (UTC)
So that's a "no" to my question, then. I read the linked discussions, and mostly I see people not discussing the proposal, and you claiming "there is no consensus", to which another poster responded "What I found, is a discussion of exactly one year old, and just one person that is not supporting because of 'the gadgets then need to load more data'. Is that the same 'no consensus' as you meant?". There are no reasoned objections there, either. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:24, 10 March 2016 (UTC)
For the lazy ones:
--Succu (talk) 21:53, 10 March 2016 (UTC)
I have already done this for Italian labels in the past. Here are two other proposals: May 2014 and March 2015 --ValterVB (talk) 09:54, 11 March 2016 (UTC)
@ValterVB: Thank you. Can you help across any other, or all, western-alphabet languages, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:18, 16 March 2016 (UTC)
Yes, I can do it, but before modifying 2,098,749 items I think it is necessary to have a strong consensus. --ValterVB (talk) 18:14, 16 March 2016 (UTC)
@ValterVB: Thank you. Could you do a small batch, say 100, as an example, so we can then ask on, say, Project Chat? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:03, 18 March 2016 (UTC)
Simply ask with the example given by you. --Succu (talk) 15:16, 18 March 2016 (UTC)

@Pigsonthewing:

  • Test edit: Q14945671, Q21444273, Q2508347, Q25247.
  • Languages: "en","de","fr","it","es","af","an","ast","bar","br","ca","co","cs","cy","da","de-at","de-ch","en-ca","en-gb","eo","et","eu","fi","frp","fur","ga","gd","gl","gsw","hr","ia","id","ie","is","io","kg","lb","li","lij","mg","min","ms","nap","nb","nds","nds-nl","nl","nn","nrm","oc","pcd","pl","pms","pt","pt-br","rm","ro","sc","scn","sco","sk","sl","sr-el","sv","sw","vec","vi","vls","vo","wa","wo","zu"
  • Rule:

Very important: it is necessary to verify that the list of languages is complete. It is the same one that I use for disambiguation items. --ValterVB (talk) 09:42, 19 March 2016 (UTC)
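
Based on the description at the top of this request, the selection logic could be sketched in Python as follows. This is an assumption about the rule, not the rule used for the test edits: the function name `labels_to_add`, the `skip` parameter and the short language list in the example are hypothetical. Labels are added only where none exists, and only when an existing label already equals the taxon name (P225).

```python
def labels_to_add(labels, taxon_name, languages, skip=()):
    """Return {language: label} for labels that would be added to one item.

    labels     existing labels of the item, keyed by language code
    taxon_name value of taxon name (P225)
    languages  western-alphabet language codes to fill
    skip       language codes to leave out of the run
    """
    if taxon_name not in labels.values():
        return {}  # no existing label matches the taxon name: do nothing
    return {lang: taxon_name
            for lang in languages
            if lang not in skip and not labels.get(lang)}


# Hypothetical existing labels for illustration; the language list is shortened.
print(labels_to_add({"en": "Dayus"}, "Dayus", ["en", "de", "fr", "sv"], skip={"sv"}))
```

The `skip` parameter leaves room for excluding single languages or regional variants from a run without changing the rest of the logic.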

    • I really don't like the idea of this. The label, according to Help:Label, should be the most common name. I doubt that most people are familiar with the Latin names. Inserting the Latin name everywhere prevents language fallback from working and stops people from being shown the common name in another language they speak. A very simple example: Special:Diff/313676163 added Latin names for the de-at and de-ch labels, which now stops the common name from the de label from being shown. - Nikki (talk) 10:29, 19 March 2016 (UTC)
      • @Nikki: The vast majority of taxons have no common name; and certainly no common name in every language. And of course edits can subsequently be overwritten if a common name does exist. As for fallback, we could limit this to "top level" languages. Would that satisfy? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:02, 19 March 2016 (UTC)
        • As far as I'm aware most tools rely on the absence of certain information. Adding #10,000 csv file of Latin / Welsh (cy) species of birds. would be rendered to handcraft. --Succu (talk) 23:11, 19 March 2016 (UTC)
          • Perhaps this issue could be resolved by excluding certain groups? Or the script used in your example could overwrite the label if it matches the taxon name? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 23 March 2016 (UTC)
        • It may be the case that most taxon items won't have a common name in any language, but I don't see anything here which is only trying to target the taxon items which have no common names. Adding the same string to lots of labels isn't adding any new information and as Succu pointed out, doing that can get in the way (e.g. it makes it more difficult to find items with missing labels, it can get in the way when merging (moving common names to the aliases because the target already has the latin name as a label) and IIRC the bot which adds labels for items where a sitelink has been recently added will only do so if there is no existing label). To me, these requests seem like people are trying to fill in gaps in other languages for the sake of filling in the gaps with something (despite that being the aim of the language fallback support), not because the speakers of those languages think it would be useful for them and want it to happen (if I understand this correctly, @Innocent bystander: is objecting to it for their language). - Nikki (talk) 22:40, 22 March 2016 (UTC)
          • Yes, the tolerance against bot-mistakes is limited on svwiki. Mistakes initiated by errors in the source is no big issue, but mistakes initiated by "guesses" done by a bot is not tolerated at all. The modules we have on svwiki have no problem handling items without Swedish labels. We have a fallback-system which can use any label in any language. -- Innocent bystander (talk) 06:39, 23 March 2016 (UTC)
            • @Innocent bystander: This would not involve any "guesses". Your Wikipedia's modules may handle items without labels, but what about third-party reusers? Have you identified any issues with the test edits provided above? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 23 March 2016 (UTC)
              • No, I have not found any issue in the examples. But this is not my subject, I would not see an issue even if it was directly under my nose. Adding correct statements for Scientific names and Common names looks more important here for the third party users than labels, which cannot be sourced. NB, the work of Lsjbot have done that Swedish and Cebuano probably have more labels than any other language in the taxon set. You will not miss much by excluding 'sv' in this botrun. -- Innocent bystander (talk) 07:00, 24 March 2016 (UTC)
                • If a taxon name can be sourced, then by definition so can the label. If you have identified no errors, then your reference to "guesses" is not substantiated. True, adding statements for scientific names and common names is important, but the two tasks are not mutually exclusive, and their relative importance is subjective. To pick one example at random, from the many possible, Dayus (Q18107066) currently has no label in Swedish, and so would benefit from the suggested bot run. Indeed, it currently has only 7 labels, all the same, and all using the scientific name. What are the various European languages' common names for this mainly Chinese genus? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:34, 25 March 2016 (UTC)
          • No, this is not "trying to fill in gaps in other languages for the sake of filling in the gaps". Nor are most of the languages affected served by fallback. If this task is completed, then "find items with missing labels" will not be an issue for the items concerned, because they will have valid labels. Meanwhile, what is the likelihood of these labels being provided manually? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 23 March 2016 (UTC)
            • If this is not trying to fill in the gaps for the sake of filling in the gaps, what problem is it solving and why does language fallback not help? (I'm sure the development team would like to know that language fallback is not working properly.) The taxonomic names are not the preferred labels, and valid is not the same as useful (adding "human" as the description for humans with no description was valid, yet users found it annoying and useless and they were all removed again). The labels for a specific language in that language are still missing even if we make it seem like they're not by filling in all the gaps with taxonomic names; it's just masking the problem. I can't predict the future, so I don't see any point in speculating how likely it is that someone will come along and add common names. They might, they might not. - Nikki (talk) 23:02, 24 March 2016 (UTC)
              • It solves the problem of an external user making a query (say, for "all species in genus X") and being returned Q items with no labels in their language. This could also break third-party applications. In some cases, there is currently no label in any language; how does language fallback work then? How does it work if the external user's language is Indonesian, and there is only an English label saying, say, "Lesser Spotted Woodpecker"? And, again, taxonomic names are the preferred labels for the many thousands of species (the vast majority) with no common name, or with no common name in a given language. The "human" example compares apples with pears. This is a proposal to add specific labels, not vague descriptions (the equivalent would be adding "taxon" as a description). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:26, 25 March 2016 (UTC)
                • Why should an external user query a Wikidata internal called label and not rely on a query of taxon name (P225)? --Succu (talk) 22:04, 25 March 2016 (UTC)
                  • For any of a number of reasons; not least that they may be querying things which are not all taxons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:32, 26 March 2016 (UTC)
                    • Grand answer. Maybe they are searching the labels for aliens, gods, fairy tales or something else? A better solution would be if Wikibase could be configured to take certain properties, such as taxon name (P225) or title (P1476), as the default value for a language-independent label. --Succu (talk) 21:09, 27 March 2016 (UTC)
                      • Maybe it could. But it is not. That was suggested a year or two ago, in the discussions you cited above, and I see no move to make it so, nor any significant support for doing so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:19, 27 March 2016 (UTC)
                        • So what? Did you reach an agreement with svwiki, cebwiki, warwiki, viwiki or nlwiki that we should go along with your proposed way? --Succu (talk) 21:43, 27 March 2016 (UTC)
    • @ValterVB: Thank you. I think your rules are correct. I converted the Ps &Qs in your comment to templates, for clarity. Hope that's OK. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:02, 19 March 2016 (UTC)
  • Oppose That the majority of taxons do not have a common name does not mean that all western languages should automatically use the scientific name as label. Matěj Suchánek (talk) 13:23, 16 April 2016 (UTC)
    • Nobody is saying "all western languages should automatically use the scientific name as label"; if an item already has a label, it won't be changed. If a scientific name is added as a label where none existed previously, and that label is then changed to some other valid string, the latter will not be overwritten. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:31, 20 April 2016 (UTC)

We seem to have reached a stalemate, with the most recent objections being straw men, or based on historic and inconclusive discussions. How may we move forward? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:28, 16 May 2016 (UTC)

That's simple: drop your request. --Succu (talk) 18:33, 16 May 2016 (UTC)
Were there a cogent reason to, I would. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:57, 17 May 2016 (UTC)

Add labels from sitelinks

There used to be a bot that added labels based on sitelinks (enwiki sitelink => en label). I think it stopped running at some point. Maybe some alternative should be found.
--- Jura 08:32, 8 April 2016 (UTC)

I have seen that Pasleim's bot is doing some work in this area, at least for German and French. --Edgars2007 (talk) 16:20, 9 April 2016 (UTC)
I do it for all the languages, but only for items that have one of these values in instance of (P31):

There is the problem with uppercase/lowercase --ValterVB (talk) 16:30, 9 April 2016 (UTC)

Another rule that I use: add the label if the first character of the sitelink is one of the characters in this list:
  • (
  • !
  • ?
  • "
  • $
  • '
  • ,
  • .
  • /
  • 0
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9

If you have other suggestions, I can add them --ValterVB (talk) 16:41, 9 April 2016 (UTC)

  • Comment Just to make sure this is clear: this is mainly for items that exist and to which someone manually added a sitelink to, e.g., enwiki, but where the item doesn't have a label in the corresponding language yet. It does not concern items that lack an English label but also have no sitelink to enwiki. I don't think search finds such items if they have no label defined at all. It's key that at least a basic label is defined for such items.
    If you are looking for rules to implement, then try the ones used by PetScan (Q23665536). It mainly removes disambiguators in round brackets. I think this works fine for Wikipedia. A large amount of pages are created that way. It might not work well for Wikisource.
    --- Jura 10:50, 10 April 2016 (UTC)
Jura, these rules are applied only to items that have a sitelink but don't have a label in the language of the sitelink. I check all the sitelinks that end with "wiki", except "commonswiki", "wikidatawiki", "specieswiki", "metawiki" and "mediawikiwiki", and I delete disambiguators in parentheses. --ValterVB (talk) 12:13, 10 April 2016 (UTC)
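
The rules described in this thread can be sketched roughly as follows. This is only an illustration, not the code of any bot mentioned here: the function names are made up, the mapping from site key to label language is a naive assumption (it will be wrong for sites such as simplewiki or be_x_oldwiki), and the uppercase/lowercase problem raised above is not addressed.

```python
import re

# Site keys ending in "wiki" that are skipped, as listed in the thread.
EXCLUDED_SITES = {"commonswiki", "wikidatawiki", "specieswiki", "metawiki", "mediawikiwiki"}

# A disambiguator in round brackets at the very end of a page title.
_TRAILING_PARENS = re.compile(r"\s*\([^()]*\)\s*$")


def site_language(site):
    """Return the label language for a site key such as 'enwiki', or None if skipped."""
    if not site.endswith("wiki") or site in EXCLUDED_SITES:
        return None
    # Assumption: the language code is the site prefix with '_' replaced by '-'.
    return site[: -len("wiki")].replace("_", "-")


def label_from_title(title):
    """Strip one trailing '(disambiguator)'; keep the title if nothing would remain."""
    stripped = _TRAILING_PARENS.sub("", title).strip()
    return stripped or title


def missing_labels(sitelinks, labels):
    """Propose labels only for languages that have a sitelink but no label yet."""
    proposals = {}
    for site, title in sitelinks.items():
        lang = site_language(site)
        if lang is None or lang in labels:
            continue
        proposals[lang] = label_from_title(title)
    return proposals
```

Existing labels are never overwritten in this sketch; only the gaps are filled, which matches the scope Jura describes.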

Import GNIS ID (P590) from en:Template:Infobox settlement

We have started to use that property on frwiki, through Template:Infobox settlement (Q23779748), and as you can see in fr:Catégorie:Page utilisant P590. Thank you to any bot who would do this! Thierry Caro (talk) 04:33, 11 April 2016 (UTC)

Do you have some examples where it hasn't been imported? I already added thousands of those a couple of months ago. - Nikki (talk) 18:45, 12 April 2016 (UTC)
I was about to mention some place like Cheraw (Q1070214), but apparently you've found this and have already added the data. Thank you! Thierry Caro (talk) 17:16, 13 April 2016 (UTC)
The same thing with FIPS 55-3 (locations in the US) (P774) would be awesome, by the way. Thierry Caro (talk) 17:17, 13 April 2016 (UTC)

Oh! I found an example of missing GNIS ID (P590). Chincoteague (Q1073686) for instance. Thierry Caro (talk) 17:19, 13 April 2016 (UTC)

@Thierry Caro: Thanks! :) It turns out I hadn't actually finished importing the ones I'd started to import before and I'd missed some. Most of them should be there now, we now have twice as many as before. :D There's still a few hundred left that I'm going to look at once the GNIS site is back up (they're quite inconsistently represented in Wikipedia so there could be more I missed, so feel free to continue linking me to any that are missing if you want to).
Regarding FIPS codes, how urgent is it? I'm currently trying to write a bot to add information from GNIS and the US census data (and hopefully also look for information that's wrong or add missing references if I get that far). It looks like the data includes FIPS codes, so I should be able to add them from there once I'm far enough with the bot that I can add data. That would be easier than trying to extract data from templates (and I could add references too).
- Nikki (talk) 13:07, 16 April 2016 (UTC)
OK. Perfect! I'll wait for the bot, don't worry! Thierry Caro (talk) 13:50, 16 April 2016 (UTC)

Add P1082 (population) and P585 (point in time) from PLwiki to Wikidata

Looks like PLwiki has lots of population information that other wikis do not have. It would be useful to have it for all of us. בורה בורה (talk) 18:23, 12 April 2016 (UTC)

It might be helpful to give some supporting links here, to be sure to get the right information from the right place into the right fields. Can you list one pl-article and one corresponding Wikidata item that is manually filled with the desired information? Then I can see if I can get the information filled in by a script in the same way. Edoderoo (talk) 18:26, 16 April 2016 (UTC)
Edoderoo sorry for the late reply. I was on vacation. Take for example the article "Żołynia" in PLwiki. It has a population of 5188 as of 2013. However this information does not exist on Wikidata item (Q2363612). There are thousands of examples like this, but you got the idea... PLwiki is really great on population. Share it with us all. בורה בורה (talk) 10:19, 4 May 2016 (UTC)

Take care of disambiguation items

Points to cover

Somehow it should be possible to create a bot that handles disambiguation items entirely. Not sure what are all the functions needed, but I started a list on the right side. Please add more. Eventually a Wikibase function might even do that.
--- Jura 13:36, 18 April 2016 (UTC)

Empty disambiguation: Probably @Pasleim: can create User:Pasleim/Items for deletion/Disambiguation. Rules: item without sitelinks, with a P31 that has only one value: Wikimedia disambiguation page (Q4167410). For the other points my bot already does something (for my bot, a disambiguation is an item with a P31 that has only one value: Wikimedia disambiguation page (Q4167410)). Descriptions: I use the descriptions used in autoEdit. Labels: I add the same label for all the Latin-script languages, but only if all the sitelinks, without disambiguators, are the same. With these two operations I detect a lot of duplicates: same label + description. For now the list is very long (maybe >10K items), but it isn't possible to merge them automatically: too many errors. Another thing to do is to normalize the descriptions; there are a lot of items with non-standard descriptions. --ValterVB (talk) 18:02, 18 April 2016 (UTC)
  • Personally, I'm not that much worried about duplicate disambiguation items. Mixes between content and disambiguations are much more problematic. It seems they keep appearing through problems with page moves. BTW, I added static numbers to the points.
    --- Jura 10:06, 19 April 2016 (UTC)
    You will always have duplicate disambiguation items, since svwiki has duplicate disambiguation pages. Some of these duplicates exist because they cover different topics, and some of them exist because the pages would otherwise become too long. A third category is the bot-generated duplicates. They should be treated as temporary, until a carbon-based user has merged them.
    And how are un-normalized descriptions a problem? -- Innocent bystander (talk) 10:58, 19 April 2016 (UTC)
About "un-normalized descriptions": for example, I have a disambiguation item with label "XXXX" and description "Wikipedia disambiguation". If I create a new item with label "XXXX" and description "Wikimedia disambiguation", I don't see that a disambiguation item "XXXX" already exists. If the description is "normalized", I see immediately that the disambiguation already exists, so I can merge it. --ValterVB (talk) 11:10, 19 April 2016 (UTC)
For some fields, this proved quite efficient. If there are several items that can't be merged, at some point there will be something like "Wikimedia disambiguation page (2)", etc.
--- Jura 12:10, 19 April 2016 (UTC)
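
A minimal sketch of the duplicate check ValterVB describes, assuming the items have already been reduced to (identifier, label, description) tuples. The list of description variants and the canonical wording are illustrative guesses, not a description of what any bot actually uses.

```python
from collections import defaultdict

# Assumed canonical English description and a hypothetical set of variants.
CANONICAL = "Wikimedia disambiguation page"
VARIANTS = {
    "wikipedia disambiguation",
    "wikimedia disambiguation",
    "wikipedia disambiguation page",
    "wikimedia disambiguation page",
}


def normalize_description(description):
    """Map known variants to the canonical description; leave anything else alone."""
    key = " ".join(description.lower().split())
    return CANONICAL if key in VARIANTS else description


def find_conflicts(items):
    """Group items sharing the same label and normalized description.

    `items` is an iterable of (identifier, label, description) tuples.
    Only groups with more than one item are returned, as merge candidates.
    """
    groups = defaultdict(list)
    for identifier, label, description in items:
        groups[(label, normalize_description(description))].append(identifier)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

The output is only a list of candidates for review; as noted above, automatic merging produces too many errors.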

Lazy start for point (4): 47 links to add instance of (P31)=Wikimedia disambiguation page (Q4167410) to items without statements in categories of sitelinks on Category:Disambiguation pages (Q1982926): en, simple, sl, lv, da, bg, ja, ka, la, es, fi, sv, hy, ba, ca, bs, ru, pl, et, lt, uk, eo, it, mk, kk, pt, sh, nl, id, el, fr, az, hr, sr, de, tr, be, hu, sq, nn, eu, be_x_old, sk, ro, no, cs, zh,
--- Jura 12:07, 23 April 2016 (UTC)

The biggest problem is to define what pages are disambiguation pages, given names and surnames. For example Backman (Q183341) and Backman (Q23773321). I don't see what is the difference between enwiki and fiwiki links. Enwiki page is in category "surnames" and fiwiki page in categories "disambiguation pages" and "list of people by surname", but the page in fiwiki only contains surnames, so basically it could be in the same item as the enwiki link. --Stryn (talk) 13:10, 23 April 2016 (UTC)

I think people at Wikidata could be tempted to make editorial decisions for Wikipedia, but I don't think it's up to Wikidata to determine what Wikipedia has to consider a disambiguation page. If a language version considers a page to be a disambiguation page, then it should go on a disambiguation item. If it's an article about a city that also lists similarly named cities, it should be on an item about that city. Even if some users at Wikidata attempted to set "capital" to a disambiguation page as Wikipedia did the same, such a solution can't be sustained. The situation for given names and family names isn't much different. In the meantime, at least it's clear which items at Wikidata have what purpose.
--- Jura 14:20, 23 April 2016 (UTC)
You then have to love Category:Surname-disambigs (Q19121541)! -- Innocent bystander (talk) 14:35, 23 April 2016 (UTC)
IMHO: In Wikipedia, disambiguation pages are pages that list pages, or possible pages, that have the same spelling; no assumption should be made about the meaning. If we limit the content to a partial set by some specific criterion, we don't have a disambiguation page but a list (e.g. a list of persons with the same surname: List of people with surname Williams (Q6633281)). These pages must use the __DISAMBIG__ tag to let bots and humans distinguish, without doubt, a disambiguation from a different item. In Wikidata, disambiguation items are items that connect disambiguation pages with the same spelling. --ValterVB (talk) 20:02, 23 April 2016 (UTC)

Disambiguation item without sitelink --ValterVB (talk) 21:30, 23 April 2016 (UTC)

I'd delete all of them.
--- Jura 06:13, 24 April 2016 (UTC)

Some queries for point (7):

A better way needs to be found for (7a).
--- Jura 08:07, 25 April 2016 (UTC)

I brought up the question of the empty items at Wikidata:Project_chat#Wikidata.2C_a_stable_source_for_disambiguation_items.3F.
--- Jura 09:39, 27 April 2016 (UTC)

As this is related: Wikidata:Project chat/Archive/2016/04#Deleting descriptions. Note, that other languages could be checked. --Edgars2007 (talk) 10:30, 27 April 2016 (UTC)

I don't mind debating if we should keep or redirect empty disambiguation items (if admins want to check them first ..), but I think we should avoid recycling them for anything else. --- Jura 10:34, 27 April 2016 (UTC)
As it can't be avoided entirely, I added a point 10.
--- Jura 08:32, 30 April 2016 (UTC)
Point (3) and (10) are done. For point (2) I created User:Pasleim/disambiguationmerge. --Pasleim (talk) 19:22, 2 July 2016 (UTC)
Thanks, Pasleim.
--- Jura 05:02, 11 July 2016 (UTC)
  • Matěj Suchánek made User:MatSuBot/Disambig errors which covers some of 7b.
    Some things it finds:
    • Articles that are linked from disambiguation items
    • Disambiguation items that were merged with items for concepts relevant to these articles (maybe we should check items for disambiguation with more than a P31-statement or attempt to block such merges)
    • Pages in languages where the disambiguation category isn't correctly set up or recognized by the bot (some pages even have "(disambiguation)" in the page title), e.g. Q27721 (36 sitelinks) – ig:1 (disambiguation)
    • Pages in categories close to disambiguation categories. (e.g. w:Category:Set indices on ships)
    • Redirects to non-disambiguations (e.g. Q37817 (27 sitelinks) idwiki – id:Montreuil – redirects to id:Komune di departemen Pas-de-Calais (Q243036, not a disambiguation))

Seems like an iceberg. It might be easier to check these by language and once the various problems are identified, attempt to sort out some automatically.
--- Jura 05:02, 11 July 2016 (UTC)

Note that my bot only recognizes pages with the __DISAMBIG__ magic word as disambiguations. If you want a wiki-specific approach, I can write a new script which will work only for chosen wikis. Matěj Suchánek (talk) 09:12, 12 July 2016 (UTC)
  • Step #4 should be done for now. The above list now includes links for 160+ sites.
    --- Jura 22:02, 5 August 2016 (UTC)
  • For step #3a, there is now Phab:T141845
    --- Jura 22:30, 5 August 2016 (UTC)
List of disambiguation item with conflict on Label/description --ValterVB (talk) 13:57, 6 August 2016 (UTC)

Import P569/P570 dates from slwiki (text)

Wiki | slwiki
Items without P569 | count (all)
As of Oct 12 | 11116 (31 %)

slwiki has dates in the formats "* YYYY" and "† YYYY." or more precise. These could be imported by bot. --- Jura 07:50, 11 October 2015 (UTC)

There seem to be a lot of dates with year-precision only.
--- Jura 06:15, 24 April 2016 (UTC)

Exploitation visa number


Can we add the exploitation visa number (P2755) (number of the exploitation visa of a movie in France) to all movies available on the website of the CNC? Maybe the bot can compare the label in French with the title of the movie, or compare the year, the duration, the country, etc.

Then, can it add the CNC film rating (P2758) with :

It's written in the legal notice:

Sauf mention particulière, toute reproduction partielle ou totale des informations diffusées sur le site internet du CNC est autorisée sous réserve d’indication de la source.
Unless otherwise stipulated, any total or partial reproduction of the information published on the CNC website is authorized subject to indication of the source.

--Tubezlob (🙋) 16:58, 26 April 2016 (UTC)

Integrate data about the relationships from the Foundational Model of Anatomy into Wikidata

Most anatomical concepts on Wikidata already have a Foundational Model of Anatomy ID. On the other hand, they lack the information about hypernyms, holonyms and meronyms that is found in the Foundational Model of Anatomy ontology.

On their website they describe the availability of the database:

The Foundational Model of Anatomy ontology is available under a Creative Commons Attribution 3.0 Unported License. It can be accessed through several mechanisms:

1. The latest OWL2 files are available at These can be viewed in the latest version of Protege.

Furthermore I think that valuable infoboxes could be created based on the data from the FMA-ontology within Wikipedia.

ChristianKl (talk) 20:47, 26 April 2016 (UTC)

  • Our database is CC0, not CC3.0 Unported. This would be a copyright problem. --Izno (talk) 14:03, 27 April 2016 (UTC)
    • CC3.0 Unported doesn't require derivative works to use CC3.0 Unported. It's not a share-alike license. What's required is attribution. If every entry in Wikidata cited the FMA as the source, Wikidata would fulfill the attribution requirement and thus the terms of the license. ChristianKl (talk) 09:47, 18 May 2016 (UTC)

Import of data from UNESCO Institute for Statistics

Hi all

I'd like to request the import of some data from the UNESCO Institute for Statistics, specifically data on out-of-school children. I have been working with Jens Ohlig to prepare a spreadsheet to import the data on a Google Doc which is available here and have created the property Number of Out of School Children.

If you have any questions, or if I can help with the upload, please let me know.


John Cummings (talk) 12:54, 27 April 2016 (UTC)

Does Q222#P2573 look good? --Pasleim (talk) 13:24, 27 April 2016 (UTC)
Hi @Pasleim:, I'm sorry, I need to sort out the properties and qualifiers in the sheet, I missed a step before sending this by mistake, I'll separate them all out and then ping you again. Thanks very much for your help :) John Cummings (talk) 14:19, 28 April 2016 (UTC)

Mother/father/brother etc.

Can some bot (regularly) update these statements, putting information in all relevant items? --Edgars2007 (talk) 14:57, 12 May 2016 (UTC)

There is a list at User:Yamaha5/List_of_missingpairs_querys.
--- Jura 07:59, 20 May 2016 (UTC)
The bot LandesfilmsammlungBot is currently under test for this request --Landesfilmsammlung (talk) 13:01, 2 June 2016 (UTC)
User:Landesfilmsammlung: Please check the P31 values as suggested on Yamaha5's list. This is to avoid edits like the one on Q629347.
--- Jura 05:26, 3 June 2016 (UTC)
Oh thanks... I will fix it... Can someone correct the item Samuel (Q629347)? The children property there seems very unusual. --Landesfilmsammlung (talk) 11:55, 3 June 2016 (UTC)
Landesfilmsammlung: the general idea is that that someone should be you. You might want to keep an eye on the constraint violation reports for properties you added a day or two earlier.
--- Jura 09:42, 4 June 2016 (UTC)

labels from name properties

For people for whom we know family name (P734) and given name (P735), it could be possible to autogenerate a label in several relevant languages following certain rules. Could someone set up a robot to do this? author  TomT0m / talk page 10:13, 15 May 2016 (UTC)
Ash Crow
Harmonia Amanda
Чаховіч Уладзіслаў
Place Clichy

Notified participants of Wikiproject Names

@TomT0m: I can imagine working on this but I feel it can be controversial (therefore I want more comments on this). Query for this: Matěj Suchánek (talk) 09:45, 12 July 2016 (UTC)
I have seen people insist on one item for each spelling of a name, which means an approach like this would be unreliable (at best) when languages don't copy the original spelling. I think something based on this idea could work though if it takes into account things like where the person is from and the target language (and it would be better if people who speak the target language can confirm that they would expect the original spelling to be used for all people from those countries, because there might be things which are different that we're not aware of).
For example, I can't think of many examples of British people whose names are written differently in German so if a person is British and the names match the English label, using the same label for the German label sounds like it would be very unlikely to cause a problem. At the other extreme, Japanese writes almost all foreign names in katakana based on the pronunciation, so Michael Jackson (Q2831) (American), Michael Mittermeier (Q45083) (German) and Michael Laudrup (Q188720) (Danish) are all written differently in Japanese despite all having the same given name (P735) Michael (Q4927524) statement.
Generating the expected name from the statements and comparing it to the most appropriate label seems like a good sanity check. If the name expected from the statements doesn't match the actual label, there must be a reason for it. Some of the labels or statements could be wrong and need fixing, or perhaps the person is most commonly known by a different name.
Looking at that query, a few things already stand out to me: It says "Kirsten Johnson" for Czech for Kirsten Johnson (Q6416089), but the Czech sitelink for Betsey Johnson (Q467665) is "Betsey Johnsonová". For Azeri it also says "Kirsten Johnson", but the Azeri sitelink for Boris Johnson (Q180589) is "Boris Conson". It says "Bert Jansen (příjmení)" for Czech for Bert Jansen (Q1988186).
- Nikki (talk) 10:36, 15 July 2016 (UTC)
I support doing this, per the proposal, for some languages, but not for others. I'd be happy to collaborate on drawing up a "safe list" of languages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:41, 12 July 2016 (UTC)
P.S. See also #Taxon labels, above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:00, 12 July 2016 (UTC)
Just be careful. Names that originate in Cyrillic script are transcribed differently depending on the original language and the language they are transcribed into. In Swedish we also have different transcriptions for Middle East Arabic and North African Arabic names. -- Innocent bystander (talk) 19:14, 12 July 2016 (UTC)
To be honest @TomT0m: I'm really really wary about this. I saw too many P735/P734 errors to believe we can accurately expand on it. I saw too many P735/P734 correct uses which don't permit to deduce label (for example, second or third given name, with the pseudonym-family name, noble people, etc.). It seems to me like every label should be checked manually and that's not possible with a bot. @Ash_Crow: had started working on something a little different: if people had the exact same label in English, French and German, he expanded the labels to all languages in Latin script and same naming usages. --Harmonia Amanda (talk) 09:32, 20 July 2016 (UTC)
@Harmonia Amanda: What's worse: having an item with no label at all, so possibly very difficult to identify, or an item with a close but inaccurate label? Considering that the labels are probably missing in a very large number of languages in most cases, I think imperfect information is better than no information at all. Plus, a good way to improve the quality of data is to actually use it, to start spotting and correcting errors. I guess that, to be really useful and easy to maintain, the bot should check whether a name property has been modified since the last time it set the label. Hence a correction would propagate to each language in a minimal number of edits, and it would be clear that we should focus on the naming properties to optimize the interproject effort. author  TomT0m / talk page 09:41, 20 July 2016 (UTC)
Given the number of errors already present, especially for ancient/medieval people (Romans for a start) when names used to be translated, I'd be *very* careful.
I suggest you limit your action to people who:
  1. lived in the last century or so
  2. already have at least one label, in some language, matching the first + last name combination
  3. don't have a pseudonym (P742) or nickname (P1449).
Ash Crow (talk) 10:53, 20 July 2016 (UTC)
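
The three restrictions suggested above could be expressed as a filter like the one below. It is a sketch only: the flat `person` dictionary, its keys, the cut-off year of 1900 and the function names are all hypothetical, and the "given name + family name" order is itself an assumption that does not hold for every language, as discussed in this thread.

```python
def expected_name(given_name, family_name):
    """Assumed naming pattern: given name followed by family name."""
    return f"{given_name} {family_name}"


def is_safe_candidate(person, earliest_birth_year=1900):
    """Apply the three restrictions: recent, already labelled that way, no pseudonym or nickname."""
    birth_year = person.get("birth_year")
    if birth_year is None or birth_year < earliest_birth_year:
        return False
    if person.get("pseudonyms") or person.get("nicknames"):
        return False
    name = expected_name(person["given_name"], person["family_name"])
    # At least one existing label must already match the computed name.
    return name in person.get("labels", {}).values()


def propose_labels(person, safe_languages):
    """Propose the computed name for safe-list languages that have no label yet."""
    if not is_safe_candidate(person):
        return {}
    name = expected_name(person["given_name"], person["family_name"])
    labels = person.get("labels", {})
    return {lang: name for lang in safe_languages if lang not in labels}
```

The `safe_languages` argument stands in for the "safe list" of languages proposed earlier in this section; which languages belong on it is exactly the open question here.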
Strong reticence. This task would lead to too many false positives. Furthermore, there is an active community around the question of names, available to work on these topics. The bot would impede their work. --Dereckson (talk) 12:12, 20 July 2016 (UTC)
@Dereckson, Ash Crow, Harmonia Amanda: Can you be (publicly) more specific on how the work would be impeded? Maybe solutions can be found to make everyone happy.
I already have a suggestion: add the computed label only as an alias. author  TomT0m / talk page 12:28, 20 July 2016 (UTC)

date of birth (P569) with century precision (7)

For date of birth (P569) with century precision (7), values could be changed

Change | | Sample (displays as "20. century") | WQS
From | +(century)00-00-00T00:00:00Z/7 | +2000-00-00T00:00:00Z/7 | 2000-01-01
To | +(century-1)01-00-00T00:00:00Z/7 | +1901-00-00T00:00:00Z/7 | 1901-01-01

For dates of birth, it seems that for century precision "dates", it would be better to specify the first year in the century rather than the last one.

When queried at WQS these appear as January 1.
--- Jura 07:38, 16 May 2016 (UTC)

Oppose With the current implementation of the time datatype, lower-order elements can be omitted for reduced precision without the need to do any further calculations. --Pasleim (talk) 09:14, 16 May 2016 (UTC)
That actually leads you to mix up 20th-century people (born maybe 1910) with people from the 21st century (born 2005).
--- Jura 09:59, 16 May 2016 (UTC)
I don't understand your example. A person born in 1910 has the value +1910-00-00T00:00:00Z/9, born in 2005 the value +2005-00-00T00:00:00Z/9, born in the 20th century the value +2000-00-00T00:00:00Z/7 and born in the 21st century the value +2100-00-00T00:00:00Z/7. If precision 9 is given, you have to omit everything except the first 4 digits; with precision 7 you have to omit everything except the first 2 digits. --Pasleim (talk) 10:34, 16 May 2016 (UTC)
The sample I have in mind would be a person born (maybe) in 1910 using +2000-00-00T00:00:00Z/7 compared to a person born in 2005 using +2005-00-00T00:00:00Z/9 . If you use just wdt:P569 rounded to the century digits you would get "20" for both.
--- Jura 15:30, 16 May 2016 (UTC)
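
To make the two positions concrete, here is a small sketch of the proposed change and of the collision Jura describes. It works on strings written the way they appear in this thread ("value/precision"); the function names are made up, and only positive four-digit years with century precision are handled.

```python
import re

# Matches values as written in this thread, e.g. "+2000-00-00T00:00:00Z/7".
_CENTURY_VALUE = re.compile(r"^\+(\d{4})-00-00T00:00:00Z/7$")


def shift_century_value(value):
    """Rewrite a century-precision value from the last year of the century to the first."""
    match = _CENTURY_VALUE.match(value)
    if match is None:
        raise ValueError(f"not a century-precision value: {value!r}")
    year = int(match.group(1))
    if year == 0 or year % 100 != 0:
        # Not stored as an end-of-century year, so leave it untouched.
        return value
    return f"+{year - 99:04d}-00-00T00:00:00Z/7"


def leading_two_digits(value):
    """The naive 'round to century digits' step: keep the first two digits of the year."""
    return int(value[1:3])
```

With the current values, `leading_two_digits` returns 20 both for `+2000-00-00T00:00:00Z/7` (20th century) and for `+2005-00-00T00:00:00Z/9` (born 2005), which is the mix-up described above when the precision is ignored. After `shift_century_value`, the 20th-century value becomes `+1901-00-00T00:00:00Z/7` and yields 19, so the two no longer collide. Pasleim's point still stands that a consumer who reads the precision does not need the shift.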

Undo edits by Edoderoobot

User:Edoderoobot has added thousands of incorrect instance of (P31) statements (see User talk:Edoderoobot#Hundreds_.28if_not_thousands.29_of_incorrect_P31_statements). It's now been nearly 10 weeks since User:Edoderoo was first told about it and despite numerous messages, I have seen no progress at all. Although Wikidata:Bots makes it quite clear that it's Edoderoo's responsibility to clean up the mess, the incorrect statements have been there far too long already and I don't want them to continue being there indefinitely so I'm asking here to see if someone else is willing to help.

The problematic statements are instance of (P31) statements with imported from (P143) Swedish Wikipedia (Q169514) as a reference, so I'd like a bot to go through all the edits by Edoderoobot and remove any statement it added which matches that as being questionable. If the bot is able to determine whether the statement is actually correct and only remove the incorrect ones, that would be a bonus.

- Nikki (talk) 17:32, 21 May 2016 (UTC)

@Alphos: Could you help here? Matěj Suchánek (talk) 12:30, 22 May 2016 (UTC)
Most definitely can, but I just noticed this ping, and it's getting awfully late for me - and unlike some, I'd rather be available to shut RollBot down immediately, should anything undesirable happen ;)
The last "offending" edit seems to be on May 5th (and Edoderoobot seems to be inactive since then), but could you point me to the first one, or at least give me the date ? If not don't worry, I'll find it tomorrow.
Another note is that some of these edits appear to be legit, and RollBot cannot discriminate : should it revert all of them nonetheless ?
Alphos (talk) 22:37, 22 May 2016 (UTC)
@Nikki: It's been a few days, and I haven't started RollBot on the task yet - good thing, as it turns out I was thinking of another bad set of edits Edoderoobot made, for which I'll probably contact Tacsipacsi, EncycloPetey, Tobias1984 and Multichill to offer RollBot's services, when the task at hand is done.
I really need more details (first and last edit in that set of "bad" edits, mainly), and possibly a decision as to "nuking" (reverting pages to their former state regardless of edits made by other users since, thus also reverting subsequent edits by other people); RollBot usually doesn't nuke, leaving the pages with edits by other users untouched and listing them instead, but it can nuke.
It may seem counterintuitive, but my bot doesn't technically revert edits, it brings pages/entities to their prior state, and there is a difference.
Alphos (talk) 13:46, 26 May 2016 (UTC)
I'm really not sure when it started or ended. :( The biggest problem for me is finding the edits. Edoderoobot was doing multiple bot things simultaneously, so the edits adding incorrect P31 statements are mixed in with thousands of unrelated edits to descriptions, and there are far more edits than I can possibly go through by hand, so I can't actually find all the bad edits. The examples I have to hand are from the 2nd and 11th of March. I'm not aware of any bad edits since I first reported it on the 15th of March (but I could just have missed them amongst all the other edits) and I think I've also seen bad edits from February.
Since there were multiple things happening at the same time, reverting everything between two dates would also revert lots of unrelated edits. I'm not sure how many of those also had issues. It would work a lot better if it could filter based on the edit summary, the descriptions have a different summary to the P31 ones.
I'm not sure about nuking. Of the handful of examples I have to hand, most of them have been edited since. Some are good, some are bad (based on the bad bot edits), others just undid the bad edits. If there were a list of items (and it's not too many), I could maybe check them by hand, but like I said, I can't even find the bad edits. :/ - Nikki (talk) 14:52, 26 May 2016 (UTC)
Bots that do several tasks at once are a nightmare, and it's even worse when the tasks aren't thoroughly tested first >_<
But now that I have the dates (further in the past than I initially thought), I can try and see if there's a way to do it (whether for instance there are time slots where Edoderoobot worked on one task rather than another, and work in segments on the slots for that P31 task), or if, on your suggestion, I can (or should) alter RollBot to add a summary matching condition, whether by blacklisting or whitelisting or both - this alteration will however take me some more time to get into, my health being what it is.
I'll keep you posted either way.
I'll also ask the contributors to the second complaint I saw on that bot's talk page if they want me to do anything about it.
Alphos (talk) 15:30, 26 May 2016 (UTC)
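The "summary matching condition" discussed above could look roughly like the following sketch. It does not describe how RollBot works; the edit records, field names and summary strings are hypothetical stand-ins for whatever a contributions listing would return.

```python
import re
from datetime import date

def select_edits(edits, start, end, summary_pattern):
    """Keep only edits inside [start, end] whose summary matches the pattern."""
    pattern = re.compile(summary_pattern)
    return [
        edit for edit in edits
        if start <= edit["date"] <= end and pattern.search(edit["summary"])
    ]

# Hypothetical sample data: one P31 edit mixed in with description edits.
edits = [
    {"item": "Q_A", "date": date(2016, 3, 2), "summary": "created claim P31"},
    {"item": "Q_B", "date": date(2016, 3, 2), "summary": "added description nl"},
    {"item": "Q_C", "date": date(2016, 5, 5), "summary": "created claim P31"},
]

candidates = select_edits(edits, date(2016, 2, 1), date(2016, 3, 15), r"\bP31\b")
print([edit["item"] for edit in candidates])
```

Filtering on the summary as well as the date range is what keeps the unrelated description edits out of the revert list.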
@Nikki: It seems that, for Edoderoobot's March spree, Stryn has removed already a significant chunk using Autolist (example) - more or less all the ones I looked at before starting RollBot.
It doesn't excuse the fact Edoderoo didn't do it in over a month, with Stryn having done it in late April instead. It does however mean that RollBot is probably unneeded here.
On another note, thanks for the summary matching suggestion, I'll definitely think of implementing it ;)
Alphos (talk) 16:31, 26 May 2016 (UTC)
PS: I'll contact Tacsipacsi, EncycloPetey, Tobias1984 and Multichill, to see if they need help with their issue with Edoderoobot.
I don't need help, thanks. I work on a very limited set of items here regularly. My issue was the addition of English phrases to data items as Dutch labels, and the bot continued to make the same mistake after the user was alerted and responded. For the literary entries where I saw this happen, I've cleaned up the problems already. For the taxonomic entries, I haven't looked them over because the issue will be far more complicated: many plants, animals, and other organisms will have no Dutch name and will be known only by their Latin binomial. --EncycloPetey (talk) 17:47, 26 May 2016 (UTC)
Thanks for your message :)
I noticed it too after reading the section of Edoderoo's talk page you replied in. The same occurred for categories, where the English labels were pretty much copy-pasted into the Dutch labels, which Multichill made a note of.
Thanks also for your work rolling back those changes. Next time, you may consider asking RollBot to do it for you ;) It's a bit crude ("Hulk SMASH !", if you will), but it does the deed !
I'll wait for the other users to chime in on what they noticed.
Alphos (talk) 18:31, 26 May 2016 (UTC)
I have reverted an enormous number of edits, especially the ones where the source was sv-wiki and the item didn't have an sv-wiki-link. I believe this request can therefore be archived? Edoderoo (talk) 17:03, 11 August 2016 (UTC)

Bloomberg Private Company Search

Crawl all ~300,000 companies and add them to Wikidata.  – The preceding unsigned comment was added by (talk • contribs) at 16:09, 23 May 2016 (UTC).

(related) I've researched and spoken to Bloomberg employees previously about importing their symbols (BBGID). I've tried quickly proposing clear-cut properties, with some taking nearly a year to be approved (which is what you'd need). Disappointingly, we've imported notability from Wikipedia, with people worrying about too many items. There are also significant structural problems with Wikidata because it's a crappy mirror of Wikipedia (and the smaller ones at that). Movie soundtracks can't be linked to the article's Soundtrack section (many items => 1 article). Multi-platform video games are currently a mess (1 article => many items).

To start, you'll need to propose a new property. Dispenser (talk) 20:09, 23 May 2016 (UTC)

MCN number import

There are 10,031 identifiers for MCN code (P1987) that can be extracted from [5] or this English version. Many (but not all) items cited are animal taxons, which can be easily machine-read. For the rest, it would be useful if the bot generated a list presenting possible meanings (by comparing the English and Portuguese versions of the xls file with Wikidata language entries). Pikolas (talk) 12:38, 14 August 2015 (UTC)

What's the copyright status of those documents? Sjoerd de Bruin (talk) 13:04, 14 August 2015 (UTC)
It's unclear. I've opened a FOIA request to know under what license those are published. For reference, the protocol number is 52750.000363/2015-51 and can be accessed at Pikolas (talk) 13:40, 14 August 2015 (UTC)
I heard back from them. They have assured me it's under the public domain. How can I prove this to Wikidata? Pikolas (talk) 01:48, 2 October 2015 (UTC)
@Sjoerddebruin: Reopening this thread since I forgot to ping you. NMaia (talk) 15:45, 1 June 2016 (UTC)
Updated links: Portuguese version, English version. NMaia (talk) 19:35, 2 June 2016 (UTC)

Mark items being used in lists as "used" on Wikidata:Requests for deletion

Currently Benebot marks some items as "in use" when there are links on a given item.

As sites such as Wikipedia start using arbitrary access to retrieve information from Wikidata, the above approach doesn't capture what may be key uses for some items.
--- Jura 16:01, 5 June 2016 (UTC)

Lowercase adjectives

It might be worth doing another conversion of lowercase adjectives in descriptions of people, "italian" → "Italian", "british" → "British", etc.
--- Jura 11:51, 22 June 2016 (UTC)
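A minimal sketch of such a conversion, assuming a hand-maintained word list (only the two adjectives named above are included; a real run would need a longer, reviewed list):

```python
import re

# Lowercase form -> corrected form; extend with reviewed entries only.
ADJECTIVES = {"italian": "Italian", "british": "British"}

# Whole-word, case-sensitive match so already-correct descriptions are untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, ADJECTIVES)) + r")\b")

def fix_description(description):
    """Capitalize known nationality adjectives in an English description."""
    return _PATTERN.sub(lambda m: ADJECTIVES[m.group(1)], description)

print(fix_description("italian painter"))
print(fix_description("british politician and italian diplomat"))
```

Because the match is case-sensitive, running it twice on the same description changes nothing the second time.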

replace "country: Antarctic Treaty area" with "country: no value"

Per discussion on Project Chat, we should replace all instances of country (P17): Antarctic Treaty area (Q21590062) with country (P17): no value. I would have done this myself with Autolist/WiDaR, but I couldn't figure out how to get it to add "no value". Kaldari (talk) 18:53, 30 June 2016 (UTC)

Oppose. This was added per discussion at Property talk:P17
--- Jura 18:55, 30 June 2016 (UTC)
@Jura1: I'm aware of that discussion and why it was originally added. "country: Antarctic Treaty area" is an improvement on the various conflicting country claims that were otherwise being used, but per the discussion on Project Chat it is still incorrect, and the best claim to use is actually "country: no value", as locations such as South Pole (Q933) do not belong to any country. Kaldari (talk) 14:50, 2 July 2016 (UTC)
I agree with Kaldari. "Antarctic Treaty area" is not a country. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:00, 4 July 2016 (UTC)
I agree too. If they do not currently have a statement continent (P30) Antarctica (Q51) then the bot should add this at the same time. Thryduulf (talk) 15:20, 4 July 2016 (UTC)
I also agree. Country is "a region that is identified as a distinct entity in political geography" and Antarctic Treaty area (Q21590062) does not fit the definition. — Revi 15:46, 4 July 2016 (UTC)
Added some comments to Property talk:P17! Even if I think "ATA" could fulfil a semantic version of -revi's definition, I have doubts that these statements ever will stop being questioned. It is better to find a consensus that is acceptable, than having a never ending discussion. -- Innocent bystander (talk) 18:39, 4 July 2016 (UTC)

Strunz 8th edition (series ID)

Strunz 8th edition (series ID), Property:P711, after MinDat mainly
Changes are needed: VIII to 8, VII to 7, VI to 6, V to 5, IV to 4, III to 3, II to 2 and I to 1
Thx --Chris.urs-o (talk) 03:25, 3 July 2016 (UTC)
@Chris.urs-o: Has this been discussed anywhere? Matěj Suchánek (talk) 09:48, 12 July 2016 (UTC)
Yes, but no answer (Wikidata_talk:WikiProject_Mineralogy/Properties#Procedures (review)).
Strunz 8 ed uses roman numbers. But this is, and uses arabic numbers. So it is back to original. and Athena adapted it to Nickel-Strunz 9 ed (updated). @Tobias1984: are you ok with it? --Chris.urs-o (talk) 09:54, 12 July 2016 (UTC)
It might be easier to parse the Arabic numbers and then output them as Roman numbers than the other way around? That aside, I don't think there is any standard, so we might as well choose one and do it consistently. --Tobias1984 (talk) 16:16, 12 July 2016 (UTC)
Strunz 8th edition has a slash that parses well (1/A.01-40). --Chris.urs-o (talk) 08:18, 16 August 2016 (UTC)
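For illustration, the requested replacement could be sketched in Python as below. It assumes the stored values have the same shape as the example above, with a Roman class number before the slash (e.g. a hypothetical "VIII/A.01-40"), and it covers only the eight replacements listed in the request.

```python
# Only the replacements listed in the request (VIII to 8 ... I to 1).
ROMAN_TO_ARABIC = {
    "I": 1, "II": 2, "III": 3, "IV": 4,
    "V": 5, "VI": 6, "VII": 7, "VIII": 8,
}

def strunz_to_arabic(series_id):
    """Replace a Roman class number before the first slash with its Arabic form."""
    prefix, slash, rest = series_id.partition("/")
    if slash and prefix in ROMAN_TO_ARABIC:
        return f"{ROMAN_TO_ARABIC[prefix]}/{rest}"
    return series_id  # already Arabic, or not in the expected shape

print(strunz_to_arabic("VIII/A.01-40"))
print(strunz_to_arabic("1/A.01-40"))
```

Looking up the whole prefix avoids the ordering problem of plain text replacement, where "I to 1" would otherwise also hit the I's inside "VIII".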

ARKive IDs, etc

I have a 16,000+ line CSV file, with each line having values for the ARKive ID, scientific name, common name and (where known) IUCN ID of an endangered species. A typical entry looks like:

perinet-chameleon/calumma-gastrotaenia,Calumma gastrotaenia,Perinet chameleon,172774

although (to reiterate) some lines have no IUCN ID.

Please can someone import this data, chiefly the ARKive IDs (sample edits: [6], [7]), by matching with the IUCN ID and/ or scientific name, and create a second file with each line that isn't matched, for manual attention?

I can email the file to whoever can do that. I am anticipating a second file, for other taxon ranks, shortly.

If there's a tool that I can use to do this, please let me know. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:07, 5 July 2016 (UTC)
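As a rough sketch of the matching step, assuming the column order of the sample line above (ARKive ID, scientific name, common name, IUCN ID) and two lookup tables that would have to be built from Wikidata beforehand (both hypothetical here, as is the placeholder item ID):

```python
import csv
import io

def match_rows(csv_text, items_by_iucn_id, items_by_name):
    """Return (matched, unmatched).

    matched holds (item, ARKive ID) pairs; unmatched holds the raw rows
    that need manual attention.
    """
    matched, unmatched = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        row = (row + [""] * 4)[:4]  # tolerate a missing trailing IUCN ID
        arkive_id, scientific_name, _common_name, iucn_id = row
        # Prefer the IUCN ID, fall back to the scientific name.
        item = items_by_iucn_id.get(iucn_id) or items_by_name.get(scientific_name)
        if item:
            matched.append((item, arkive_id))
        else:
            unmatched.append(row)
    return matched, unmatched

sample = (
    "perinet-chameleon/calumma-gastrotaenia,Calumma gastrotaenia,Perinet chameleon,172774\n"
    "example/unknown-species,Unknown species,Example,\n"
)
matched, unmatched = match_rows(sample, {"172774": "Q_EXAMPLE"}, {})
print(matched)
print(unmatched)
```

The unmatched rows can be written back out with `csv.writer` to produce the second file for manual review.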

see also. --Succu (talk) 20:58, 5 July 2016 (UTC)
That has no relevance to this request whatsoever. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:36, 6 July 2016 (UTC)
I can import it into Mix'n'match if you like, just give me the file/URL. --Magnus Manske (talk) 18:11, 7 July 2016 (UTC)
Can I take a raincheck? I was hoping for a direct import, if possible. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:31, 7 July 2016 (UTC)
Magnus, you are recreating all sorts of errors we fixed in the past: spelling errors, invalid IUCN IDs, ... --Succu (talk) 15:13, 8 July 2016 (UTC)
I'm uploading Andy's data. He says he's watching the progress, fixing the errors. 30K done, ~8K to go. I can abort if need be. --Magnus Manske (talk) 15:19, 8 July 2016 (UTC)
Please do not. Succu provides no detail and no example of errors. He has, though, been removing cited data with no explanation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:06, 8 July 2016 (UTC)
invalid IUCN id. I think my bot can fix both errors. I updated the IUCN data according to the new list The IUCN Red List of Threatened Species 2016.1 (Q25354282) yesterday. --Succu (talk) 16:33, 8 July 2016 (UTC)
A highly opaque response, but nonetheless it's a valid IUCN ID, albeit one which has expired - it's still useful and appropriate to include it in Wikidata. It seems you knew about this practice, but didn't bother to mention it when you commented here previously, nor today. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:59, 8 July 2016 (UTC)
It's a well known fact (at least since 2012) and you didn't ask for other potential problems. BTW: IUCN-ID (P627) was intended to be used as the source of IUCN conservation status (P141) and used that way until today. --Succu (talk) 19:07, 8 July 2016 (UTC)
This one is the result of a bad datasource. --Succu (talk) 21:49, 10 July 2016 (UTC)
Since the IDs reported are valid (example: 106003040) it appears to be the result of badly-written constraints. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:45, 11 July 2016 (UTC)
The highest IUCN ID in the current list is 97357437. According to the IUCN, all other IDs are no longer available and must not be used. --Succu (talk) 15:50, 11 July 2016 (UTC)
Yes; in the current list. And IUCN is not the boss of Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:35, 11 July 2016 (UTC)
We (as Wikidata) should respect their Terms of Use. But this is only loosely related to the low quality of your dataset, Mr. Mabbett. Erica bolusiae (Q25695255) = Erica balusiae is only one example of a misspelling. Presumably (Erica bolusiae (Q3056462)) is meant. I corrected other misspellings before. I doubt you are "fixing the errors" as Magnus suggested. Doing this is hard, time-consuming labour. Nothing that could be solved by assuming AGF or edit warring. If you want to help, start here. --Succu (talk) 20:40, 11 July 2016 (UTC)

Get GeoNames ID from the Cebuano or Swedish Wikipedia

Currently there are many concepts such as no label (Q22564260) that refer to geographical features that have articles in the Cebuano and Swedish Wikipedia. For most of them there's an infobox with information at the respective Wikipedia, but not all of the information is available in Wikidata. I would propose that the information gets copied over by a bot. There are too many articles to copy information manually. Especially the GeoNames ID should be easy to copy automatically. ChristianKl (talk) 15:52, 6 July 2016 (UTC)

Be very, very careful! The GeoNames IDs that have been added here before, based on the Wikipedia links in the GeoNames database, are very often wrong! Copying the GeoNames IDs from the sv/ceb articles is a good start! We can then detect mismatches between Wikidata and GeoNames. Other kinds of information can thereafter be collected directly from GeoNames. But even that data is often wrong. For example, large parts of the Faroe Islands (Q4628) in GeoNames are located on the bottom of the Atlantic. -- Innocent bystander (talk) 16:26, 12 July 2016 (UTC)
@Innocent bystander: Note: I did import few thousands of Geonames IDs some few weeks ago. Can't say, how many are left there. If svwiki had some tracking category, that would be helpful :) --Edgars2007 (talk) 17:18, 31 July 2016 (UTC)
@Edgars2007: I'll see what I can do (tomorrow). One issue here is that a tracking-category cannot separate the Lsjbot-articles from the others. -- Innocent bystander (talk) 18:56, 31 July 2016 (UTC)
@Innocent bystander: To clarify, I'm only asking about a category for the Geonames parameter, not about others. I don't see any reason why this fact (who created the article) is relevant in this situation. If needed, that can be obtained with a database query. --Edgars2007 (talk) 19:43, 31 July 2016 (UTC)
@Edgars2007: I intend to create (at least) two categories. One for when P1566 is missing here and one for when WD and WP do not agree about the geonames-id. A third potential category could be used to detect when there is a geonames-parameter in WP and it matches P1566. In such cases, the parameter could be removed from WP. -- Innocent bystander (talk) 05:25, 1 August 2016 (UTC)
@Edgars2007: ✓ Done Category:Wikipeda:Articles with a geonames-parameter but without P1566 at Wikidata (Q26205593)! It will take some time until the category is completely filled with related articles. It will also take some time after you have added the property here, until the category is removed on svwiki. -- Innocent bystander (talk) 07:01, 1 August 2016 (UTC)
The category is now filled with almost 250000 pages. A category for the cases where WD and svwp contradict each other has ~4000 members. -- Innocent bystander (talk) 07:10, 2 August 2016 (UTC)
Yesterday evening that was some 300 pages (for the first category) :D --Edgars2007 (talk) 07:17, 2 August 2016 (UTC)
@Edgars2007: Any progress? Lsjbot is halted for some more time, so there is a possibility to catch up with hir! I am daily sorting out some of the more complicated constraints-problems and other problems reported on svwiki. -- Innocent bystander (talk) 06:37, 21 August 2016 (UTC)
@Innocent bystander: I haven't forgotten about you. Yes, I haven't had (much) time to do this yet, but will try to clean up the category. --Edgars2007 (talk) 07:38, 21 August 2016 (UTC)

Import vernacular names from Wikispecies

Wikispecies stores vernacular names for taxons, using a template, species:Template:VN. For example, on species:Passer_domesticus, the template markup includes:

{{VN |af=Huismossie |als=Spatz, Schbads |ar=عصفور دوري |de=Haussperling |en=House Sparrow

(and many more entries besides). Note that Spatz, Schbads represents two names, separated by a comma.

We need to have these names imported to taxon common name (P1843), avoiding duplication of existing values, and with the language codes. Assuming the latter is technically possible, can someone do this, please? We can then work on converting the template to pull its data from Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:49, 7 July 2016 (UTC)
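The parsing side of this could be sketched as follows. This is a naive illustration that assumes the flat `|lang=name, name` layout shown in the excerpt above; it does not handle nested templates or wikilinks inside values, and the function names are invented for the example.

```python
def parse_vn(markup):
    """Parse a {{VN ...}} call into {language code: [names]}."""
    body = markup.strip()
    if body.startswith("{{"):
        body = body[2:]
    if body.endswith("}}"):
        body = body[:-2]
    names = {}
    for part in body.split("|")[1:]:  # the first chunk is the template name
        lang, equals, value = part.partition("=")
        if not equals:
            continue
        # A comma separates several names in the same language.
        names[lang.strip()] = [n.strip() for n in value.split(",") if n.strip()]
    return names

def missing_names(parsed, existing):
    """Yield (language, name) pairs not already present as values on the item."""
    for lang, values in parsed.items():
        for name in values:
            if (lang, name) not in existing:
                yield lang, name

markup = "{{VN |af=Huismossie |als=Spatz, Schbads |de=Haussperling |en=House Sparrow}}"
parsed = parse_vn(markup)
print(parsed["als"])
print(list(missing_names(parsed, {("en", "House Sparrow")})))
```

Passing the item's existing taxon common name (P1843) values as `existing` is what avoids the duplication mentioned in the request.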

Info: There is a related discussion at Wikispecies: Moving vernacular names to Wikidata. --Succu (talk) 11:32, 18 July 2016 (UTC)

Change Q19812227 on Q19002866

Mass property changing needed. Dmitri (Q19812227) is essentially a disambiguation page, whereas a lot of items point to it as a male given name. The correct link is Dmitry (Q19002866). Artem Korzhimanov (talk) 13:32, 15 July 2016 (UTC)

Dmitri and Dmitry are different names. --Stryn (talk) 15:15, 15 July 2016 (UTC)
Dmitri (Q19812227) was created to be about the given name. So it's better to move the Russian sitelink than to change all the links. --Pasleim (talk) 15:38, 15 July 2016 (UTC)
Well, then there is question where is an item for a disambiguation page w:ru:Дмитрий (значения), because for the given name there exists a separate page w:ru:Дмитрий. Another question, what is the purpose of Dmitriy (Q21694209)? Artem Korzhimanov (talk) 18:52, 16 July 2016 (UTC)
Artem, what is the purpose of w:ru:Дмитрий (значения)? List can be included in the main article, or it can be named "List of people named..." - format is vague. --Infovarius (talk) 22:47, 16 July 2016 (UTC)
Definitely, it could. But it's not. At least, in Russian Wikipedia. So we need an item for this disambiguation page. As well as we need a separate item for the name itself. Artem Korzhimanov (talk) 01:28, 17 July 2016 (UTC)

Updating population of US towns

Hello, I was wondering if a bot can be used to update population estimates in the U.S. for 2015. I think a good source of information is here. It is a government website. Is this feasible? MechQuester (talk) 06:09, 26 July 2016 (UTC)

Import Template:Bio from itwiki

To avoid the gap getting too big, it might be worth doing another import. There is a series of steps outlined in Help:Import Template:Bio from itwiki.
--- Jura 17:08, 29 July 2016 (UTC)

Labels in English for items whose names begin with "The"

Items with a label in English beginning with "The " (note space; case insensitive), like "The Dark Side of the Moon", should have an alias in the form "Dark Side of the Moon" and/or "Dark Side of the Moon, The", if one does not already exist. Can someone do this, please? I'm ambivalent as to which of the varieties, or both, is used. This also applies to labels in types of English, such as en-GB. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:52, 4 August 2016 (UTC)
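A small sketch of how the two alias forms could be derived from a label (the function name is invented; checking whether an alias already exists on the item is left to the caller):

```python
def the_aliases(label):
    """Return both alias forms for a label starting with 'The ' (case-insensitive)."""
    if label[:4].lower() != "the ":
        return []
    article, rest = label[:3], label[4:].strip()
    if not rest:
        return []
    return [rest, f"{rest}, {article}"]

print(the_aliases("The Dark Side of the Moon"))
print(the_aliases("Thebes"))
```

Requiring the trailing space in the prefix is what keeps labels such as "Thebes" or "Theory" from being touched.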

Category:Commons category without a link on Wikidata

Hello. Please connect the sub-pages of this category with items (AWB should be used). Thank you --ديفيد عادل وهبة خليل 2 (talk) 13:01, 6 August 2016 (UTC)

I would like to see, how it could be done with AWB ;) --Edgars2007 (talk) 23:57, 9 August 2016 (UTC)
AWB should be used if possible. It is generally useful. --ديفيد عادل وهبة خليل 2 (talk) 08:50, 10 August 2016 (UTC)
That isn't an answer to his question. Sjoerd de Bruin (talk) 09:05, 10 August 2016 (UTC)
Personally, I don't understand this request. What is meant with a "sub-page"? AFAIK, AWB doesn't work on Wikidata, so if the bot should work on another wiki, this request does not belong here. Matěj Suchánek (talk) 14:36, 10 August 2016 (UTC)
For clarity: Special:Diff/362948011. The request is to simply clean-up Category:Commons category without a link on Wikidata (Q11925744). Did import few ten thousands weeks or months ago from enwiki, but it wasn't enough - there are a lot more. --Edgars2007 (talk) 14:47, 10 August 2016 (UTC)
Maybe the time has come to convert this to normal sitelinks.
--- Jura 14:56, 10 August 2016 (UTC)

CWGC cemetery IDs

Many instances of en:Template:CWGC cemetery are used in infoboxes, like on en:Doiran Memorial. Please can someone extract the IDs from them, and add them to the equivalent items here, using CWGC burial ground ID (P1920)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:03, 6 August 2016 (UTC)

Can't this be done with HarvestTemplates? --Pasleim (talk) 23:10, 10 August 2016 (UTC)

Zerozero footballer IDs & others in refs

It seems that many of en.Wikipedia's uses of en:Template:Zerozero are in <ref></ref> tags. Does anyone have a bot that can compare the subject of the target page, and add matches using footballzz ID (P3047)? And similar cases? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:14, 9 August 2016 (UTC)


-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:30, 11 August 2016 (UTC)

Replace incorrect usage of P1896 with P854

I've noticed that people have been using source website for the property (P1896) instead of reference URL (P854), easy confusion. Can someone fix this? It also shows up as qualifier suggestion for work location (P937), so there must be something really wrong on that area then. Sjoerd de Bruin (talk) 21:33, 10 August 2016 (UTC)

I'm not sure if this couldn't be misunderstood. Frequently values added come only from Wikipedia and not from a "source website". We could use a property to link to these source categories at Wikipedia.
--- Jura 10:05, 11 August 2016 (UTC)
Actually, this should be changed on items, e.g. Q19455277
--- Jura 10:37, 14 August 2016 (UTC)
I think it was Sjoerddebruin's intention to change it on items. On 2015 Iditarod (Q19455277) I have now replaced source website for the property (P1896) with reference URL (P854). Is it okay if I continue with other items? --Pasleim (talk) 10:40, 14 August 2016 (UTC)
Sure, please do. Initially I had misunderstood his request. The use might come from the fact that before May 2016 "URL de référence" led to this property only and the other could only be found with "URL de la référence" (in French).
--- Jura 10:52, 14 August 2016 (UTC)
✓ Done including the misuse on work location (P937). There are some more claims where source website for the property (P1896) is used as qualifier, see Wikidata:Database_reports/Constraint_violations/P1896#.22Value_only.22_violations --Pasleim (talk) 12:37, 14 August 2016 (UTC)
If the remaining uses of this property on items could be cleaned-up, we could attempt to add a filter that blocks further use. Alternatively, the uses on properties could be replaced with P854 as well and the property done away with.
--- Jura 13:38, 14 August 2016 (UTC)

Remove incorrect svwiki sitelinks

Many svwiki articles about places of China created by Lsjbot have incorrect zhwiki sitelinks, which are imported to Wikidata. Some examples I have fixed manually are:

  1. sv:Gaohu (köpinghuvudort i Kina, Jiangxi Sheng, lat 28,93, long 115,24), incorrectly links to no label (Q11671865) (a person), correct item is Gaohu (Q2656384) (a town)
  2. sv:Yushan (köpinghuvudort i Kina, Chongqing Shi, lat 29,53, long 108,43), incorrectly links to no label (Q22079207) (a person), correct item is no label (Q13714765) (a town)
  3. sv:Hongfenghu, incorrectly links to no label (Q15928338) (a lake), correct item is no label (Q14143028) (a town). I have added the correct svwiki link to the former article
  4. sv:Bianhe (köping i Kina, Anhui), incorrectly links to no label (Q11137293) (a disambiguation page), correct item is no label (Q11137300) (a subdistrict, formerly a town)
  5. sv:Chenyaohu, incorrectly links to no label (Q16935572) (a lake), correct item is no label (Q14343855) (a town)

Request to:

  1. Remove all svwiki sitelinks and Swedish labels in (all are errors)
  2. Remove all svwiki sitelinks and Swedish labels in (all are errors)
  3. Remove all svwiki sitelinks and Swedish labels in (all are errors)
  4. Remove all svwiki sitelinks and Swedish labels in sv:Kategori:Robotskapade Kinaartiklar if the distance between the Wikidata P625 value and the coordinate in svwiki is greater than 100 km

--GZWDer (talk) 15:43, 11 August 2016 (UTC)
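The distance check in point 4 could be sketched with the haversine formula, as below. This is only an illustration: it treats the Earth as a sphere of radius 6371 km, and the function names and the way coordinates are passed in are assumptions rather than part of any existing bot.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_suspect(wikidata_coord, svwiki_coord, threshold_km=100.0):
    """True if the Wikidata P625 value and the svwiki coordinate are too far apart."""
    return haversine_km(*wikidata_coord, *svwiki_coord) > threshold_km

# svwiki coordinate taken from the Gaohu article title above; the second point
# is a made-up Wikidata value one degree of latitude further north.
print(is_suspect((29.93, 115.24), (28.93, 115.24)))
```

One degree of latitude is roughly 111 km, so the made-up pair above falls just outside the 100 km threshold.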

Note: Lsjbot has not completed all articles about places in China. This should be done again when it's completed.--GZWDer (talk) 15:47, 11 August 2016 (UTC)
Looks like Chinese to me. Maybe @Innocent bystander: can explain it in Swedish to Lsj.
--- Jura 16:01, 11 August 2016 (UTC)
@Lsj: most likely speaks/writes better English than me.
I think Lsjbot has finished with the People's Republic of China. (The nations have been edited in alphabetical order by ISO code, with a few exceptions for nations that were requested (e.g. Syria) or were part of a benchmark (e.g. South Sudan).) This is a known bug, but it has been hard to undo all the mistakes. It happened when places in China were interwiki-linked to articles on zh-wiki with the same "label" as the Chinese places in GeoNames. So the Swedish labels are not necessarily wrong in these cases, even if the sitelinks are. -- Innocent bystander (talk) 16:46, 11 August 2016 (UTC)
There is still a red link in sv:Bianhe. For labels: zhwiki articles always use the full name, not short names; in addition, the full name may not be unique, in which case disambiguation pages are needed.--GZWDer (talk) 17:18, 11 August 2016 (UTC)
I ping @Bothnia: who is skilled in both East Asian and Swedish. My knowledge in Chinese is extremely limited! -- Innocent bystander (talk) 18:29, 11 August 2016 (UTC)

United Kingdom properties

Please could someone add:

{{Country-related property|Q145}}

to the talk pages of all the properties shown in {{United Kingdom properties}} (except National Library of Ireland authority (P1946)) ? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:49, 12 August 2016 (UTC)

Likewise for {{Australia properties}}; using {{Country-related property|Q408}}, please. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:22, 14 August 2016 (UTC)


SIMBAD ID (P3083) cannot be collected from en.Wikipedia using HarvestTemplates, as most of the values are in subtemplates, such as en:Template:Planetbox reference, which builds them by concatenating two other values. Can anyone fetch them, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:34, 13 August 2016 (UTC)

Mayors of Spain

It would be great to carry out the following task.

for each Wikidata item A
    instance of (P31) = municipality of Spain (Q2074737) and
    country (P17) = Spain (Q29) and
        not exists office held by head of government (P1313) or
        office held by head of government (P1313) = mayor (Q30185) or
        office held by head of government (P1313) = alcalde (Q5663900)
        there is no Wikidata item labeled as "alcalde de $A" in Spanish and
        there is no Wikidata item labeled as "mayor of $A" in English
        create a new item B
            label in Spanish := "alcalde de $A"
            label in English := "mayor of $A"
            description in Spanish := "cargo político"
            subclass of (P279) := alcalde (Q5663900)
            applies to jurisdiction (P1001) := A
            country (P17) := Spain (Q29)
            instance of (P31) := position (Q4164871)
        edit item A
            office held by head of government (P1313) := B

Any improvements are also appreciated. Thanks in advance, and best regards, --abián 15:45, 16 August 2016 (UTC)


for each Wikidata item A in <>
        there is no Wikidata item labeled as "alcalde de $A" in Spanish and
        there is no Wikidata item labeled as "mayor of $A" in English
        create a new item B
            label in Spanish := "alcalde de $A"
            label in English := "mayor of $A"
            description in Spanish := "cargo político"
            subclass of (P279) := alcalde (Q5663900)
            applies to jurisdiction (P1001) := A
            country (P17) := Spain (Q29)
            instance of (P31) := position (Q4164871)
        edit item A
            office held by head of government (P1313) := B

Thanks again. --abián 22:28, 23 August 2016 (UTC)

  • The other day I did something like this with QuickStatements (Q20084080). It takes a bit of preparation, but it's doable.
    --- Jura 22:34, 23 August 2016 (UTC)
    • Thanks, Jura! I'll try. Anyway, I don't know how to check this...

        there is no Wikidata item labeled as "alcalde de $A" in Spanish and
        there is no Wikidata item labeled as "mayor of $A" in English

Do you know any tool that lists all the Wikidata items starting by a prefix? This part tries to avoid creating duplicates but, if it's not possible to list these items, perhaps it's better to ignore this restriction and check for duplicates later. What do you think? --abián 23:09, 23 August 2016 (UTC)
It's possible to do that on . After a summary check of [8] and [9], I'd use the assumption that if such an item exists, it's linked to/from the item of the city.
--- Jura 23:25, 23 August 2016 (UTC)
Job in process... --abián 12:37, 24 August 2016 (UTC)
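For reference, the item-creation part of the pseudocode above could be turned into a QuickStatements batch along these lines. This is a sketch, not the job that was actually run: it assumes the tab-separated QuickStatements command format (CREATE, LAST, and the Les/Len/Des label and description columns), and the municipality used in the example is illustrative only. Setting office held by head of government (P1313) on the municipality needs the ID of the newly created item, so it is left for a second pass.

```python
def mayor_item_commands(municipality_qid, name_es, name_en):
    """Build QuickStatements lines that create the 'mayor of X' item for one municipality."""
    return [
        "CREATE",
        f'LAST\tLes\t"alcalde de {name_es}"',
        f'LAST\tLen\t"mayor of {name_en}"',
        'LAST\tDes\t"cargo político"',
        "LAST\tP279\tQ5663900",             # subclass of: alcalde
        f"LAST\tP1001\t{municipality_qid}",  # applies to jurisdiction: the municipality
        "LAST\tP17\tQ29",                   # country: Spain
        "LAST\tP31\tQ4164871",              # instance of: position
    ]

# Illustrative input; in practice the list would come from a query for
# municipalities that have no suitable P1313 statement yet.
batch = mayor_item_commands("Q2807", "Madrid", "Madrid")
print("\n".join(batch))
```

Whether a matching item already exists still has to be checked before the batch is submitted, as discussed above.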

Remove P2273 as a qualifier

According to the entity suggester, Heidelberg Academy for Sciences and Humanities member ID (P2273) is being used a lot as qualifier for member of (P463). I don't think the property should be used as that. Sjoerd de Bruin (talk) 16:24, 16 August 2016 (UTC)

Yes, it should be used only as either a source or directly on the item's page. I'm not inclined to agree with its use as a source, and it should at least be used as a statement. --Izno (talk) 16:30, 16 August 2016 (UTC)
User:Laddo added most of these qualifiers. Any comments? --Pasleim (talk) 16:34, 16 August 2016 (UTC)
No strong opinion about this; I guess at that time, I thought it was a good idea to prove that the person is indeed a member of (P463) = Heidelberg Academy for Sciences and Humanities (Q833738) by qualifying the statement with the corresponding Heidelberg Academy for Sciences and Humanities member ID (P2273). On second look, and from a data perspective, it sounds reasonable to rather have that very same statement as a source. -- LaddΩ chat ;) 22:13, 16 August 2016 (UTC)
So we can either move the qualifier to the source section or remove it completely, as it is already available as an external identifier. Sjoerd de Bruin (talk) 13:35, 24 August 2016 (UTC)
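
For whoever picks this up, the "move the qualifier to the source section" option could look roughly like this on the JSON of a single statement. It is a sketch under assumptions: it works on a plain dict shaped like the Wikibase JSON for a claim (qualifiers, qualifiers-order, references), the function name move_qualifier_to_reference is made up, and saving the result back to Wikidata is not shown.

```python
import copy


def move_qualifier_to_reference(claim, pid="P2273"):
    """Return a copy of `claim` with qualifier `pid` moved into a new reference.

    `claim` is assumed to be a dict in the Wikibase JSON claim layout.
    The input is not modified.
    """
    new_claim = copy.deepcopy(claim)
    snaks = new_claim.get("qualifiers", {}).pop(pid, None)
    if not snaks:
        return new_claim  # qualifier not present: nothing to move

    # Keep the qualifier ordering list consistent with the qualifiers dict.
    order = new_claim.get("qualifiers-order", [])
    if pid in order:
        order.remove(pid)

    # Append the former qualifier snaks as a separate reference block.
    new_claim.setdefault("references", []).append(
        {"snaks": {pid: snaks}, "snaks-order": [pid]}
    )
    return new_claim
```

Dropping the qualifier completely, the other option mentioned above, would be the same function without the last step.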

Redundant P1343 for DNB00

A lot of items contain two described by source (P1343) statements for the exact same article. One links the article directly, and one uses Dictionary of National Biography (1885-1900) (Q15987216) as value with the article as qualifier. I think the latter is the correct way of linking these, thus the redundant statements should be removed. See this item for an example. Sjoerd de Bruin (talk) 19:17, 17 August 2016 (UTC)

Wikidata:WikiProject DNB recommends the other. Using Q15987216 would be redundant as it's present in the linked item.
--- Jura 05:47, 18 August 2016 (UTC)
@Sjoerddebruin: I also prefer to link directly to the article, as recommended in Wikidata:WikiProject DNB#Examples. Is it okay for you if I remove the statement with the qualifier? --Pasleim (talk) 12:47, 24 August 2016 (UTC)
Please do, everything is better than this duplicated stuff. Sjoerd de Bruin (talk) 13:33, 24 August 2016 (UTC)
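
A rough way to pick out the statements to remove, given the described by source (P1343) claims of one item, might be the following. It is only a sketch: it assumes the claims are dicts in the Wikibase JSON layout, the helper names are hypothetical, and since the qualifier property is not named above it simply looks at every item-valued qualifier.

```python
DNB_QID = "Q15987216"  # Dictionary of National Biography (1885-1900)


def _item_id(snak):
    """Return the item ID held by a snak, or None if it holds no item."""
    value = snak.get("datavalue", {}).get("value")
    return value.get("id") if isinstance(value, dict) else None


def redundant_dnb_claims(p1343_claims):
    """Return the claims that point at the DNB as a whole and carry, as a
    qualifier, an article that another claim already links directly."""
    direct = {_item_id(c["mainsnak"]) for c in p1343_claims} - {DNB_QID, None}
    redundant = []
    for claim in p1343_claims:
        if _item_id(claim["mainsnak"]) != DNB_QID:
            continue  # this is a direct link to an article: keep it
        qualifier_items = {
            _item_id(snak)
            for snaks in claim.get("qualifiers", {}).values()
            for snak in snaks
        }
        if qualifier_items & direct:
            redundant.append(claim)
    return redundant
```

Claims with Q15987216 as value whose qualifier article is not linked directly elsewhere are deliberately left alone, so no information is lost.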


We have lots of items for places, which are instances of hundred (Q313354), in various countries.

We should split these up, by country, with instance of "Hundred (England and Wales)", "Hundred (Sweden)" and so on. I have created the new "hundred" items, can anyone oblige with the rest, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:14, 23 August 2016 (UTC)

VIAF import

Please see Property_talk:P214#Import_.3F.
--- Jura 05:09, 25 August 2016 (UTC)

Revert label additions by Edoderoobot in the beginning of May

In the beginning of May, Edoderoobot (talk • contribs • logs) copied a lot of labels from other languages like here and here. You can clearly see that these aren't acceptable labels in Dutch. I've asked the bot operator multiple times to clean this up, but they are still there. Can someone help me? Sjoerd de Bruin (talk) 07:17, 25 August 2016 (UTC)

SELECT ?item WHERE {
  	?item wdt:P31 wd:Q13406463 .
	?item rdfs:label ?labelnl FILTER(lang(?labelnl)="nl")
  	?item rdfs:label ?labelen FILTER(lang(?labelen)="en" && str(?labelnl) = str(?labelen) )
}


Above is a list of (all) NL labels that are identical to the EN label (4698). You could use QuickStatements to delete the label for some or all of them (or replace it).
--- Jura 07:36, 25 August 2016 (UTC)

I would not be bothered if they were all cleared. I have now filtered for which items/instance of (P31) it makes sense to take over the English description (right now items like human (Q5)), so if any are deleted in excess I can re-do them with my (repaired) bot script. But I will have a look myself to see if this SPARQL script can automate a repair action. This might be the opening I needed to get it fixed myself. Edoderoo (talk) 08:21, 25 August 2016 (UTC)
I created a repair script based on the above SPARQL query... will run it tomorrow, as right now another script isn't finished yet. Please be advised that there might be more P31 types, but those can be fixed with the same script. Most likely Sjoerd will keep in contact with me about those, but feel free to contact me in case someone finds another case. Once more thanks to Jura for this helpful SPARQL script! Edoderoo (talk) 13:24, 25 August 2016 (UTC)
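
For other P31 types, the query above could be run from a script along these lines. This is a hedged sketch and not Edoderoo's repair script: the function names, the User-Agent string and the parameter defaults are assumptions, and it only collects the item IDs from the public query service endpoint; removing or replacing the labels is a separate step.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(class_qid="Q13406463", lang_a="nl", lang_b="en"):
    """Build the query from this thread for any class and language pair."""
    return f"""
SELECT ?item WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  ?item rdfs:label ?la . FILTER(lang(?la) = "{lang_a}")
  ?item rdfs:label ?lb . FILTER(lang(?lb) = "{lang_b}" && str(?la) = str(?lb))
}}
""".strip()


def item_ids(results):
    """Extract bare Q-ids from a SPARQL JSON results document."""
    bindings = results["results"]["bindings"]
    return [b["item"]["value"].rsplit("/", 1)[-1] for b in bindings]


def fetch_items(query):
    """Send the query to the endpoint and return the matching Q-ids."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # The User-Agent value below is a placeholder, not a registered tool name.
    request = urllib.request.Request(
        url, headers={"User-Agent": "label-cleanup-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return item_ids(json.load(response))
```

fetch_items(build_query()) would return the items behind the list above; passing another class ID to build_query covers further P31 types.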

member of (P463) > P39 qualifier parliamentary term (P2937)

Some time ago the qualifier P2937 was created. It should allow linking the parliamentary term from the statement in P39. Many items still have the information in P463. The difficulty is finding the P39 statement that goes with it. Sample change: [10].
--- Jura 07:17, 25 August 2016 (UTC)