Shortcuts: WD:RFBOT, WD:BRFA, WD:RFP/BOT

Wikidata:Requests for permissions/Bot

To request a bot flag, or approval for a new task, in accordance with the bot approval process, please input your bot's name into the box below, followed by the task number if your bot is already approved for other tasks.


Old requests go to the archive.

Once consensus is obtained in favor of granting the botflag, please post requests at the bureaucrats' noticeboard.

Bot Name Request created Last editor Last edited
AliciaFagervingWMSE-bot 3 2017-03-20, 08:36:27 Ymblanter 2017-03-22, 20:06:42
Emijrpbot 6 2017-03-18, 17:38:27 Ymblanter 2017-03-20, 21:44:42
PokestarFanBot 2017-03-16, 00:34:47 MechQuester 2017-03-17, 03:55:47
MechQuesterBot 2 2017-03-15, 01:45:06 Ymblanter 2017-03-20, 21:43:34
AliciaFagervingWMSE-bot 2 2017-03-15, 15:16:01 Ymblanter 2017-03-19, 10:01:24
Emijrpbot 5 2017-03-11, 23:45:07 Ymblanter 2017-03-19, 09:59:44
Emijrpbot 4 2017-03-09, 12:31:07 Ymblanter 2017-03-19, 09:58:44
RaymondYeeBot 2017-03-11, 23:38:25 Ymblanter 2017-03-18, 23:06:44
ZacheBot 2017-03-04, 23:29:38 Ymblanter 2017-03-14, 21:20:39
JhealdBatch 2017-03-02, 19:08:25 Ymblanter 2017-03-14, 21:17:17
Mr.Ibrahembot 2 2017-02-28, 17:07:38 Ymblanter 2017-03-05, 00:01:31
НСБот 2017-02-24, 12:12:11 Ymblanter 2017-03-03, 08:48:41
MechQuesterBot 1 2017-02-26, 22:31:51 Ymblanter 2017-03-14, 21:16:18
YULbot 2017-02-21, 18:05:13 Ymblanter 2017-03-14, 21:12:25
AliciaFagervingWMSE-bot 2017-02-14, 13:15:02 Ymblanter 2017-03-02, 07:49:38
Emijrpbot 3 2017-02-12, 11:12:07 Ymblanter 2017-02-17, 20:39:15
JarBot 2 2017-02-12, 04:57:28 Ymblanter 2017-02-23, 04:52:25
Mr.Ibrahembot 2017-02-11, 19:02:51 Ymblanter 2017-02-26, 20:25:12
JayWackerBot 2017-02-09, 17:26:47 JayWacker 2017-03-01, 18:18:50
Emijrpbot 2 2017-02-07, 14:35:11 Ymblanter 2017-02-14, 19:40:43
YBot 2017-01-12, 16:43:19 Pasleim 2017-01-15, 19:26:39
EaasServiceBot 2017-01-10, 15:09:13 Ymblanter 2017-03-05, 00:00:16
DiscogsBot 2016-12-12, 11:32:55 Pasleim 2017-01-15, 19:52:08
DoctorBot 2016-11-27, 03:01:24 DoctorBud 2016-12-21, 00:51:03
QuickStatementsBot 2016-11-30, 14:16:53 Ymblanter 2017-02-13, 20:01:42
HannolansBot 2016-11-23, 09:37:08 Ymblanter 2017-02-27, 21:03:34
WikiLovesESBot 2016-07-03, 10:25:13 Jura1 2016-08-26, 08:42:53
MatSuBot 6 2016-07-01, 19:12:23 Ymblanter 2016-07-05, 14:47:41
1-Byte-Bot 2016-03-02, 15:23:09 1-Byte 2016-03-03, 08:58:36
Hkn-bot 2016-01-16, 18:52:00 Pasleim 2017-01-15, 20:02:36
RollBot 2016-01-14, 17:01:42 Alphos 2017-01-19, 13:58:00
Dexbot 11 2015-04-07, 18:15:00 Ladsgroup 2017-01-05, 19:12:57
KunMilanoRobot 2014-01-21, 19:27:44 Alphama 2016-06-21, 18:15:06
AviBot 2016-05-17, 21:29:54 Offthewoll 2016-05-17, 21:29:54
fabot 2017-02-12, 19:51:38 Vogone 2017-02-12, 19:51:38
florentyna 2017-02-12, 19:50:21 Vogone 2017-02-12, 19:50:21
InteliBOT 2015-01-16, 20:14:11 Vogone 2017-02-12, 19:49:42
Mahirbot 2016-02-25, 04:22:59 Vogone 2016-02-25, 04:23:52
pywikibot 2017-02-12, 19:47:50 Vogone 2017-02-12, 19:47:50
ravenXBot2 2017-02-12, 19:46:32 Vogone 2017-02-12, 19:46:32
SaamDataImportBot 2016-04-20, 18:45:43 160.111.254.17 2016-05-02, 14:17:01
welvon-bot 2017-02-12, 19:47:23 Vogone 2017-02-12, 19:47:23

PokestarFanBot

Operator: PokestarFan

Task/s: Welcomes users who have done non-userspace edits

Code: GitHub

Function details: Finds a list of users who have done 1 edit in last 30 days. Looks to see if the talk page is created. Looks to see if the edit is in userspace. If not, creates the talk page with a welcome. --PokestarFanBot (talk) 00:34, 16 March 2017 (UTC)
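
The checks described in the function details could be sketched as a single predicate. This is only an illustration and not the code in the linked repository: `UserActivity` and `should_welcome` are hypothetical names, "1 edit in last 30 days" is read here as "at least one edit", and the only MediaWiki fact relied on is that the User namespace has number 2.

```python
from dataclasses import dataclass, field

USER_NAMESPACE = 2  # MediaWiki namespace number for "User:" pages


@dataclass
class UserActivity:
    """Hypothetical summary of one account's recent activity."""
    edits_last_30_days: int
    talk_page_exists: bool
    # Namespace number of each recent edit (assumed to be available).
    edit_namespaces: list = field(default_factory=list)


def should_welcome(user: UserActivity) -> bool:
    """Return True only if every check from the request passes."""
    if user.edits_last_30_days < 1:
        return False  # no recent edit at all
    if user.talk_page_exists:
        return False  # never overwrite an existing talk page
    # Welcome only if at least one edit was made outside userspace.
    return any(ns != USER_NAMESPACE for ns in user.edit_namespaces)
```

Keeping the decision separate from the page-writing code would make it possible to review the rule without running the bot.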

Comment The GitHub repository that is pointed to above seems to contain pywikibot code for welcoming users, but there does not seem to be any particular setup for Wikidata. The user has made a number of edits mentioning "pywikibot" in the summary from their main account. The edit summary would have to contain something like "Welcome to Wikidata" rather than the current "Pywikibot". I'm also not sure whether there is value in bot-based welcoming. Part of the point of welcoming is that there is an actual human behind the welcome, someone who you can talk to if you need help or assistance. I'd suggest that the community be quite cautious in approving this bot on both policy and technical grounds. —Tom Morris (talk) 13:09, 16 March 2017 (UTC)
Comment. He himself promised to not use automation a few days ago after making massive mistakes. MechQuester (talk) 15:12, 16 March 2017 (UTC)
@MechQuester: This is welcoming, not editing anything important. Plus, I am going to try and give it another use on my userspace. PokestarFan (talk) (My Contribs) 21:00, 16 March 2017 (UTC)

That is still automation. MechQuester (talk) 03:55, 17 March 2017 (UTC)

MechQuesterBot

Operator: MechQuester

Task/s: Plan to use descriptioner to add "Beetle" or "Beetle in Cerambycidae family" to en:Category:Lamiinae

Code:

None. All manual. It would use PetScan to get the items in the category, then use QuickStatements to add P31:Q22671 + "species of insect" if it does not have it. MechQuester (talk) 01:44, 15 March 2017 (UTC)

Please make a couple of test edits--Ymblanter (talk) 21:43, 20 March 2017 (UTC)

ZacheBot

Operator: Zache

Task/s: Import data from pre-created CSV lists.

Code: based on Pywikibot (Q15169668), sample import scripts [1]

Function details:

--Zache (talk) 23:29, 4 March 2017 (UTC)

@Zache:, could you pls make a couple of test edits, I do not see any lakes in the contribution of the bot.--Ymblanter (talk) 21:20, 14 March 2017 (UTC)

НСБот

Operator: Nikola Smolenski

Task/s: Import Zapis database of Wikimedia Serbia.

Code: Code not yet written.

Function details: I need a bot to import data from the database of Zapis trees as a part of Zapis - Sacred Tree project of Wikimedia Serbia. This will essentially be data from the table at sr:Списак записа у Србији#Табела регистрованих записа though there are additional data (such as tree height and similar).

The first bot task should be to fix data about municipalities of Serbia, examples of manual edits: [7] and [8]. Then it should create items about cadastral municipalities of Serbia, then about the trees.

I have previously operated commons:User:NSBot and sr:Корисник:НСБот without any problems. --Nikola (talk) 12:08, 24 February 2017 (UTC)

@Nikola Smolenski:, please register the bot account and make some test edits.--Ymblanter (talk) 08:48, 3 March 2017 (UTC)

MechQuesterBot

Operator: MechQuester

Task/s: Add description to a bunch of villages in China.

Code:

SPARQL Code: select ?item {?item wdt:P31 wd:Q13100073}

Function details: I am refiling this bot request. It is to add descriptions to the villages in China. --MechQuester (talk) 22:31, 26 February 2017 (UTC)
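
As a rough sketch of how the SPARQL query above could be turned into a list of item ids, the following builds a request URL for the public Wikidata Query Service endpoint and reads the standard SPARQL JSON result layout. The helper names `build_wdqs_url` and `item_ids` are hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The query from the request, unchanged apart from spacing.
VILLAGE_QUERY = "SELECT ?item { ?item wdt:P31 wd:Q13100073 }"


def build_wdqs_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def item_ids(response: dict) -> list:
    """Extract Q-ids from a SPARQL JSON results document."""
    ids = []
    for row in response["results"]["bindings"]:
        uri = row["item"]["value"]  # e.g. http://www.wikidata.org/entity/Q...
        ids.append(uri.rsplit("/", 1)[-1])
    return ids
```

The resulting list of ids is what a tool such as QuickStatements would need as its first column.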

What descriptions? "village in China"? Or in which languages? Steak (talk) 16:17, 28 February 2017 (UTC)
just in Chinese + English. MechQuester (talk) 17:55, 28 February 2017 (UTC)
I am also adding "subdistrict of China" to a few entries. I ran 10 test edits if anyone wants to read. MechQuesterBot (talk) 18:05, 28 February 2017 (UTC)
  • BTW I started looking into the timezone stuff. There are queries at Wikidata:Database_reports/time_zones.
    --- Jura 19:02, 28 February 2017 (UTC)
    • I reviewed some of it and I don't think it shows big problems even if the DST question should be standardized.
      --- Jura 11:31, 5 March 2017 (UTC)

Anyone going to comment any further? 03:13, 2 March 2017 (UTC)

  • There are 590,000 items for villages in China but only 250,000 different labels for villages in China. [9] That means there are many villages sharing the same name. Since the software enforces to have label+description to be unique, you will run into many conflicts. --Pasleim (talk) 12:42, 2 March 2017 (UTC)
    • I think these item could need some edits to make them useful to people who don't speak Chinese. Let him have a go. Eventually maybe a better description will emerge.
      --- Jura 11:31, 5 March 2017 (UTC)
I'm not totally willing to have a go at it now that there are numerous conflicts. MechQuester (talk) 16:37, 5 March 2017 (UTC)
Well, maybe it could be limited by doing something like "village in <adm3>, <adm2>, <adm1>, PRC". As I'm not sure which administrative layers are relevant and stable for China, I can't really suggest more. For the US, this could read "village in Trump County, Wyoming, USA" or "neighborhood of the city of <..>, Trump County, Wyoming, USA". For conflicts that remain, one would need to find a manual improvement. Maybe there is a default way to disambiguate such names.
--- Jura 16:52, 5 March 2017 (UTC)
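
The suggestion above, together with the label+description uniqueness constraint mentioned by Pasleim, could be checked before any edit is made. The following is a minimal sketch with hypothetical helper names; the input mapping (item id to label and administrative chain) is assumed to have been gathered elsewhere.

```python
from collections import Counter


def build_description(admin_chain):
    """Build 'village in <adm3>, <adm2>, <adm1>, PRC' from a list of
    administrative unit names ordered from most to least specific."""
    return "village in " + ", ".join(list(admin_chain) + ["PRC"])


def find_conflicts(villages):
    """Return the (label, description) pairs shared by more than one item.

    `villages` maps an item id to a (label, admin_chain) tuple.
    """
    pairs = Counter(
        (label, build_description(chain)) for label, chain in villages.values()
    )
    return sorted(pair for pair, count in pairs.items() if count > 1)
```

Any pair returned by `find_conflicts` would still need the manual disambiguation mentioned in the comment.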
Perhaps. It would require some form of looking at the parent "located in administrative category", but I don't have the coding skill. Abandoned; thus a new one is better.

How about a new proposal?

select ?item {?item wdt:P31 wd:Q61878}

Using this to add P421:Q6985. MechQuester (talk) 03:27, 10 March 2017 (UTC)

What about just P421 and Q6985 for every article? MechQuester (talk) 17:18, 10 March 2017 (UTC)

@MechQuester, Jura1, Pasleim:, are we ready for approval here?--Ymblanter (talk) 21:16, 14 March 2017 (UTC)

Different Request

Plan to use descriptioner to add "Beetle" or "Beetle in Cerambycidae family" to en:Category:Lamiinae MechQuester (talk) 03:31, 14 March 2017 (UTC)

Please file it as a separate request so that it could be properly discussed.--Ymblanter (talk) 21:16, 14 March 2017 (UTC)

YULbot

Operator: YULdigitalpreservation

Task/s:

  • YULbot has the task of creating new items for pieces of software that do not yet have items in Wikidata.
  • YULbot will also make statements about those newly-created software items.

Code: I haven't written this bot yet.

Function details:

This bot will set the English language label for these items and create statements using publisher (P123), ISBN-13 (P212), ISBN-10 (P957), place of publication (P291), publication date (P577). --YULdigitalpreservation (talk) 18:04, 21 February 2017 (UTC)

good to run a test with a few examples so we can see what you're planning! ArthurPSmith (talk) 20:46, 22 February 2017 (UTC)
Interesting. Where does the data come from? Emijrp (talk) 12:04, 25 February 2017 (UTC)
The data is coming from the pieces of software themselves. These are pieces of software that are in the Yale Library collection. We could also supplement with data from oldversions.com. YULdigitalpreservation (talk) 13:07, 28 February 2017 (UTC)
Please let us know when the bot is ready for approval.--Ymblanter (talk) 21:12, 14 March 2017 (UTC)

JayWackerBot

Operator: JayWacker

Task/s: This will be used to set and remove Quora topic identifiers. It will also update the matches as Quora topics are renamed or merged. We are manually vetting the 250,000 MixNMatch matches of Quora topic to Wikidata entity. This bot will not update other properties of the Wikidata entity.

Code:

Function details: I'm unsure how much detail is necessary

set_quora_identifier(wikidata_id, quora_relative_url)

remove_quora_identifier(wikidata_id, quora_relative_url)

--JayWacker (talk) 17:25, 9 February 2017 (UTC)
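
To illustrate the intended semantics of the two calls, here is a minimal in-memory sketch. It is not the operator's code: the extra `store` argument, the return values and the `rename_quora_topic` helper are assumptions made for the example, and a real bot would write to Wikidata instead of a dict.

```python
def set_quora_identifier(store, wikidata_id, quora_relative_url):
    """Record the topic URL for an item; return True if anything changed."""
    urls = store.setdefault(wikidata_id, set())
    if quora_relative_url in urls:
        return False  # already set, nothing to do
    urls.add(quora_relative_url)
    return True


def remove_quora_identifier(store, wikidata_id, quora_relative_url):
    """Drop the topic URL from an item; return True if anything changed."""
    urls = store.get(wikidata_id, set())
    if quora_relative_url not in urls:
        return False  # nothing to remove
    urls.remove(quora_relative_url)
    return True


def rename_quora_topic(store, wikidata_id, old_url, new_url):
    """Hypothetical helper: treat a renamed or merged topic as remove-then-set."""
    removed = remove_quora_identifier(store, wikidata_id, old_url)
    added = set_quora_identifier(store, wikidata_id, new_url)
    return removed or added
```

Making both operations idempotent means a rerun after a partial failure would not create duplicate or spurious edits.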

Could you please explain in more detail on which basis you will remove or update Quora topic ids? How will setting Quora topic ids be different from the current approach with Mix'n'Match? --Pasleim (talk) 13:32, 14 February 2017 (UTC)
@Pasleim: Mix'n'Match may be used more rapidly with a bot-flagged account. @JayWacker: You may have missed this question. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:02, 19 February 2017 (UTC)
@Pasleim: First, we are manually vetting the 255,000 matches that MixNMatch identified. As generous as @Pigsonthewing: has been, we are regularly taking up his time by going through the hundreds of thousands of matches outside of MixNMatch and then giving them to him to set. Additionally, Quora topic names change regularly and topics are merged together, which results in the URLs changing. This means that the Wikidata-Quora identifiers will be out of date (though still redirected to the correct place). We may also be creating Quora topics from Wikidata entities, which means we can set these identifiers directly. We can also resolve the constraint violations more efficiently. JayWacker (talk) 04:04, 21 February 2017 (UTC)
  • Support. While I'm happy to assist Quora as long as needed, it's right and proper - and welcome - that they should be able to contribute directly. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:58, 21 February 2017 (UTC)
I will approve the bot in a couple of days provided there have been no objections raised.--Ymblanter (talk) 20:19, 26 February 2017 (UTC)
Oops, sorry, should have noticed earlier. Please make a couple of test edits.--Ymblanter (talk) 21:53, 28 February 2017 (UTC)
We'll do a couple of test edits and I'll get back to you (this may be a few weeks to get to the top of the stack). JayWacker (talk) 18:18, 1 March 2017 (UTC)

YBot

Operator: Superyetkin

Task/s: import data from Turkish Wikipedia

Code: The bot, currently active on trwiki, uses the Wikibot framework.

Function details: The code imports data (properties and identifiers) from trwiki, aiming to ease the path to Wikidata Phase 3 (to have items that store the data served on infoboxes) --Superyetkin (talk) 16:42, 12 January 2017 (UTC)

It would be good if you could check for constraint violations instead of just blindly copying data from trwiki. These violations are probably all caused by the bot. --Pasleim (talk) 19:26, 15 January 2017 (UTC)

EaasServiceBot

Operator: Sharmeelaashwin

Task/s: Bot which talks to EaaS (Emulation-as-Service) to store and retrieve the rendering software and OS for a file format. This helps in opening the files used in Digital Preservation.

Code:

Function details: It contains the following APIs:

  1. An API to store the file formats in Wikidata. This API will be called when the user decides to save this file format information in EaaS.
  2. An API to read the rendering software's information from Wikidata in order to open the file formats in EaaS.

--Sharmeelaashwin (talk) 15:08, 10 January 2017 (UTC)

Which statements do you plan to add? As far as I know there isn't yet a "rendering software" property. --Pasleim (talk) 19:40, 15 January 2017 (UTC)
I would like to add a new page which stores all this information (file format, rendering software and environment). When a user opens a file format with a particular piece of software, we will store this information in Wikidata, and when another user tries to open the same file, we will fetch the data from Wikidata and open the file with the software name retrieved from Wikidata. I will also store the environment (OS and dependent software) information in Wikidata. --Sharmeelaashwin (talk) 11:08, 16 January 2017 (UTC)
  • There is readable file format (P1072), but I don't quite see how you'd store here which one gets used if several render the same format.
    --- Jura 06:22, 17 January 2017 (UTC)
  • How about creating example items manually? ChristianKl (talk) 07:20, 17 January 2017 (UTC)
How much data do you plan to add? ChristianKl (talk) 07:20, 17 January 2017 (UTC)
  • @Jura: "Readable file format" stores the list of file formats that can be opened in a piece of software. I would like to do just the opposite, i.e. if I have a file format, I would like to have a list of software that can open this file format, and also the OS. This has the following advantages:
    1. If a user tries to open a file in the EaaS (Emulation as Service) application, then from the file format, EaaS can query Wikidata and get a list of software that can open the file requested by the user.
    2. If any Wikidata user knows that a particular file format can be rendered by a piece of software, then he/she can directly update it in Wikidata, which is much easier than updating it in PRONOM.
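
The "just the opposite" lookup described above does not necessarily need separately stored data: a mapping from software to readable file formats can be inverted. A minimal sketch follows; the function name and the input layout are assumptions for the example, and it does not address Jura's question of which program to prefer when several read the same format.

```python
from collections import defaultdict


def invert_readable_formats(software_to_formats):
    """Turn {software: [formats]} into {format: [software, ...]}."""
    format_to_software = defaultdict(list)
    for software, formats in software_to_formats.items():
        for file_format in formats:
            format_to_software[file_format].append(software)
    # Sort the names so that the output is stable between runs.
    return {fmt: sorted(names) for fmt, names in format_to_software.items()}
```
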
@ChristianKl: I will manually add example items and let you know. In the initial phase I intend to add major file formats like .doc, .jpeg, .ppt, .tx, but the final goal is to store all file formats in Wikidata. I plan to create a table in a Wikidata page and keep updating it. -- Sharmeelaashwin (talk) 09:16, 18 January 2017 (UTC)
Please make some test edits.--Ymblanter (talk) 22:23, 27 January 2017 (UTC)
I am not really happy with this performance--Ymblanter (talk) 00:00, 5 March 2017 (UTC)

DiscogsBot

Operator: Ocram89 and AndreaNocera

Task/s: Update Wikidata entries using Discogs (Q504063) artists dump (just Complete and Correct data).

Code: The code will be, hopefully, uploaded on github in a couple of days.

Function details: The bot uses a filtered XML data dump about artists from Discogs (Q504063); only records whose <data_quality> element is "Complete and Correct" are used. Once the data is parsed, the bot checks whether an entity already exists. This step is done through a SPARQL query which gets all the musician (Q639669) or musical ensemble (Q2088357) items with the name (or alias, or name variations) taken from the XML dump. If the entity already exists, new statements can be inserted (e.g. if a band does not have its members, these can be inserted using member of (P463) in the entity); if the entity does not exist, a new item is created. If there is more than one entity with the same name, nothing is changed, to avoid involuntarily adding wrong statements. --DiscogsBot (talk) 11:32, 12 December 2016 (UTC)
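
The filtering and matching rules described above could be sketched as follows. This is an illustration only, since the bot's code was not yet published: the sample XML is made up, its element layout apart from <data_quality> is an assumption, and `vetted_artists` and `decide_action` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Made-up sample; only the <data_quality> element is named in the request.
SAMPLE_DUMP = """
<artists>
  <artist><id>1</id><name>Example Band</name><data_quality>Complete and Correct</data_quality></artist>
  <artist><id>2</id><name>Unvetted Act</name><data_quality>Needs Vote</data_quality></artist>
</artists>
"""


def vetted_artists(xml_text):
    """Return the names of artists marked 'Complete and Correct'."""
    root = ET.fromstring(xml_text.strip())
    return [
        artist.findtext("name")
        for artist in root.iter("artist")
        if artist.findtext("data_quality") == "Complete and Correct"
    ]


def decide_action(matching_items):
    """Map the number of existing items with this name to an action."""
    if len(matching_items) == 0:
        return "create"  # no item yet: create a new one
    if len(matching_items) == 1:
        return "update"  # unambiguous match: add missing statements
    return "skip"        # several candidates: change nothing
```

The "skip" branch is the safeguard against wrong statements; how reliably the name query finds the existing item is a separate question that the sketch does not cover.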

Could you do a few test edits? Which statements, labels and descriptions will you add to a new created item? --Pasleim (talk) 13:09, 12 December 2016 (UTC)
We are doing tests on test.wikidata. The new item will have label, description and aliases and it will have statements Discogs artist ID (P1953) and if it's a band all the members or if it's a member of a group the name of the group. We are also trying to analyze the profile to get some other data like instruments, occupation etc. AndreaNocera (talk) 13:26, 14 December 2016 (UTC)
The edits done on test.wikidata.org look good. But I would still prefer if you could do around 100 edits here on Wikidata to see if you can reliably detect already existing artist items. --Pasleim (talk) 19:51, 15 January 2017 (UTC)

DoctorBot

Operator: DoctorBud

Task/s: Import ZFIN gene information and create (or augment) a corresponding Item in Wikidata

Code: Experimental and not yet public

Function details:

  • Import TSV data from http://zfin.org/downloads/gene.txt
  • Extract two columns of that data, one which will identify an Item (a Gene), the other a property of that Gene
  • Create an Item in Wikidata for the Gene
  • Create a Statement in Wikidata that binds the property to the Gene

--DoctorBot (talk) 03:00, 27 November 2016 (UTC)
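
The first two steps listed above (read the TSV, keep two columns) might look roughly like this. The sample rows are placeholders and the column positions are assumptions, since the request does not say which columns of gene.txt are meant; `extract_pairs` is a hypothetical name.

```python
import csv
import io

# Placeholder rows; the real file is the TSV download named in the request.
SAMPLE_TSV = "GENE-ID-1\tsymbol1\textra\nGENE-ID-2\tsymbol2\textra\n"


def extract_pairs(tsv_text, id_column=0, value_column=1):
    """Return (gene id, property value) pairs from tab-separated text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    pairs = []
    for row in reader:
        if len(row) <= max(id_column, value_column):
            continue  # skip blank or short rows
        pairs.append((row[id_column], row[value_column]))
    return pairs
```

Each resulting pair would then correspond to one item and one statement in the last two steps.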

@DoctorBot: The bot owner must use a different account from the bot itself.--Jasper Deng (talk) 03:00, 28 November 2016 (UTC)

DoctorBud is now declared as the Operator of DoctorBot in this Request.

Could you please make some test edits?--Ymblanter (talk) 16:03, 8 December 2016 (UTC)
@DoctorBud, DoctorBot: Are you still interested in this request?--Jasper Deng (talk) 08:44, 20 December 2016 (UTC)
@Jasper Deng: Yes, I'm still working on DoctorBot's code, but my request for DoctorBot being a Bot operated by DoctorBud is still important, if that's what you are asking. Thanks. --DoctorBud (talk) 00:51, 21 December 2016 (UTC)

WikiLovesESBot

Operator: Discasto

Task/s: Miscellaneous tasks associated to photo upload campaigns promoted by WM-ES:

  • Assignment of commons categories to items handled in the campaigns (for example Wiki Loves Earth, Wiki Loves Folk, Wiki Loves Monuments, Photographs from Spanish Municipalities without pictures, and the like).
  • Sourcing of statements for items handled in the campaigns...

Code: Global repository is in here. Bot code is here.

Function details: The bot takes as input a series of lists (so-called annexes in the Spanish Wikipedia, see example here) and extracts the necessary information: mainly the Wikidata item and the Commons category. If found, the bot does as follows:

  • Looks up the Wikidata item.
  • Determines whether "Municipality of Spain" statement is available in P31 claim. If not, it creates the statement. If available, the statement is sourced to Spanish Wikipedia.
  • If the source (the list in the Spanish Wikipedia) provides a category, the bot determines whether a claim for Commons-category is available. If not, it creates the claim. If available, the claim is sourced to Spanish Wikipedia.
  • Finally, a commons sitelink for the category provided in the source is inserted if not available. If a gallery was already provided as commons sitelink, it's not modified.
  • Inconsistencies are logged during the process.

--Discasto (talk) 10:25, 3 July 2016 (UTC)
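
The steps above can be read as a small decision procedure. The sketch below is not taken from the linked repository: `plan_edits` and the dict keys are hypothetical, and treating a differing existing category as an inconsistency to log is an assumption about what the last bullet means.

```python
def plan_edits(item, source_category):
    """Return the list of planned actions for one item.

    `item` is a hypothetical dict with keys:
      is_municipality   - bool, "Municipality of Spain" present in P31
      commons_category  - str or None, existing Commons-category claim
      commons_sitelink  - str or None, existing Commons sitelink
    """
    actions = []
    if item["is_municipality"]:
        actions.append("source P31 statement to Spanish Wikipedia")
    else:
        actions.append("create P31 statement")

    if source_category is None:
        return actions  # the list provided no category

    if item["commons_category"] is None:
        actions.append("create Commons-category claim")
    elif item["commons_category"] != source_category:
        # Assumption: a mismatch counts as an inconsistency to log.
        actions.append("log inconsistency")
    else:
        actions.append("source Commons-category claim to Spanish Wikipedia")

    # An existing sitelink (for example a gallery) is never modified.
    if item["commons_sitelink"] is None:
        actions.append("add Commons sitelink Category:" + source_category)
    return actions
```
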

Support: I strongly support this request. --Rodelar (talk) 22:04, 3 July 2016 (UTC)
Support: I also support. --Harpagornis (talk) 15:00, 4 July 2016 (UTC)
Support: I support this request. Ivanhercaz (talk) 16:17, 4 July 2016 (UTC)
Support: I support this request. --Bauglir (talk) 16:28, 4 July 2016 (UTC)
Support: I support this request. --ElBute (talk) 16:47, 4 July 2016 (UTC)
Support: I support this request.--Pedro J Pacheco (talk) 20:14, 4 July 2016 (UTC)
Support: The bot operator is reliable and knows what he does Poco2 21:07, 4 July 2016 (UTC)
Support: I support this request. The operator has done a good work with other bots in different projects. --Millars (talk) 15:47, 5 July 2016 (UTC)
Support: I support this request. --Dorieo (talk) 17:41, 6 July 2016 (UTC)
Sorry, Jura, I missed your comment. I have to say that I don't fully understand your comment (mainly that part related to the amount of municipalities mismatch). With regard the second part, I will patch the code to consider also subclasses. However, parroquia (Q3333265) does not apply, as a parroquia is a subdivision of a municipality. The lists we're handling have been reviewed several times by the WM-ES members and all the items are actually municipalities. Smaller subdivisions can be considered in next editions, but not now. Therefore, my only concern relates to the subclasses (I didn't actually consider that possibility). Best regards --Discasto (talk) 22:02, 22 July 2016 (UTC)
And I didn't notice your answer. There are several possible reasons for the mismatch in the number of municipalities: we could have already an item for the municipality, but it just isn't linked to eswiki. The easiest way to solve this would be to add the statements and then check the result for duplicates (it could also be done in advance, but this may be more complicated).
As far as "concejo of Asturias" is concerned, you could add both or replace it. Whatever suits interested editors best.
The "parroquia" question seems minor (11 items currently): If you look at the query result you will notice that some items have this in P31 in addition. This can mean that the article in some other wiki is about the parroquia or there is some other mixup. These items may need to be split.
--- Jura 08:42, 17 August 2016 (UTC)


  • It'll be great if some active editors of Wikidata could give their opinions. Canvassing of users with a low amount of contributions doesn't help. Sjoerd de Bruin (talk) 14:55, 7 July 2016 (UTC)
I took a look at contribs - it looks like a lot of entries have already been made, but the bot was blocked as unapproved. From my review of the entries made the bot seems to be operating reasonably. However, adding a reference of "imported from xx wikipedia" is barely better than no source at all, I'm not sure this is really helpful. If there's an actual es.wikipedia.org page that is the source of the information, providing that via "reference URL" and "retrieved on" properties would be more useful. An external source for this data would be much better. ArthurPSmith (talk) 14:42, 8 July 2016 (UTC)
I have no strong opinion on this. I do agree on providing an external source if available. It's not the case in most of the situations we're handling. Therefore, I'll simply skip this step. In fact, the core functionality (which I'm currently doing by hand) was related to setting commons categories. As we're handling all the items in the list, it seemed sensible to add sources. If you feel it's useless (unless a proper source is provided), I'll skip this step. Thanks for providing feedback --Discasto (talk) 22:45, 12 July 2016 (UTC) PS: yes, it's been blocked in the middle of a task that nowadays I have to do by hand. I don't really understand this block. Seems to me the typical bureaucratic behaviour that harms more than helps
I am going to approve the bot tomorrow provided there have been no objections.--Ymblanter (talk) 09:46, 13 July 2016 (UTC)
It would be good to have an answer to my question. We don't want to end up with even more duplicates.
--- Jura 12:34, 15 July 2016 (UTC)
@Ymblanter, Discasto: Please see my comment above.
--- Jura 08:43, 17 August 2016 (UTC)
@Jura1: I saw it weeks ago (and I answered :-), see answer on 22 July... I assumed you had this page in your watch list) --Discasto (talk) 08:52, 17 August 2016 (UTC)
@Discasto: Well, generally I notice, but here I missed it. Bot requests aren't exactly my preferred stuff ; ). Did you notice my comment from today?
--- Jura 08:54, 17 August 2016 (UTC)

Comment: I withdraw this request. However, may I ask for the account to be unblocked? It will not be active, but keeping it blocked is honestly overkill. Best regards --Discasto (talk) 21:50, 23 August 2016 (UTC)

@Jura1:, @Discasto:: The task seems useful, is there any chance you can agree and proceed with the task?--Ymblanter (talk) 07:59, 24 August 2016 (UTC)
I think it's essentially a question of checking the result. This could be done after addition. A way to flag former municipalities needs to be determined (by end date and/or with some Q19730508 item). In the meantime, Abián is working with Spanish municipalities (Wikidata:Bot_requests#Mayors_of_Spain).
--- Jura 08:42, 26 August 2016 (UTC)

MatSuBot 6

MatSuBot
Operator: Matěj Suchánek

Task: Convert HTML entities in terms and maybe statements to regular text.

Code: Not yet decided on the implementation.

Function details: The biggest problem at the moment is querying for items which have such errors (if I don't find any other possibility, I will try to combine SQL and PWB). --Matěj Suchánek (talk) 19:12, 1 July 2016 (UTC)
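
Since the implementation was not yet decided, the following is only one possible sketch of the conversion step, using Python's standard `html.unescape`. The function names are hypothetical, and the pattern deliberately requires a trailing semicolon so that plain text such as "AT&T" is left alone.

```python
import html
import re

# Named, decimal and hexadecimal character references ending in ";".
ENTITY_PATTERN = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def fix_term(text):
    """Replace each HTML entity in a label or description with its character."""
    return ENTITY_PATTERN.sub(lambda match: html.unescape(match.group(0)), text)


def has_html_entity(text):
    """Return True if fixing the term would change it."""
    return fix_term(text) != text
```

`has_html_entity` could serve as a filter over candidate terms once a way of listing them (for example the SQL approach mentioned above) is available.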

Please make some test edits.--Ymblanter (talk) 14:47, 5 July 2016 (UTC)

1-Byte-Bot

Operator: 1-Byte

Task/s: Import census data from the Turkish Statistical Institute.

Code: Based on pywikibot

Function details:

--1-Byte (talk) 15:22, 2 March 2016 (UTC)

Update: Currently on hold as it's not entirely clear how to cite the data. --1-Byte (talk) 08:58, 3 March 2016 (UTC)

Phenobot

Operator: Jjkoehorst

Task/s: The first step will be to improve the lineage annotation of organisms including taxon identifiers, correct species names and corresponding references using the UniProt Taxonomy database. The next step will be to include missing organisms into Wikidata and phenotypic information such as biosafety level, oxygen requirements and other features. Continuous discussion can be found here User:Phenobot/Discussion

Code: https://bitbucket.org/jjkoehorst/wikidatabots

Function details: This bot is based on the ProteinBoxBot framework. It will use the UniProt Taxonomy SPARQL endpoint for data extraction and will initially work on completing existing entries as much as possible with correct names and taxon identifiers; missing species will be added to WD. For strains with existing phenotypic information, this can be complemented from various sources which are currently under investigation, such as GOLD or DSMZ. --jjkoehorst (talk) 15:13, 4 February 2016 (UTC)

Notified participants of WikiProject Taxonomy: Abbe98, Achim Raschka, Brya, Dan Koehl, Daniel Mietchen, Delusion23, Faendalimas, FelixReimann, Infovarius, Joel Sachs, Josve05a, Klortho, Lymantria, Michael Goodyear, MPF, PhiLiP, Andy Mabbett, Prot D, pvmoutside, Rod Page, Soulkeeper, Tinm, Tommy Kronkvist, TomT0m

@Succu: Can you have a look at this request? --Pasleim (talk) 10:32, 5 February 2016 (UTC)
I have some problems with the task "correct species names". NCBI is not a nomenclatural database. It contains spelling errors, like other databases too. And I have problems with this kind of sourcing. The NCBI-ID is already referenced, nothing is imported from UniProt. The Disclaimer tells us „The NCBI taxonomy database is not an authoritative source for nomenclature or classification - please consult the relevant scientific literature for the most reliable information.“ --Succu (talk) 11:40, 5 February 2016 (UTC)
Here the Bot removed taxon name (P225). --Succu (talk) 12:10, 5 February 2016 (UTC) PS: Pseudomonas putida 10-23 (Q22661287) P225 is missing. --Succu (talk) 07:24, 6 February 2016 (UTC)
I agree with Succu. Why go change species names, based on UniProt? Could do serious damage. And indeed that kind of sourcing is unwanted and adds nothing: database is slow enough as it is. - Brya (talk) 11:58, 5 February 2016 (UTC)
This proposal does not seem to be mature. The Uniprot taxonomy database is a customized version of the NCBI taxonomy database, which itself is not reliable for taxonomy anyway. It is currently not clear if the bot owner knows enough about taxonomy and nomenclature to understand the issues associated with Wikidata taxon items. Also the proposed use of imported from (P143) does not seem appropriate.
Nevertheless my understanding is that many of this bot's contributions would be made in microbiology, and the issues would be a little different if its contributions were limited to this area. Otherwise I see no reasons to prevent the bot from adding “biosafety levels, oxygen requirements and other [such] features”.
Tinm (d) 18:29, 5 February 2016 (UTC)
Yes, the main basis of this bot will be within microbiology and I can restrict the bot to remain within prokaryotes. About the naming, what I am currently doing is to leave the name alone if it exists in UniProt taxonomy as either other name or scientific name. But I can leave the name as it is, as I am mostly relying on the taxonomic identifier from NCBI/UniProt. My main priority is to have the NCBI Taxonomy identifier correct / filled in so that I can include the phenotypic characteristics and can also easily verify whether an organism page has been created and, if not, create one. I can also skip adding references if one is already available. --jjkoehorst (talk) 06:45, 6 February 2016 (UTC)
Yes, this taxon name is pretty bad. And again, the fact that the rank is that of species does not need a reference (this is so by definition), and as there is a link to NCBI, the fact that the taxon name is accepted by NCBI does not need to be repeated in the form of a reference to taxon name. - Brya (talk) 07:44, 6 February 2016 (UTC) -also beyond understanding - Brya (talk) 07:51, 6 February 2016 (UTC) - And "instance of taxon" means that "taxon name" is present in the item. UniProt cannot know anything about that, so adding a reference to "instance of taxon" is pure misrepresentation. - Brya (talk) 07:58, 6 February 2016 (UTC)
Sorry about the naming. I'll restrict the bot to prokaryotes only, if you prefer, and to only updating missing names and NCBI Taxonomy information. When that works out well I'll make some property requests for the phenotypic information as stated earlier, ok? --jjkoehorst (talk) 09:31, 6 February 2016 (UTC)
If that means 1) only missing names of prokaryotes and 2) sourcing only for NCBI Taxonomy information, then yes, OK. - Brya (talk) 13:00, 6 February 2016 (UTC)
Looks like the databases are out of sync. NCBI Taxonomy ID (P685)=208964 gives Pseudomonas aeruginosa PAO1 (www.ncbi.nlm.nih.gov/taxonomy) and Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG 12228) (www.uniprot.org/taxonomy). This explains „adjustments“ like this one. --Succu (talk) 11:05, 6 February 2016 (UTC)
Looks like UniProt provides five separate names, rolled up into one entry? - Brya (talk) 13:00, 6 February 2016 (UTC)
There is a mapping between NCBI Taxonomy ID (P685) and a so called „Official (scientific) name“ used by UniProt. So maybe we need a qualifier for P685 to indicate this name. --Succu (talk) 16:44, 6 February 2016 (UTC)
Yes I had an email conversation with uniprot and this was a reply about that case: The idea is not to use a concise name. A same strain may be known by different names because it has been deposited in different organizations (institutions, private companies, etc) with different names. So we try to track these co-identical strain names used by the major concerned organizations for a specific strain. This name is stored as scientifcName and all the variances are stored among other names. --jjkoehorst (talk) 19:30, 6 February 2016 (UTC)
So what's your conclusion? BTW: I stumbled over User:Phenobot/Discussion, which looks like an outline of the intended bot task, but not mentioned here. --Succu (talk) 20:28, 6 February 2016 (UTC)
Well, in one way it makes sense to use a general nomenclature which encapsulates all possible extra namings, but it is not the true scientific name. Maybe a taxon synonym name entry could be used which lists other names belonging to this organism. Yes, the discussion page is to discuss the roadmap after the general taxon identification and naming is completed. Sorry that I did not mention it here, but it was not completed yet in my opinion; feel free to comment on it if you like... --jjkoehorst (talk) 08:02, 7 February 2016 (UTC)
Strictly speaking these are not scientific names at all. The ICNP does not cover names at a rank lower than subspecies. AFAIK there is no formal system for naming strains, so this may well happen on an ad hoc basis, or according to a local standard. In fact, it would help somewhat not to put these in "taxon name". - Brya (talk) 08:29, 7 February 2016 (UTC)
Then I would suggest that the names currently in WD should correspond to the NCBI nomenclature or to any of the Uniprot (scientificnames/othernames) if this is not the case then it should be either the scientific name from the NCBI or from UniProt if there is no reference available. What do you think? And where would you place the other names? As a common name or something else? --jjkoehorst (talk) 08:40, 7 February 2016 (UTC)
? The names in NCBI/Uniprot are not scientific names (not regulated by a Code of nomenclature). The most obvious way to handle strains would be to have a property "strain name" (perhaps to be combined with "parent taxon", etc). - Brya (talk) 09:33, 7 February 2016 (UTC)
My consideration are the same. --Succu (talk) 10:10, 7 February 2016 (UTC)
I agree a strain property should then be created which specifies the name of a strain? However taxon name then becomes obsolete for strains at least if I am correct. The elements that are obligatory for strains are then parent taxon, taxon rank, NCBI Taxonomy ID, general labels and instance of. Anything else that can be used with the current properties? --jjkoehorst (talk) 11:49, 7 February 2016 (UTC)
Yes, this new property should be used instead of P225. This would reduce "Format" violations of P225 too. --Succu (talk) 12:54, 7 February 2016 (UTC)
Sounds good, who is going to propose for a new property for taxon name and can this taxon name then also contain multiple values, such as synonyms of the strain name or should another property be made for that? --jjkoehorst (talk) 14:47, 7 February 2016 (UTC)
I think we need a second property UniProt name to model the relationship to the NCBI id. In case of strains we could use aliases to add the name variants. You can propose them at Wikidata:Property proposal/Natural science. --Succu (talk) 18:49, 7 February 2016 (UTC)
A property "UniProt" to link to the UniProt-entries may be handy. Not sure what else you mean, as UniProt-entries may concern regular taxa as well as strains and whatever else UniProt includes. - Brya (talk) 06:40, 8 February 2016 (UTC)
I am not much in favour of multiple names in one item, and including out-of-use names beside the current name seems like a recipe for disaster. But we really do need a separate property "taxon synonym (string)" beside the present "taxon synonym [item]". - Brya (talk) 15:53, 7 February 2016 (UTC)
Yes we should request for a taxon synonym string variant. Then by default it would be the scientific name of the NCBI nomenclature if no better name is available? --jjkoehorst (talk) 19:50, 7 February 2016 (UTC)
Synonyms are an area full of hidden dangers. What we may really need are:
  • "taxon synonym, homotypic (item)"
  • "taxon synonym, heterotypic (item)"
  • "taxon synonym, homotypic (string)"
  • "taxon synonym, heterotypic (string)"
Especially heterotypic synonyms may vary strongly, depending on point of view (references!). Brya (talk) 06:40, 8 February 2016 (UTC)
I looked into: Property:P1843 which is a common name for a given taxon. As basis we could use the NCBI nomenclature for strains (and/or others?). And over time add the homotypic/heterotypic naming. Shall I run a test with the restricted settings I have now? Only bacteria, no name updating if there is a name available and no reference adding if the value is already present? --jjkoehorst (talk) 08:01, 8 February 2016 (UTC)
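The restricted settings proposed just above (bacteria only, no name update when a name is present, no reference added when the value is already present) can be written down as a small planning function. This is only a sketch under stated assumptions: the `Item` class, the `plan_edits` helper and the placeholder item ID are hypothetical and are not taken from the bot's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical, simplified stand-in for a Wikidata item."""
    qid: str
    is_bacterium: bool
    # property ID -> current value, e.g. {"P225": "..."}
    statements: dict = field(default_factory=dict)


def plan_edits(item, proposed):
    """Return the (property, value) pairs the bot would add to `item`.

    Rules from the discussion:
      1. only bacteria are touched;
      2. an existing value is never updated;
      3. no reference is added to a value that is already present,
         so existing properties are skipped entirely.
    """
    if not item.is_bacterium:
        return []
    planned = []
    for prop, value in proposed.items():
        if prop in item.statements:
            continue  # value already present: leave value and references alone
        planned.append((prop, value))
    return planned


# Example: the name (P225) exists, the NCBI Taxonomy ID (P685) is missing.
pao1 = Item("Q21065234", True, {"P225": "Pseudomonas aeruginosa PAO1"})
print(plan_edits(pao1, {"P225": "some other name", "P685": "208964"}))
```

Under these rules only the missing P685 statement would be planned for the example item; the existing name is left untouched.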
@Brya: Regarding how to handle synonyms, I have thought of a way of doing things that would solve a very big part of the issues we encounter with the current one. I'm going to make a post about that on the project talk page when I'll have a bit of time. It would imply significant changes but I really believe it would answer many issues efficiently. Anyway, I guess you will see when I put it up. —Tinm (d) 02:34, 9 February 2016 (UTC)
I will be most interested to see what you come up with. - Brya (talk) 06:13, 9 February 2016 (UTC)

Greetings all. I am part of the GeneWiki team and I am adding genes and proteins for bacteria under our MicrobeBot account (see the MicrobeBot Task Page). For my project it is important that there remain distinct strain items with NCBI taxonomy identifiers so I can link genes and proteins to them via found in taxon (P703). Just a thought, but we could distill some of the views here in a mockup of a Wikidata strain item in this table below? Using Pseudomonas aeruginosa PAO1 (Q21065234) as an example. I added some of the basics that are there for strain items now. I personally think a new 'NCBI strain name' type of property would be a good thing to have as these strain names are directly linked to the NCBI Taxonomy ID. Putmantime (talk) 18:46, 9 February 2016 (UTC)

Property | Description | Datatype | Expected value (if not listed, see property definition)
P225 | taxon name | String | Species name? From NCBI, UniProt?
P??? | strain name | String | Strain name from NCBI, UniProt, etc...
P171 | parent taxon | Item | Bacterial species item, e.g. Pseudomonas aeruginosa (Q31856)
P105 | taxon rank | Item | Strain, e.g. strain (Q855769)
P685 | NCBI Taxonomy ID | String | 208964

What we are talking about is this:

Property | Description | Datatype | Expected value (if not listed, see property definition)
P??? | strain name | String | Strain name from NCBI, UniProt, etc..., e.g. Pseudomonas aeruginosa PAO1 (Q21065234)
P171 | parent taxon | Item | Bacterial species item, e.g. Pseudomonas aeruginosa (Q31856)
P105 | taxon rank | Item | Strain, e.g. strain (Q855769)
P685 | NCBI Taxonomy ID | String | 208964
P??? | UniProt ID | String | from UniProt, different from UniProt ID (P352)

- Brya (talk) 04:42, 10 February 2016 (UTC)

I agree. P225, P1420 and P1843 should not be taken from NCBI, UniProt? No items should be created on this basis. --Succu (talk) 06:51, 10 February 2016 (UTC) PS: I added UniProt ID (P352) and now miss something like UniProt name. --Succu (talk) 08:02, 10 February 2016 (UTC)
Not sure what you mean by "UniProt name". Is this something like "Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG 12228)", which to me does not look like a name but five names, for what may be (deemed to be) one strain. - Brya (talk) 11:39, 10 February 2016 (UTC)
Yes, the so called „Official (scientific) name“ used by UniProt mapped to NCBI Taxonomy ID (P685). --Succu (talk) 12:01, 10 February 2016 (UTC)
It is long list, and many names are regular scientific names. Could you point out a few examples? - Brya (talk) 12:07, 10 February 2016 (UTC)
  • 634452 ← Acetobacter pasteurianus (strain NBRC 3283 / LMG 1513 / CCTM 1153)
  • 4024 ← Acer saccharum
  • 441768 ← Acholeplasma laidlawii (strain PG-8A)
  • 237531 ← Actinomycete sp. (strain K97-0003)
  • 928294 ← Human adenovirus C serotype 1 (strain Adenoid 71)
  • 262698 ← Brucella abortus biovar 1 (strain 9-941)
  • 48984 ← Pantoea agglomerans pv. gypsophilae
  • 45222 ← Parana mammarenavirus (isolate Rat/Paraguay/12056/1965)
--Succu (talk) 12:23, 10 February 2016 (UTC)

But not all these names are unique to UniProt. For example, Acer saccharum is a regular botanical name, and Pantoea agglomerans pv. gypsophilae appears to be in fairly widespead use, as is Brucella abortus biovar 1 (strain 9-941). - Brya (talk) 17:32, 10 February 2016 (UTC)

My thought was that jjkoehorst wants to integrate these names somehow. If the speclist is important for the planned bot's job I can provide some statistics. --Succu (talk) 18:36, 10 February 2016 (UTC)
Eventually I would like to create a most comprehensible but still useful taxonomy resource where people can easily search for organisms and their phenotypic characteristics. Also that when a new strain is sequenced its information can easily be integrated into WD according to a defined data model. However for this a solid ground needs to be established first and that is what I was thinking of. In general the primary identifier is the NCBI Taxonomic number, which can be completed with information from NCBI scientific names and UniProt scientific / other names. If for obvious reasons this would introduce too many errors or is not according to the idea of how we should define a strain, then this is perfectly fine to me. What was driving me from the beginning is that I want to connect phenotypic information from multiple resources to taxonomic identifiers and corresponding genetic makeup. I of course can do this on my own machine on my own little project and this would work out fine, but no one else could benefit from this and that's why I started working on the idea of this phenobot (hence the name...). In the discussion of the bot as mentioned by Succu I am expanding this idea further with possible phenotypic characteristics that I can get my hands on and could theoretically be integrated into WD, but I am still writing on this at User:Phenobot/Discussion. --jjkoehorst (talk) 21:04, 10 February 2016 (UTC)
As an example these are statements that would be interesting to add. Not all have properties and I am preparing for that.
Property | Description | Datatype | Expected value
P1604 | biosafety level | Item | Level 1 Q18396533; Level 2 Q18396535; Level 3 Q18396538; Level 4 ... see Q21079489
P2043 | length / size | string | 902320 bp Q21481789
P??? | GC content | float |
P??? | Gram staining | item | Gram positive Q857288; Gram negative Q632006
P??? | Pathogenic to | item | Human, Plant, Animal, etc...
P??? | Motility | item | Chemotactic (Chemotaxis) Q658145; Motile Q3359; Nonmotile (not yet found)
P??? | Environment | item or string | soil, seawater, marine sediment, forest soil, etc...
P??? | Temperature range | item | Hyperthermophile Q1784119; Mesophile Q669652; Psychrophile Q913343; Thermophile Q834023
P2076 | Temperature (optimal temperature) | | Q21079489

--jjkoehorst (talk) 09:11, 11 February 2016 (UTC)

If all that is to be included in an item, it becomes understandable that Succu would like a UniProt name, and (presumably?) a separate item for each such UniProt entity. - Brya (talk) 17:26, 11 February 2016 (UTC)
If I understand you correctly you mean to store the Biosafety/Gram/Temp/etc.. in a UniProt item? These are generic features from different sources (DSMZ/GOLD/etc) and are linked via the NCBI Taxonomy ID and in that case would not make sense to store these items under a uniprot name entry. --jjkoehorst (talk) 19:46, 11 February 2016 (UTC)

Back to the roots

Oppose: Back to the roots. „Code“ is protected. I see no reactions on error reports. The task is obscure. jjkoehorst, please rollback your bot's contributions. --Succu (talk) 22:32, 11 February 2016 (UTC)

Code is unlocked and all revisions have been rolled back. Please let's continue discussing what kind of shape would be acceptable for phenotypic information. --jjkoehorst (talk) 06:51, 18 February 2016 (UTC)

I think there is great value in elements of what is proposed and it would make the microbial data on Wikidata a much richer resource. Metadata such as biosafety level, Gram -/+ etc. would be very useful, but UniProt may not be the best source for taxonomy identifiers and names. I think it would benefit this proposal to have a clear picture of what the scope of the project would be, and a clear definition of each bot task. Putmantime (talk) 23:16, 11 February 2016 (UTC)

Putmantime, mind to help? --Succu (talk) 23:21, 11 February 2016 (UTC)
Succu Yes definitely...can we keep the discussion going on this proposal? I think it has merit, but needs to be clearer. The naming issue for subspecies items seems to have thrown a wrench in things. I think NCBI is a good authority for strain names personally, because the name was submitted by the researcher that submitted sequence data to NCBI, and that is when the NCBI Taxonomy ID was generated as well as genome IDS. Not a scientific name though or consistently formatted. I view it as an appropriate label, and maybe a new 'strain name' property, but see it shouldn't be a taxon name. Any synonyms could be aliases, IMHO Putmantime (talk) 23:34, 11 February 2016 (UTC)
I am in the process of rolling back the changes made by the bot. I think the focus of the conversation has shifted towards the naming issues, which still exist and need to be discussed thoroughly. Currently existing names will not be modified by the bot and its main focus is on the metadata that is available at various resources through the NCBI taxonomic identifier, which will not interfere with current information. I know that I initially started about the naming but the main focus is on the metadata. Hopefully we can keep the discussion going on the naming scheme and microbial metadata to come to a good agreement to improve the quality of information in Wikidata. --jjkoehorst (talk) 17:36, 12 February 2016 (UTC)
In the NCBI Taxonomy strains have no rank. We should find a consensus that stating taxon rank (P105)=strain (Q855769) is OK. Otherwise we can use instance of (P31)=strain (Q855769) with taxon rank (P105)=novalue. --Succu (talk) 18:51, 12 February 2016 (UTC) E.g. Shigella flexneri 2a str. 301 (Q21102941), Putmantime. --Succu (talk) 22:13, 12 February 2016 (UTC)
There are similar cases elsewhere: "virus" as a subspecific entity is not regulated by a Code of nomenclature. This goes also for "forma specialis", "pathovar", etc. We should have a structure for this. - Brya (talk) 06:17, 13 February 2016 (UTC)
Yes we should. If I remember right f.sp. is used by IF and MycoBank as a rank. Strongly related to this bot's task is the question of Candidatus (Q857968). --Succu (talk) 19:18, 13 February 2016 (UTC)
Yes, forma specialis is used by IF and MycoBank as a rank, but that does not make it a rank. And, yes, "Candidatus" is a similar problem case. - Brya (talk) 09:55, 14 February 2016 (UTC)

Hkn-bot

Hkn-bot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: HakanIST (talk · contribs · logs)

Task/s: clean up invalid authority property links in person items, harvest dates of birth from articles

Code: based on addwiki framework (php) currently being written

Function details: The bot will periodically check whether the IDs specified in P2458, P2446, P2447, P2448 and P2449 are valid by visiting the generated URL; if an ID cannot be validated, it will remove the claim and report it in a list. Errors often occur due to invalid data at the source wiki or a mismatch of a footballer ID with a manager ID. Secondly, the bot will harvest birth dates for items with these properties from the imported wikis. Using variations of this wdq generated list, it will add the date of birth property if there is none.

-- Hakan·IST 18:50, 16 January 2016 (UTC)
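The validation step described in the function details can be sketched roughly as follows. This is not the bot's addwiki/PHP code; it is a Python illustration under stated assumptions: the URL template is a made-up placeholder (the real formatter URLs are defined on each property page), `find_invalid` is a hypothetical helper, and treating only an HTTP 404 as "invalid" is a design choice of this sketch so that temporary outages do not lead to removals.

```python
import urllib.error
import urllib.request

# Hypothetical URL template, for illustration only.
URL_TEMPLATES = {"P2446": "https://example.org/player/{id}"}


def http_status(url, timeout=10):
    """Return the HTTP status code for `url`.

    Network failures (urllib.error.URLError) are deliberately not caught:
    an unreachable site says nothing about whether an ID is valid.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def find_invalid(claims, fetch_status=http_status):
    """Return the (item, property, external ID) claims whose URL gives 404."""
    invalid = []
    for qid, prop, ext_id in claims:
        url = URL_TEMPLATES[prop].format(id=ext_id)
        if fetch_status(url) == 404:
            invalid.append((qid, prop, ext_id))
    return invalid
```

Passing a fake `fetch_status` function makes the check testable without network access; the returned list is what would be reported before any claim is removed.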

Please make some test edits.--Ymblanter (talk) 08:18, 20 January 2016 (UTC)
@Ymblanter: Ran the bot for the second task (harvesting date of birth from article) for 20 items, throttled to 10 seconds per change. Hkn-bot contribs.-- Hakan·IST 15:54, 20 January 2016 (UTC)
I see that e.g. here you added data but they are unreferenced. Is there any way to add a reference as well?--Ymblanter (talk) 16:36, 20 January 2016 (UTC)
I've been working on adding references for sometime now, but have not got it to work yet.-- Hakan·IST 21:30, 20 January 2016 (UTC)
Your bot added a date of birth of "1 October 1987" for Q3801812, despite the itwiki page stating "10 gennaio 1987" (10 January 1987). I'm afraid a similar issue occurred on a few other items, for instance Q5889913 or Q6771150. --Alphos (talk) 22:37, 20 January 2016 (UTC)
These errors were caused by wrong Transfermarkt footballer ID (P2446) statements, so the bot is not malfunctioning. It would be however really good if you could add references. This helps, amongst others, to spot the error source more easily. --Pasleim (talk) 20:02, 15 January 2017 (UTC)

RollBot

RollBot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Alphos (talk · contribs · logs)

Task/s: Revert all edits made by users (crucial because of Quick Statements) or bots gone awry, once they've been temporarily blocked or stopped

Code: Github The bot is written in PHP, and I know it's not ideal, but it's a language I'm very familiar with.

Function details:

Whenever any user (including but not limited to bots) edits a big number of pages or entities in a short amount of time, and makes a mistake over every single one of them, reverting all pages to their former state is going to be rapidly mind-numbing for a human. We therefore need a bot to perform such an action.

A plain rollback is not a viable way of reverting for at least two reasons:

  • it will erase valid edits by the same user prior to going awry;
  • it won't erase edits if any other user has edited the page or entity afterwards.

RollBot takes the first "wrong" edit as a parameter, and reverts all edits made by the same user since the time of that first "wrong" edit. The interface currently only resides on my computer - it is not possible to automatically start RollBot on a task : this is a security feature to avoid using it to vandalize legitimate edits.

It first lists all pages/entities edited by the target. Then for each of them, gets the content of the version immediately before first edit on that page by the target since it started going awry, and takes it as the version to revert to, should the page be reverted. Then establishes a list of all contributors since that version, to make sure no other users edited that page.

It can merely list the pages it couldn't revert without overwriting edits by users other than the target, OR overwrite said edits. This is decided on a per-request basis.
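As a rough illustration of the selection logic described above (not RollBot's actual PHP code), the following Python sketch picks, for one page, the revision to revert to and the other contributors since then. The `Revision` class and `plan_revert` function are hypothetical names, and the history is assumed to be available oldest first with ISO 8601 UTC timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    rev_id: int
    user: str
    timestamp: str  # ISO 8601 UTC; same-format strings sort chronologically


def plan_revert(history, target, start):
    """Decide what to do with one page.

    history: revisions of the page, oldest first.
    target:  the user whose edits are to be undone.
    start:   timestamp of the target's first "wrong" edit.

    Returns None if the target did not edit the page since `start`,
    otherwise (good_revision, other_editors), where good_revision is the
    revision immediately before the target's first edit since `start`
    (None if the target created the page) and other_editors lists
    everyone else who edited from that point on.
    """
    first_bad = next(
        (i for i, rev in enumerate(history)
         if rev.user == target and rev.timestamp >= start),
        None,
    )
    if first_bad is None:
        return None
    good = history[first_bad - 1] if first_bad > 0 else None
    others = sorted({rev.user for rev in history[first_bad:] if rev.user != target})
    return good, others
```

Edits by the target before `start` are kept because they precede the chosen revision. A non-empty `other_editors` list corresponds to the case where the page is either only listed or overwritten, depending on the request.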

After completing its task, it publishes a complete report page in its userspace, with the name of the target and start time in the title.

That report holds a list of pages/entities it edited, a list of pages/entities requiring human check (all revisions by the target were deleted; page was edited by another user; page was edit-blocked; those are the most likely explanations), and a list of pages/entities bound for deletion (created by the target after going awry - deleting could be performed by RollBot, but I'd rather make sure it has community acceptance for its base function before requesting sysop rights). You can see an example of the report here, that was created by the bot in its current version.

Whether or not RollBot finally acquires sysop rights, I see it becoming a very useful tool for the admins' noticeboard.

Wikidata is the first WMF wiki where I implement it (although it technically has the ability to edit other wikis) because of the relatively limited userbase, and the relatively high edit/minute any user (and not just bots) can reach using external tools like Quick Statements. I do plan on running it on other wikis once it has proven its worth.

--Alphos (talk) 17:01, 14 January 2016 (UTC)

First series of test edits was short, and allowed me to spot one bug and one issue to address that cannot be considered a RollBot bug - I'll explain after properly investigating it, it seems some constraint was not met with the sitelinks in the "good version".
  • See this section of the admin's noticeboard.
  • RollBot successfully reverted all pages to their former state after being asked to do so - which resulted in a bunch of null edits, since they all already had been reverted to their former state.
  • Most of them except one were already back to their former state, so RollBot essentially performed a null edit. As expected, it successfully listed all pages and the editors (of a hardcoded for now limit of 1 + 5 who will have their nickname or IP address) it "reverted" - or intended to -, with the "first bad" revision, the "previous good" revision and its author.
Alphos (talk) 21:03, 18 January 2016 (UTC)
There are two very similar issues with :
addshore (IRC : Freenode / #wikidata) helped me greatly in isolating the two issues (well, he pretty much solved the whole thing by himself ^^' ). He also suggested (but this is unrelated to RollBot) we find a way to list all similar issues on Wikidata.
I don't know what happened with Q12189183, due to a STUPID mistake on my part : I overwrote my error log instead of appending to it. That is now fixed. I cannot reproduce the error, but, should another one ever arise, we'll definitely know about it ! : all errors occurring when attempting to edit will from now on be visible in RollBot's reports.
Alphos (talk) 23:09, 18 January 2016 (UTC)
On a suggestion by GZWDer, I implemented an optional end timestamp parameter.
After removing a tiny kink that I thought I never added, I started RollBot again (Report).
Despite having a minimum of 5 seconds of doing nothing after an edit (thus a very strict minimum of 5 seconds between edits, usually more), it sadly got throttled for quite a few attempted edits. I take comfort in the fact it successfully listed those failed attempts in the "Pages requiring human check" section, with a default explanation.
The real API error messages are in a file on my machine, and, although there is a lot of throttled edits, there is also a fair amount of failed saves due to wikilink conflict. It's usually because the page linked (in the entity RollBot is editing) is a redirect to a page linked in another entity.
I'll post a list of those in RollBot's report.
Alphos (talk) 18:29, 19 January 2016 (UTC)
Found 2 of those conflicting redirecting wikilinks :
The API simply prevents the bot from editing in case of such conflicts, there is no workaround that I know of. Good thing such conflicts are listed in the "Pages requiring human check" section of the bot's reports.
Alphos (talk) 19:08, 19 January 2016 (UTC)
What is the current situation with the bot? Is it ready for approval?--Ymblanter (talk) 21:05, 28 January 2016 (UTC)
I've been facing a heavy bout of a fairly serious medical condition these past few days, it's a bit on hold.
It's given me the idea to give a group of users (most likely admins/sysops/whatchamacallits) the ability to trigger the bot when I'm in that state which may (and probably will) reoccur, but I'm sad to say dev is a bit on hold for the next few days - hopefully not more than a week (pleeeeaaaaase, I can't take this much longer!).
However, if I'm not mistaken, the bot in its current state is functioning as it should - if you don't count disability of its operator.
Alphos (talk) 11:34, 4 February 2016 (UTC)

@Alphos: In addition to the bot flag, which rights would this bot need? If administrator access is needed, then not only would you need a request for administrator access for the bot, but you yourself would have to run for adminship first.--Jasper Deng (talk) 09:28, 27 March 2016 (UTC)

Sorry for the late reply, been suffering from Q166907 for the past few months, which made programming a bit difficult - although I luckily did have the energy to make RollBot during an easier week -, and my treatment needed a few tweaks in the past few weeks - I'm not there yet, but hopefully I should soon be able to resume work on the bot and other projects !
I'm not aiming for bot adminship yet. The bot needs an extensive period of testing - it seems to behave adequately for now, barring the unconfirmed user limits it inadvertently hit, but I'm a prudent person, and I'd rather be damn sure it won't add work for the current human admins by performing unneeded sysop actions.
If however, I'm satisfied (and I'm a hard person to satisfy) with its ability to perform the tasks in their entirety, including deleting items based on the conditions given to it, I plan indeed on requesting sysop rights. If that means I need to be an admin first, I'd apply as well of course.
Thanks a lot for your consideration.
Alphos (talk) 13:40, 6 April 2016 (UTC)
@Alphos: Any news on this? I think it's advisable to first only perform rollbacks and then later add the deleting abilities which require sysop rights. --Pasleim (talk) 10:11, 17 January 2017 (UTC)
Not much news, based in part on my condition but also on the lack of a testbed, quite possibly because nobody thinks of it.
As for the deletion rights, I already made clear I don't intend for RollBot to have them immediately, as I don't believe I'm infallible and it's probably working as intended but I'd much rather be sure first.
I do intend to request admin rights for myself first, and that can't happen currently because I'm not yet able to fulfill that role on Wikidata. With a bit of luck and a bigger dose of treatment come February, I'll advise on that matter.
In the mean time, if you find a testbed for RollBot, as in a set of contributions that needs batch revoking, please inform me soon as you can, here and/or on IRC ^^ I'm still trying to find a way for other people to control RollBot without leaving a possibility for abuse, so for now I'm still in exclusive control of the bot.
Alphos (talk) 13:57, 19 January 2017 (UTC)

Dexbot

Dexbot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Ladsgroup (talk · contribs · logs)

Task/s: Auto-transliterating for names of humans

Code: Based on pwb, probably publish it soon.

Function details: The code analyses dumps of Wikidata and can create an auto-transliterating system for any given pair of languages based on that. I started with Persian and Hebrew (some edits for test [12] [13]) --Amir (talk) 18:14, 7 April 2015 (UTC)

  • Comment, please let me know when you try your system for some Cyrillic language. I'd like to see it myself. --Infovarius (talk) 14:10, 8 April 2015 (UTC)
@Infovarius: I work in pair of languages like fa and he (which the bot adds Persian transliteration based on Hebrew and vice versa) which pair of language do you suggest? en and ru? Amir (talk) 11:54, 9 April 2015 (UTC)
Probably you should have stated this in your request. Your phrase "I started with" has encouraged me :) No, I don't suggest Russian as I understand the complexity of the task. --Infovarius (talk) 13:16, 10 April 2015 (UTC)
@Infovarius: I don't think Russian is too complicated to abandon. I took care of lots of different issues including country of citizenship, etc. so It's not hard for this bot. I asked you what language do think is the best pair for Russian *to start with* Amir (talk) 21:11, 10 April 2015 (UTC)
Will the bot be able to detect delicate labels as in King An of Han (Q387311)? --Pasleim (talk) 19:24, 13 April 2015 (UTC)
It probably skips them or make a correct transliteration (depends on the language) but I can't say for sure. Let me test Amir (talk) 13:33, 15 April 2015 (UTC)
Are we ready for approval here?--Ymblanter (talk) 16:08, 15 April 2015 (UTC)
  • Just a caveat when dealing with Chinese languages: Chinese to Latin script (and vice versa) transliterations are rarely standardized. For example, Alan Turing's given name might be transliterated into 艾伦 or 阿兰 (as in the case of Alan Moore (Q205739)) or 亚伦 (as in the case of Alan Arkin (Q108283)). These Chinese characters roughly resemble "Alan" when pronounced, but due to regional differences (i.e. mainland China, Taiwan, Hong Kong, etc), they result in different transliterations. Even when two people's names are transliterated by the same region, they can be different. There is simply no standardization on this matter. —Wylve (talk) 14:53, 23 April 2015 (UTC)
    hmm, User:Wylve: Just a question: Is it wrong to put "亚伦" for Alan in Alan Turing? Amir (talk) 12:36, 25 April 2015 (UTC)
    It's not wrong, but it might not be the only way people call Alan Turing in Chinese. The lead sentence of Turing's article on zhwiki mentions that "Alan" is also transliterated as 阿兰. —Wylve (talk) 20:48, 25 April 2015 (UTC)
    @Wylve: I made 50 auto-transliterations [14], please check and say if anything is wrong or unusual. Thanks Amir (talk) 20:05, 16 May 2015 (UTC)
    I can't verify every name, since some of those people aren't mentioned in Chinese news sources. My standard of what is "wrong" or "unusual" is whether the transliterations you've produced are used predominantly in reliable and reputable sources. It is hard to judge sometimes, as there is a variety of transliterations used. For instance:
  • Jonathan Ross is transliterated as 强纳·森罗斯 and also 喬納森·羅斯
  • Leonard B. Jordan is also transliterated as 萊昂納德·B·喬丹
  • Jimmy Bennett is also transliterated as 吉米·本内特, 吉米班奈, 吉米班奈特.
  • Jason Lee is also named 杰森·李.
  • "Scott" from A. O. Scott is also transliterated as 史考特.
  • All of your edits should be fine if read in Chinese, as they all sound like their English name. Also, I have found this page ([15]), which documents Xinhua News Agency (Q204839)'s official transliterations of names. These transliterations are considered official only in Mainland China. —Wylve (talk) 21:58, 16 May 2015 (UTC)
    Amir: Sounds cool. Regarding the he-fa pair
    Tagging Amire80 and Eldad, who may have further advice. Eran (talk) 18:53, 4 January 2017 (UTC)

@Ladsgroup, Wylve: Does this look okay for an approval, or is there something we're missing? I don't speak (or read, for that matter) Chinese. Hazard SJ 05:40, 28 December 2015 (UTC)

Well, the last time people talked on this page was a year and a half ago. I need to find the script and check it. I'll do it soon. Amir (talk) 19:12, 5 January 2017 (UTC)

KunMilanoRobot

KunMilanoRobot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Kvardek du (talk · contribs · logs)

Task/s:

  • Add French 'intercommunalités' to French commune items (example)
  • Add the population of French communes
  • Correct the Insee codes of French communes

Code:

Function details: Takes the name of the 'communauté de communes' in the Insee base and adds it if necessary to the item, with point in time and source. Uses pywikipedia. --Kvardek du (talk) 19:27, 21 January 2014 (UTC)

Imo the point in time qualifier isn't valid here, as the property isn't time-specific. -- Bene* talk 15:10, 22 January 2014 (UTC)
Property:P585 says "time and date something took place, existed or a statement was true", and we only know the data was true on January 1st, due to numerous changes in French organization. Kvardek du (talk) 12:18, 24 January 2014 (UTC)
Interesting, some comments:
  • Not sure that "intercommunalités" are really administrative divisions (they are built from the bottom rather than from the top). part of (P361) might be more appropriate than located in the administrative territorial entity (P131)
  • Populations are clearly needed, but I think we should try to do it well from the start, and that is not easy. That seems to require a separate discussion.
  • INSEE code correction seems to be fine.
  • Ideally, the date qualifiers to be used for intercommunalité membership would be start time (P580) and end time (P582) but I can't find any usable file providing this for the whole country. --Zolo (talk) 06:37, 2 February 2014 (UTC)
Kvardek du: can you add « canton » and « pays » too? (Cantons are a bit complicated, since some cantons contain only fractions of communes.)
Regards, VIGNERON (talk) 14:01, 4 February 2014 (UTC)
Wikipedia is not very precise about administrative divisions (w:fr:Administration territoriale). Where are the limits between part of (P361), located on terrain feature (P706) and located in the administrative territorial entity (P131) ?
Where is the appropriate place for a discussion about population ?
VIGNERON : I corrected Insee codes, except for the islands : the same problem exists on around 50 articles due to confusion between articles and communes on some Wikipedias (I think).
Kvardek du (talk) 22:26, 7 February 2014 (UTC)
@Bene*, Vogone, Legoktm, Ymblanter, The Anonymouse: Any 'crat to comment?--GZWDer (talk) 14:37, 25 February 2014 (UTC)
I'm still not familiar with the "point in time" qualifier. What about "start date", since you mentioned the system changed at the beginning of this year? Otherwise it might be understood as "this is only true/happened on" some date. -- Bene* talk 21:04, 25 February 2014 (UTC)
Property retrieved (P813) is for the date the information was accessed and is used as part of a source reference. point in time (P585) is for something that happened at one instance. It is not appropriate for these entities which endure over a period of time. Use start time (P580) and end time (P582) if you know the start and end dates. Filceolaire (talk) 21:19, 25 March 2014 (UTC)

Support if the bot user uses start time (P580) and end time (P582) instead of point in time (P585) --Pasleim (talk) 16:48, 28 September 2014 (UTC)

@Kvardek du: Do you still plan to run the bot? If so, could you please do some test edits again, using start time (P580) and end time (P582) instead of point in time (P585)? --Pasleim (talk) 07:52, 24 May 2015 (UTC)
@Pasleim: it's planned, but not for the moment... The problem I have with French data is that you only have the membership at a moment t, and not with a start time (P580). Kvardek du (talk) 13:20, 25 May 2015 (UTC)
Kvardek du then use retrieved (P813) in the reference and leave out start time (P580) and point in time (P585). Joe Filceolaire (talk) 08:33, 23 July 2015 (UTC)
Filceolaire : yeah but I have a retrieved (P813) t2 which is different from my point in time (P585)... Kvardek du (talk) 15:47, 24 July 2015 (UTC)
If you don't know the 'start time' then leave it out. If you want then you can create a separate item for the document that the data comes from and add the point in time statement to that item then reference the item for that document in the references for the 'located in ... entity' statements. Look on it as the 'point in time' date relates to the info in the document (true on that date).
Note that population figures should have a 'point in time' qualifier to say when that population figure applies since the population figure is not true for a period; it is only true for the day it was measured. Joe Filceolaire (talk) 00:55, 25 July 2015 (UTC)
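The qualifier advice above can be sketched as data. This is a minimal illustration, assuming the standard Wikibase JSON layout for statements: the membership statement carries only a retrieved (P813) date in its reference, while the population statement carries a point in time (P585) qualifier. The helper names, the item ID Q999999, the population figure and the dates are placeholders, and P1082 (population) is an assumption not named in the discussion. Nothing here is Kvardek du's actual bot code.

```python
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"  # proleptic Gregorian calendar


def time_snak(prop, iso_date):
    """Day-precision time snak; iso_date looks like '2014-01-01'."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "time",
            "value": {
                "time": f"+{iso_date}T00:00:00Z",
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 11,  # 11 means day precision
                "calendarmodel": GREGORIAN,
            },
        },
    }


def item_snak(prop, qid):
    """Snak whose value is another item, e.g. an intercommunalité."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid[1:]), "id": qid},
        },
    }


def quantity_snak(prop, amount):
    """Unitless quantity snak; assumes a non-negative integer amount."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"type": "quantity", "value": {"amount": f"+{amount}", "unit": "1"}},
    }


def group(snaks):
    """Wikibase groups qualifier and reference snaks by property ID."""
    grouped = {}
    for snak in snaks:
        grouped.setdefault(snak["property"], []).append(snak)
    return grouped


def statement(mainsnak, qualifiers=(), reference_snaks=()):
    claim = {"type": "statement", "rank": "normal", "mainsnak": mainsnak}
    if qualifiers:
        claim["qualifiers"] = group(qualifiers)
    if reference_snaks:
        claim["references"] = [{"snaks": group(reference_snaks)}]
    return claim


# Membership: start time unknown, so no date qualifier; only P813 in the reference.
membership = statement(
    item_snak("P131", "Q999999"),  # placeholder item ID
    reference_snaks=[time_snak("P813", "2015-07-24")],
)

# Population: true only for the day it was measured, so P585 is a qualifier.
population = statement(
    quantity_snak("P1082", 1234),  # placeholder figure
    qualifiers=[time_snak("P585", "2014-01-01")],
    reference_snaks=[time_snak("P813", "2015-07-24")],
)
```

Whether P131 or part of (P361) is the right main property is still open in the discussion; only the first argument to `item_snak` would change.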

AviBot

AviBot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Offthewoll (talk · contribs · logs)

Task/s: Retrieve information about universities on Wikidata.

Code: Can provide upon request.

Function details: Retrieve information about universities on Wikidata. This bot is read-only and makes no edits. It uses a small Python script I've written to get a list of entities from the WDQ API and then fetches information about each one with wbgetentities via the Wikidata API. --Offthewoll (talk) 21:29, 17 May 2016 (UTC)
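As an illustration of the read-only workflow described in this request, the sketch below builds wbgetentities request URLs in batches. It is not Offthewoll's script. The ID list is a placeholder standing in for the result of the WDQ query, the function names are hypothetical, and the batch size of 50 assumes the usual per-request limit for accounts without higher API limits. No network request is made.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # assumed per-request id limit without higher API limits


def batches(ids, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def wbgetentities_url(ids, props=("labels", "claims"), languages=("en",)):
    """Build one wbgetentities URL for a batch of entity ids."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "|".join(props),
        "languages": "|".join(languages),
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Placeholder ids; a real run would take these from the entity-list query.
university_ids = [f"Q{n}" for n in range(1, 121)]
urls = [wbgetentities_url(batch) for batch in batches(university_ids)]
```

Each URL can then be fetched with any HTTP client. Batching keeps the number of requests low, which matters even for a bot that only reads.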

fabot

fabot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Fabris Los (talk · contribs · logs)

Task/s: simple python test

Code:

Function details: --Fabris Los (talk) 20:14, 4 August 2016 (UTC)

@Fabris Los: Could you please clarify what you want to do with your bot? Thanks, --Vogone (talk) 19:52, 12 February 2017 (UTC)

florentyna

florentyna (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Florentyna (talk · contribs · logs)

Task/s: See Wikidata:Bureaucrats'_noticeboard. I hereby request a bot flag to reduce the oversight work for the admins here on Wikidata in the Recent Changes section.

Code:

Function details: --Florentyna (talk) 19:22, 13 November 2016 (UTC)

@Florentyna: Could you please clarify your request? What exactly is your bot going to do? --Vogone (talk) 19:51, 12 February 2017 (UTC)

InteliBOT

InteliBOT (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Miguel2706 (talk · contribs · logs)

Task/s: Move categories in eswiki

Code: Little modification of pywikibot.

Function details: Example 1 Example 2 --Miguel2706 (talk) 20:13, 16 January 2015 (UTC)

@Miguel2706: Could you please indicate whether bot permissions are still needed? Thank you. --Vogone (talk) 19:49, 12 February 2017 (UTC)

mahirbot

mahirbot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Mdmahir (talk · contribs · logs)

Task/s:

  1. Update the English and Tamil descriptions of Tamil films (5000+ items).
  2. Update the Wikimedia category for Tamil films (5000+ items).


Code: http://tools.wmflabs.org/wikidata-todo/quick_statements.php

Data source: wikidataquery


Function details:

  • Description(English): Tamil Film (2014)
  • Description(Tamil): தமிழ்த் திரைப்படம் (2014)

Note: 2014 is the production year of the film.

Because this affects 5000+ items, I prefer to use a bot account with community consensus. Thanks --Mdmahir (talk) 04:22, 25 February 2016 (UTC)
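For illustration, the description pattern in this request could be turned into QuickStatements input roughly as follows. This is a sketch, not Mdmahir's actual workflow. It assumes the tab-separated QuickStatements syntax in which `Den` and `Dta` set the English and Tamil descriptions. The item IDs and years are placeholders, and the function name is hypothetical.

```python
def description_commands(qid, year):
    """Return the two description commands for one film item."""
    return [
        f'{qid}\tDen\t"Tamil Film ({year})"',
        f'{qid}\tDta\t"தமிழ்த் திரைப்படம் ({year})"',
    ]


# Placeholder (item ID, production year) pairs; a real run would take
# these from the wikidataquery data source named in the request.
films = [("Q1111111", 2014), ("Q2222222", 2009)]

commands = [line for qid, year in films for line in description_commands(qid, year)]
batch_text = "\n".join(commands)
```

Pasting `batch_text` into the tool for a handful of items first makes it easy to review the resulting descriptions before running all 5000+.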

pywikibot

pywikibot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Corinaflorescu (talk · contribs · logs)

Task/s:

Code:

Function details: --Corinaflorescu (talk) 17:45, 11 June 2016 (UTC)

@Corinaflorescu: Could you please clarify what you want to do with your bot? --Vogone (talk) 19:48, 12 February 2017 (UTC)

SaamDataImportBot

SaamDataImportBot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Rkbrasse (talk · contribs · logs)

Task/s: Import Smithsonian American Art Museum related data, including video, publications, exhibitions ...

Code: In the works

Function details: The first set of data imported will be about videos related to exhibitions and artists that we have published on YouTube. We will branch out to general museum data, exhibition data, object data and publication data if not already present. I will be more specific once the function is more nailed down and tested.

Video data mapping

  • We need a new Entity Type called Online Video that will contain the following properties
  • Title maps to P1476
  • url maps to streaming media
  • thumbnail needs to map to an image property

--Rkbrasse (talk) 18:45, 20 April 2016 (UTC)

welvon-bot

welvon-bot (talk · contribs · SUL · Block log · User rights log · User rights)
Operator: Welvon-bot (talk · contribs · logs)

Task/s: Add properties to a Wikidata item by mining the text of the Wikipedia articles that belong to the item.

Code: Not implemented yet!

Function details:

  1. Scan the first and/or second paragraph of the Wikipedia article, which usually defines the subject.
  2. Feed the scanned text to the model, which analyses it.
  3. The model's output should be the properties of the Wikidata item.
  4. Update the properties of the Wikidata item using the API.
  5. Restart from step 1.

--Welvon-bot (talk) 08:56, 1 May 2016 (UTC)
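Only the first step of this request is concrete enough to sketch. The example below assumes the article text is already available as plain text with paragraphs separated by blank lines. The function name and the sample text are hypothetical. The model and the API update are left out because the request does not specify them.

```python
import re


def lead_paragraphs(article_text, count=2):
    """Return the first `count` non-empty paragraphs, whitespace-normalized."""
    blocks = re.split(r"\n\s*\n", article_text)
    paragraphs = [" ".join(block.split()) for block in blocks]
    return [p for p in paragraphs if p][:count]


sample = """Alan Turing was an English mathematician
and computer scientist.

He worked at Bletchley Park.

Later sections follow here."""

lead = lead_paragraphs(sample)
```

Here `lead` holds the two opening paragraphs, which would be the input to whatever model the operator chooses.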