User talk:Andrawaag

From Wikidata

Welcome to Wikidata, Andrawaag!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, don't hesitate to ask on Project chat. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards! --Tobias1984 (talk) 18:10, 1 August 2014 (UTC)


I think your bot is editing while logged out. I'm going to block the IP until your bot is logged back in. See here. --AmaryllisGardener talk 14:03, 11 October 2014 (UTC)

~1000 duplicates

Hello, it looks like your bot created ~1000 duplicate items yesterday. Please see Wikidata:Database reports/Constraint violations/P351#"Unique value" violations. -- Ivan A. Krestinin (talk) 17:34, 16 October 2014 (UTC)

@Ivan A. Krestinin: Thanks for noting. I will work towards a fix. Andrawaag (talk) 19:56, 16 October 2014 (UTC)
Hey Andrawaag. You could use WikidataQuery to check whether an item with a certain claim already exists. For example, [1] returns all items with the claim Entrez Gene ID (P351)=1017. However, this will probably slow down your bot significantly. Another option is to use [2] to fetch all items and their P351 values in a single call. --Pasleim (talk) 07:09, 18 October 2014 (UTC)
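The bulk-lookup option can be sketched in Python. This is only an illustration, not ProteinBoxBot's actual code: it assumes the (item, P351 value) pairs have already been fetched in a single call, and the names `build_index` and `plan_write` as well as the sample item IDs are invented for the example.

```python
from collections import defaultdict


def build_index(pairs):
    """Map each Entrez Gene ID (P351 value) to the set of items carrying it."""
    index = defaultdict(set)
    for item_id, entrez_id in pairs:
        index[str(entrez_id)].add(item_id)
    return index


def plan_write(index, entrez_id):
    """Decide what to do for one source record, without any extra network call."""
    items = index.get(str(entrez_id), set())
    if not items:
        return ("create", None)            # no item has this ID yet
    if len(items) == 1:
        return ("update", next(iter(items)))
    return ("conflict", sorted(items))     # duplicates already exist: do not add more


# Hypothetical sample data standing in for the result of the single bulk query.
sample_pairs = [("item-a", "1017"), ("item-b", "1018"), ("item-c", "1018")]
index = build_index(sample_pairs)
print(plan_write(index, "1017"))
print(plan_write(index, "1018"))
print(plan_write(index, "4242"))
```

Building the index once and consulting it per record avoids the per-item query that would slow the bot down, at the cost of the index going stale if other editors add P351 values during the run.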

Hi, your bot imported some values yesterday that created some violations. I tried a few of them with just the number (not the MTHU part), but they all still seem invalid, so I'm wondering whether you used the wrong property for this import? Jon Harald Søby (talk) 15:35, 12 February 2015 (UTC)

Please see the comment above. --- Jura 09:47, 9 May 2015 (UTC)

@Jura: Thanks for noticing. The duplicates were unfortunately created in a bot effort on May 5th. I responded to the comments by filing a bulk deletion request, but unfortunately that didn't go well. Since I have been travelling, I haven't been able to respond earlier. I am working towards a fix. Andrawaag (talk) 22:48, 10 May 2015 (UTC)

p2175 and p2176

medical condition treated (P2175) and drug or therapy used for treatment (P2176) are ready. Some discussion about labels and descriptions still needed. --Tobias1984 (talk) 12:29, 5 October 2015 (UTC)

ProteinBoxBot Mistake?

I think that this edit by ProteinBoxBot is a mistake because the formerly related article en:Huntingtin perfectly fits the description of the label. And the UniProt ID is also the same (P42858). --Sonabi (talk) 17:55, 8 October 2015 (UTC)

The bot is removing Wikipedia links from many items just to add them to another item, where the first item describes the protein and the second the related gene, like here and, for the item describing the gene, here. What is the sense of that? --Sonabi (talk) 22:58, 8 October 2015 (UTC)

@Sonabi: It is not a mistake. It is done on purpose and is motivated by the release of arbitrary access. The Wikidata model allows a Wikipedia page to be linked only once. This is limiting in the sense that a protein page can contain a lot of gene information. Because of this one-page limit, the appropriate Wikipedia page can't also be linked from the Wikidata item on that gene. However, with arbitrary access in place, gene information can be harvested from Wikidata items on genes to be used on Wikipedia pages on proteins. Moving the Wikipedia links from protein Wikidata items to the appropriate gene items is needed to start using arbitrary access in our workflows. Andrawaag (talk) 14:10, 9 October 2015 (UTC)
Yes, but the actual problem, which I forgot to mention, is that the bot is transferring the English entries but not the entries of the other languages. The result is that the connection between the related articles in the different languages is lost. Or will the bot also move these entries in the future? --Sonabi (talk) 14:53, 9 October 2015 (UTC)
@Sonabi: It's crucial for successful use of Wikidata content within gene articles on all the Wikipedias that we make use of a stable data structure. After quite some discussion on Wikidata 1 2 and on EN Wikipedia [3], the community decided on a model where the structured information that would typically be consolidated in a single textual article is distributed across multiple, interlinked Wikidata items corresponding to genes, proteins (or other gene products), and orthologous genes. To make use of that structure to build Wikipedia infoboxes, we need the interwiki links to originate in the gene. During an initial import of Wikipedia articles into Wikidata items, many of the gene pages were brought over and classified as protein articles. These edits are correcting that and improving the data. Note that nothing is being lost. All the data in the protein items that isn't more appropriate for the gene (e.g. the gene expression images) is still there and can be reached from the gene via the encodes property, including the labels in the other languages. We now have two test articles up in EN Wikipedia that use this structure to compose their infoboxes directly from Wikidata. See ARF6 and RREB1. It should be possible to re-use these patterns for any language Wikipedia; you would just need to set the interwiki link from the gene item to the correct page in that language's Wikipedia. We know what these should be in EN Wikipedia, but not necessarily in other language wikis. If you can help us identify the correct links in your languages, we could help with those mappings. --I9606 (talk) 16:10, 9 October 2015 (UTC)
@Sonabi: Just to emphasize that the connections are not lost. If a Wiki article is interwiki-linked to a protein data item and properties are moved from the protein item to a gene item, all of that information can still be retrieved by requesting it through the 'encoded by' connection between the protein and the gene. The conversion that Andrawaag is implementing is just adding content and giving it a better semantic structure that has been agreed upon by the Wikidata molecular biology community. --I9606 (talk) 16:28, 9 October 2015 (UTC)

RE: Proteins and genes should not be merged

Hello. I didn't intend to merge proteins with genes; I'm well aware of their differences. It happened that I was reading Tafazzina on and noticed that the same article exists on (Tafazzin), only unlinked, so I linked one with the other. Just compare the two links: these are the same protein, not the corresponding gene, called TAZ. So maybe the problem lies in the Wikidata information. But those are the same thing, that's for sure. Khruner (talk) 13:56, 17 November 2015 (UTC)
EDIT - For some reason the Italian article seems to encompass both the protein and the gene, so maybe the problem lies here. Khruner (talk) 14:02, 17 November 2015 (UTC)

Phenotype information

I am learning now how the pywikibot interface works. And then the first step that would seem interesting is to start adding taxon ids to most of the genomes as still many genomes do not have this information. How shall I begin with this? Make my own taxon script and dsmz script or should I merge this into the bot you have or are there other guidelines? I tried to find your email but was unsuccessful to that end.

--jjkoehorst (talk) 06:10, 8 January 2016 (UTC)

@Jjkoehorst: Our bot is based on a Python framework we call ProteinBoxBot. It is basically a two-layer framework consisting of a core layer which takes care of communicating with Wikidata and dealing with different Wikidata issues (e.g. duplicate resolution etc.). On top of this core layer, called PBB_Core, resource-specific scripts are developed and maintained. @Sebotic has written a nice blog that might get you started. Extensive documentation is maintained in our project's Bitbucket repository. However, having the bot written is only part of the solution. We typically follow this workflow with a new resource:
  1. Make sure the data license attached to the source allows distributing content on Wikidata (CC0)
  2. Check for existing records from your resource in Wikidata and make sure they are all correct and accurate
  3. Model 1 or 2 representative records from the resource under consideration
  4. During the modeling process it will become clear whether or not all needed properties exist in Wikidata. If not, you need to propose the required properties
  5. When all properties are in place either develop your bot or run your developed bot on your model items. These should not be broken by your bot.
  6. Run on 10 items
  7. Run on 100 items
  8. Run on 1000 items
  9. Once confident enough run on all.

I typically leave time in between the subsequent runs for possible issues to surface.
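The last steps of that workflow (10, then 100, then 1000 items, then everything) could be driven by a small helper like the one below. This is a sketch under assumptions, not part of PBB_Core: the name `staged_batches` is invented, and it assumes each stage processes new records rather than re-running the earlier ones.

```python
from itertools import islice


def staged_batches(records, stages=(10, 100, 1000)):
    """Yield (stage_size, batch) pairs of growing size, then the remainder.

    stage_size is None for the final "run on all" batch.
    """
    it = iter(records)
    for size in stages:
        batch = list(islice(it, size))
        if not batch:
            return
        yield size, batch
    rest = list(it)
    if rest:
        yield None, rest


# Usage: inspect the edits of each stage before continuing with the next one.
for stage, batch in staged_batches(range(1500)):
    label = "all remaining" if stage is None else f"first {stage}"
    print(f"{label}: {len(batch)} records")
    # A real bot would write the batch here, then pause for review.
```

Pausing between stages is the point of the design: a modeling mistake caught after 10 edits is far cheaper to revert than one caught after a full run.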

I am a bit hesitant to share my email address here, but if wanted you can reach me with a DM on twitter (handle: @andrawaag) --Andrawaag (talk) 16:42, 8 January 2016 (UTC)

bot deleting and altering data

Hello Andrawaag.

The bot User:ProteinBoxBot made an enormous number of changes recently, among them the following: edits with the tag "wikisyntax", where it deletes most of the data (for example [3]), and others following it, where it changes the type from enzyme to protein family (for example [4]), which doesn't seem to be correct. Can you halt and verify this activity? Hummingbird (talk) 23:18, 11 January 2016 (UTC)

It even undid one of my edits: -- numbermaniac (talk) 01:22, 12 January 2016 (UTC)
@Numbermaniac: Just to quickly replicate my reply also here: Sorry for that, I tried to move the interwiki link to a new item by undoing earlier changes, because the Wikipedia page deals with the enzyme class and not with the human specific type of this enzyme. Will create a new item manually.
@Hummingbird: Hi, sorry for the confusion, I would like to explain what is going on right now. In recent days, I took care to clean up the Wikidata data model for genes and diseases in order to align it with what was discussed in the Wikidata project molecular biology [5]. According to this discussion, interwiki links from Wikipedias should go to Wikidata gene items (subclass gene), and only if the topic is really only about the species-specific protein should the link go to the protein. This data model is also required to be able to populate the Gene Wiki info boxes with our new 'info box gene' module in English Wikipedia [6], which will build the info box entirely from data fetched from Wikidata. See also our preprint here, explaining details: [7].
What specifically happened in recent days:
  • I merged all orphaned items which were "found in taxon": 'human' and 'subclass': 'protein' but did not have identifiers except their label, into Wikidata human gene items. This affected ~ 2,800 items and it also unified ~800 interwiki links of different Wikipedia languages on the human gene items. I did these merges via script supported manual curation, so most of them should be correct now.
  • There were also ~350 protein items which had interwiki links on them linking to Wikipedia Gene Wiki pages. Unfortunately, some of these links also went to enzyme classes or protein families, not to Gene Wiki pages. Some of these got hijacked recently and transformed the enzyme class/protein family items into human protein items. In the first case, I moved and unified the interwiki links onto the Wikidata human gene items. In the second case, I reverted the changes made earlier to reestablish the enzyme class/protein family. This is what you saw as deleted information in your example, but in total not more than ~100 items were affected by these deletions. The deleted protein information will be re-added to Wikidata as new items in the coming days and linked to the human genes accordingly. This seemed to be the best solution to untangle the protein family/human protein problem. For the enzyme classes and protein families, I will go through all of them and add the enzyme classification numbers and other useful information, so these can be used as subclass of and instance of values on Wikidata species specific protein items like human or mouse. I did these merges also by script assisted manual curation, so this should be quite reliable.
  • Gene ontology term cleanup: I also did extensive Gene Ontology term cleanup to remove wrong terms from Wikidata human and mouse protein items. You can see that because almost all constraint violations for Gene Ontology terms now are cleared [8][9][10][11]. In the coming days, we will also do a fresh import of proteins directly from Uniprot, also involving Gene ontology terms.
In summary, most Gene Wiki Wikipedia pages should now link to their correct Wikidata human gene item. Most of the orphaned items, which would confuse users and do not make sense in the human gene/human protein and mouse gene/mouse protein data model described above, have now been unified and cleaned up. Except for the enzyme classes/protein families, no data has been deleted, and for those, we are about to re-add the data. I hope this gives you an overview of what I did; I am looking forward to your comments and suggestions. Best, Sebotic (talk) 07:34, 12 January 2016 (UTC)
Hello Sebotic. In the specific cases I had mentioned, it didn't make sense to me, so I just wanted to make sure that it wasn't a case of a bot that got out of control. As long as this is a planned action, I'm calm. Thanks for the reply. Hummingbird (talk) 10:19, 12 January 2016 (UTC)

Modification of items

Hello, I am transferring data from botulinum toxin group (Q208413) to botulinum toxin type A (Q4095199) because there are different types of botulinum toxin. botulinum toxin group (Q208413) will focus on the general features of all toxins (type A to G) and botulinum toxin type A (Q4095199) will focus on botulinum toxin type A. I don't know how this might affect your drug bot, so please take care later if you update the data. Thank you. Snipre (talk) 13:36, 19 January 2016 (UTC)

@Snipre: Hi! Thanks, that's an important cleanup step to perform. ProteinBoxBot will not touch any item which does not have at least one of a set of unique core identifiers (either Drugbank ID, ChEBI, ChEMBL, Pubchem, UNII), so in the Botox case, it would only touch item Q4095199. If the identifiers cannot be mapped reliably, no data will be written, but a conflict will be logged for manual inspection of the item. In case no item with the appropriate identifiers can be found, a new item will be created.
I guess the ChEMBL ID on the general botox item Q208413 should also be transferred to Botox A or deleted? I did a similar cleanup recently, separating the generalized topic of Vitamin B from the actual chemical compound Cyanocobalamin. What also seems to require a lot of cleanup is stereoisomeric compounds, e.g. for amino acids and sugars. Very frequently, I see Wikidata items containing a mix of identifiers for all 3 possible cases (e.g. the L-, D- and DL-mix forms). We will not get around manual cleanup here, I think. Best, Sebotic (talk) 20:10, 19 January 2016 (UTC)
Sebotic, I cleaned the general item about Botulinum Toxin, so you won't find any identifier for type A there. Snipre (talk) 20:43, 19 January 2016 (UTC)
By the way, can you have a look at these 2 items, neurotensin (Q419576) and NTS (Q14904891)? One is the gene and one is the protein, but they have the same PubChem ID 16129680. I am having trouble determining which is the correct molecule. Thanks. Snipre (talk) 21:01, 19 January 2016 (UTC)
@Snipre: I fixed that one. In addition to the fact that the PubChem ID was outdated, a PubChem ID should certainly not be on a Wikidata gene item. This info was added by KrBot, which seems to take data from Wikipedia info boxes and add it to Wikidata. I am not sure if these kinds of imports are really useful anymore for domains like genes/proteins or drugs where we do the imports from primary, authoritative databases. Thx! Sebotic (talk) 22:03, 19 January 2016 (UTC)
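The identifier-based matching described in this reply can be illustrated with a short Python sketch. It is not the bot's real implementation: the function name `resolve`, the lower-case identifier keys, the "skip" outcome for records without any core identifier, and the reading of "cannot be mapped reliably" as "identifiers point at more than one item" are all assumptions made for the example, and the sample values are invented.

```python
CORE_ID_TYPES = ("drugbank", "chebi", "chembl", "pubchem", "unii")


def resolve(record_ids, items):
    """Return ('write', item), ('create', None), ('conflict', [items]) or ('skip', None).

    record_ids: identifiers of one source record, e.g. {"drugbank": "..."}.
    items: mapping of item ID -> identifiers currently stored on that item.
    """
    usable = {t: v for t, v in record_ids.items() if t in CORE_ID_TYPES and v}
    if not usable:
        return ("skip", None)  # assumption: nothing reliable to match on
    matches = sorted(
        item_id
        for item_id, item_ids in items.items()
        if any(item_ids.get(t) == v for t, v in usable.items())
    )
    if not matches:
        return ("create", None)
    if len(matches) == 1:
        return ("write", matches[0])
    return ("conflict", matches)  # log for manual inspection, write nothing


# Invented sample: a general item without identifiers is never matched.
items = {
    "general-item": {},
    "type-a-item": {"drugbank": "DB-EXAMPLE"},
    "other-item": {"chebi": "CHEBI-EXAMPLE"},
}
print(resolve({"drugbank": "DB-EXAMPLE"}, items))
```

Labels and aliases are deliberately not consulted, so an item that carries none of the core identifiers can only be reached by a human editor.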

Update item Q19856779

Hello, I transferred some data from mitomycins (Q417625) to mitomycin (Q19856779) but without the reference. Please include that item in your next update session. I added only the identifiers used to extract the other ones from external databases. Snipre (talk) 20:23, 24 January 2016 (UTC)

Please check if these items can be merged

Hello, can you check whether your bot created duplicates in the following cases:

Thanks Snipre (talk) 22:08, 2 February 2016 (UTC)

@Snipre: Hi! The bot can create duplicate items on purpose under clearly defined circumstances. The bot searches for items which have a certain set of unique IDs (Drugbank, Pubchem, ChEMBL, CHEBI, KEGG, Inchi_key). If it does not find an existing item on that basis, it creates a new item. This is what happened here. Item ephedra (Q20817199) was created on 13th August 2015, but item ephedra (Q13530468) only received the Drugbank ID on 13th October, so before 13th October item ephedra (Q13530468) could not have been recognized as the appropriate item by my bot. The bot works this way because matching on labels or aliases can cause extensive problems by writing to the wrong item(s). For data consistency, it is better to create a new item and merge it later on than to produce Wikidata items with the wrong data on them. Should both items at some point have one of the IDs mentioned above, they will be detected by my bot. You can either merge these items or I will do it in a few days, so you have time to recapitulate how these duplicate items came into existence. Sebotic (talk) 23:06, 2 February 2016 (UTC)
@Sebotic: Ok, thanks for the explanation. For Ephedra we can keep both items separate, one for the drink (medicinal preparation) and one for the molecule. But I didn't find a clear definition of the molecule, which is why I am wondering whether the molecule really exists. For me, DrugBank is not clear about the difference between a mixture of molecules and a single molecule. Snipre (talk) 08:20, 3 February 2016 (UTC)


exact match (P2888) is ready. --Tobias1984 (talk) 19:28, 5 June 2016 (UTC)

Problems with ProteinBoxBot

Hello, Andrawaag. I've just described a problem with ProteinBoxBot on its talk page. (It's topic #28). I don't know how often you check that page, but perhaps you'd like to take a look. Thanks for your attention. Akhooha (talk) 20:22, 4 July 2016 (UTC)

Hello, Akhooha. I have just responded to you on that page. --Andrawaag (talk) 20:36, 4 July 2016 (UTC)

Share your experience and feedback as a Wikimedian in this global survey

  1. This survey is primarily meant to get feedback on the Wikimedia Foundation's current work, not long-term strategy.
  2. Legal stuff: No purchase necessary. Must be the age of majority to participate. Sponsored by the Wikimedia Foundation located at 149 New Montgomery, San Francisco, CA, USA, 94105. Ends January 31, 2017. Void where prohibited. Click here for contest rules.

Unused properties

This is a kind reminder that the following properties were created more than six months ago: MGI Gene Symbol (P2394), UCSC Genome Browser assembly ID (P2576). As of today, these properties are used on fewer than five items. As the proposer of these properties you probably want to change the unfortunate situation by adding a few statements to items. --Pasleim (talk) 19:08, 17 January 2017 (UTC)

@Pasleim: Thanks for the reminder. MGI Gene Symbol (P2394) has been added to the workflow and is used on more items now. UCSC Genome Browser assembly ID (P2576) points to reference genomes. For now we only cover 4. We plan to extend this in the near future, but for now I hope it is okay to have this small number of items. --Andrawaag (talk) 17:54, 19 January 2017 (UTC)

Your feedback matters: Final reminder to take the global Wikimedia survey

(Sorry to write in English)

Disambiguation pages standing in Dutch election

Great work on adding the election candidates. However, there are a few places where you've marked the details on a disambiguation page rather than the actual candidate:

# Items that have a candidacy in an election (P3602) but are disambiguation pages (Q4167410)
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P3602 ?election; wdt:P31 wd:Q4167410 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nl" . }
}

--Oravrattas (talk) 07:06, 5 March 2017 (UTC)

Thanks for reporting these. They are fixed now.

--Andrawaag (talk) 14:15, 5 March 2017 (UTC)


Hi Andrawaag, descriptions are not the beginning of a sentence and so normally do not start with a capital letter. For items such as Jan Baas (Q28872044) you are probably better off using something like "Nederlands politicus" ("Dutch politician"). Multichill (talk) 18:02, 13 March 2017 (UTC)

I have put an overview at User:Sjoerddebruin/Dutch politics/Tweede Kamerverkiezingen 2017. It is starting to look good! Multichill (talk) 19:54, 13 March 2017 (UTC)
And if you have got a taste for it: 2012. Multichill (talk) 20:31, 13 March 2017 (UTC)
@Multichill: To be honest, I have rather got a taste for it. But it may be more interesting to first add the result per candidate, as soon as these are available. I also still need to find some time. At the moment it is relatively labour-intensive because I do everything via QuickStatements. The bot accounts I have access to do not have task permission for election data. That said, shall we create a bot account together, specifically for election data? It would in any case be interesting to see whether the Wikidata integrator platform that we now use for genes, proteins and diseases is also generically applicable to other domains. --Andrawaag (talk) 21:59, 14 March 2017 (UTC)
Politicians who are active nationally in the Netherlands are a domain in which quite a lot of work has already been done. That is also why I had placed it under User:Sjoerddebruin/Dutch politics.
You can create a bot account for your own small projects, such as the politicians. By now I have, I believe, about a dozen bots for all sorts of different tasks.
I am active in all sorts of domains myself, but lately mainly paintings. We now have more than 200,000 of those as well :-) Multichill (talk) 17:57, 15 March 2017 (UTC)
Do you have a handy way to add [12] to the right person? Multichill (talk) 15:16, 1 April 2017 (UTC)

stated in: Banque-Carrefour des Entreprises

This reference lacks precision: is it a work with a page number, a URL? --Jmh2o (talk) 13:45, 14 May 2017 (UTC)

Absolutely. I have added the link to Q16626729. --Andrawaag (talk) 10:33, 15 May 2017 (UTC)

any reason to use P31 and P279 for the same value?

Is it true that every Q14912958-gene is indistinguishable from every other Q14912958-gene?

I would treat grain of sand as a class (P279 of some other class), even if we cannot distinguish the vast majority of them, because we can isolate any single grain and state that it is P31 grain of sand. d1g (talk) 10:06, 15 May 2017 (UTC)

biological pathway

I'm not sure: can we merge biological pathway (Q28864279) (that you created) and biological pathway (Q4915012)? Thank you, Tubezlob (🙋) 19:49, 16 May 2017 (UTC)

Belgian post code


There is a discussion right now on Wikidata:Bistro (Topic:Trdng8v51r0sq785) about the unusual duplicate post codes you added to several Belgian communes. Could you explain why you did this (in French or in English, as you prefer)?

Regards, VIGNERON (talk) 11:20, 28 May 2017 (UTC)

Replacing values and descriptions

I'm not sure if I'm happy with edits like Special:Diff/528268556 and Special:Diff/528268773.

  • You're replacing a valid French label with a screaming label
  • The new descriptions lack capital letters
  • You're deleting valid P31 values
  • You're adding coordinates as separate statements on companies, while those should be qualifiers of headquarters location
  • The date of inception differs, yet you delete the current value instead of adding another one

Sjoerd de Bruin (talk) 14:21, 29 July 2017 (UTC)

I also think that the organization behind, for example, a music festival should have its own item. Sjoerd de Bruin (talk) 14:22, 29 July 2017 (UTC)

User:Sjoerddebruin Thanks for noticing.

Indeed, this is the official name as registered, but that is not an excuse. I will decapitalise here.

In both cases the values were only deleted if a proper reference was missing. The inception dates now added are the official inception dates according to governmental data. My reasoning was that once you have an official inception date, you can delete the non-referenced one.
Is there a policy here? It is not always a headquarters location. Per Belgian Enterprise Number there can be multiple coordinates applicable. I was actually considering adding all of them as a list under P625.
I might agree with you here. What I did in this exercise was add additional properties from the Crossroads Bank for Enterprises, based on enterprise numbers that had already been entered. So the festival already had an enterprise number. In these cases I would argue that the enterprise ID should be removed from the festival item itself and a link made to the organisation on a separate Wikidata item.  – The preceding unsigned comment was added by Andrawaag (talk • contribs).

Same for Vrije Universiteit Brussel (Q612665) or Federal Public Service Budget and Management Control (Q636971) etc.

Please fix all these errors in all your edits, and next time please consider discussing such an import at Wikidata:Project chat or Wikidata:WikiProject Companies talk. Jklamo (talk) 09:58, 1 August 2017 (UTC)

@Jklamo: Yes, I have stopped overwriting non-referenced P31 statements. I disagree that those are valid P31 statements, because it is hard to tell whether a P31 statement is valid if there is no reference to back that claim. Having said that, as you rightly mention, the business enterprise P31 statement I added also does not make sense, since non-commercial enterprises do exist in the source. So I am not in a position to strongly disagree. I need to figure out a better approach here. So yes, I will leave the P31 statements in place.
Can you point me to where the headquarters-coordinates pairing was decided? I am wondering if this can be discussed. This pattern is in itself more error-prone than using coordinates directly as a statement: records typically use strings to describe places, so in addition to geocoding there is an extra step to resolve the location name to its Wikidata identifier. Locations can have multiple names and multiple locations can have the same name. So having to resolve the identifiers after geocoding introduces an additional step that might lead to duplicates being generated. I am currently figuring out whether this can be solved using SPARQL, but that only works if additional properties such as zip codes are available for all locations.
Either way I will consult the project chat on my next attempts. This was not so much an import as an attempt to make sense of all items in Wikidata that have statements with [Belgian Enterprise Number (P3376)] --Andrawaag (talk) 19:12, 1 August 2017 (UTC)

  • This looks like item re-purposing with a misleading edit summary. Please stop both.
    --- Jura 12:47, 1 August 2017 (UTC)
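The name-resolution problem raised here (one place with several names, several places sharing a name) can be made concrete with a small Python sketch. It is purely illustrative: `resolve_place`, the in-memory `gazetteer` structure and all sample values are hypothetical, and a real lookup would query Wikidata rather than a local list.

```python
def resolve_place(name, postal_code, gazetteer):
    """Return the ID of the single place matching `name`, or None if unresolved.

    A postal code, when available, is used to break ties between places
    that share the same name.
    """
    wanted = name.casefold()
    candidates = [
        place for place in gazetteer
        if wanted in {n.casefold() for n in place["names"]}
    ]
    if len(candidates) > 1 and postal_code is not None:
        candidates = [p for p in candidates if postal_code in p["postal_codes"]]
    if len(candidates) == 1:
        return candidates[0]["id"]
    return None  # unknown or still ambiguous: do not guess


# Hypothetical gazetteer: two places share a name, one has a second name.
gazetteer = [
    {"id": "place-1", "names": ["Example", "Exemple"], "postal_codes": ["0001"]},
    {"id": "place-2", "names": ["Example"], "postal_codes": ["0002"]},
]
print(resolve_place("Example", None, gazetteer))
print(resolve_place("Example", "0002", gazetteer))
```

Returning None instead of picking the first candidate keeps ambiguous records out of Wikidata until someone has looked at them, which is the failure mode the comment above is worried about.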

Double WikidataID

Hi Andra,

Let me introduce myself: I'm a PhD student of Egon (no last name needed, I think ;)). Today I found a duplicate of entry Q24745328, namely Q31202267. Are you okay with removing the latter? And a small question: is it possible to flag duplicates in Wikidata?

Kind regards,




Your bot added two self-referenced statements to Actrapid (Q2034113).

Could you look at it and correct it, if possible?

Regards, VIGNERON (talk) 13:17, 5 August 2017 (UTC)


Hello. I have noticed that you have described Q29576195 as a business enterprise (Q4830453) because it is present in KBO/BCE. Imho, this is not correct. Q29576195 is an ASBL/VZW and can be primarily described as a research centre. More generally, the fact that an entity is present in KBO/BCE (even if the O means ondernemingen) does not mean that it is a business enterprise, because all Belgian legal entities are present in there. For example: Federale Overheidsdienst Financiën, KUL, RVA..., which are not business enterprises. Best regards, BrightRaven (talk) 09:31, 9 August 2017 (UTC)

It seems that you have made all those changes with a bot. This is really not good. Why have you removed all statements in Property:P31 and replaced them by Q4830453? This makes no sense: Q313966 is a business enterprise, OK, but it is better described as a bank, as it was before your action. Same for Q3512080, better described as a transport company and a government-owned corporation. Same thing for the label: the official name in KBO/BCE is not always the usual name. There was no reason for this massive, automatic action. Please avoid such things in the future. I have added all those Belgian enterprise numbers to those items. I regret that this has allowed you to erase a lot of correct, useful data. BrightRaven (talk) 10:21, 9 August 2017 (UTC)
@BrightRaven: I agree with you that stating something to be a business enterprise because it is mentioned in the KBO is inaccurate. I have stopped my efforts until I have a more accurate way of describing what is stored in the KBO. This requires some further analysis. I have not deleted all the P31 statements; only those were overwritten where the original record did not have a reference, because there is no way of knowing whether such a statement is accurate. I understand that this is frowned upon, so in future efforts I will leave them intact. Having said that, in the long term I do intend to start a discussion on how to deal with reference-less statements. They do introduce some noise, because there is no way of validating whether such a statement is accurate. So in the long term, I would argue for overwriting P31 statements if no proper reference is given, provided the replacement has one. As long as we have not had this discussion, I will keep reference-less P31 statements in place. --Andrawaag (talk) 10:32, 9 August 2017 (UTC)
Even the European Parliament had become a business company. This is really good referenced data! More seriously, Property:P31 is often difficult to reference. In most cases, this must be done on a case-by-case basis, by reviewing sources (often the sources of the Wikipedia articles). Except in limited cases, it cannot be done automatically with a bot. BrightRaven (talk) 10:38, 9 August 2017 (UTC)

The automatic replacement of the labels is also wrong. Sometimes it is fully capitalized, as Sjoerd de Bruin noticed here above, but there are other problems: your action on SONACA item had the consequence that this item was impossible to find by writing "SONACA" (the most common name of this company) in French ("SONACA" was the label, you erased the label with the official name, so there was no "SONACA" any more in French in the labels or aliases). Moreover, the official name is not always the most common name. The subsidiaries of SRWT are rarely named by their official names (see [13] for example). Same for Filigranes: the official name is never used in communication to the public. Please review Help:Label: "The label is the most common name that the item would be known by." So do not replace any more the labels with the official names from KBO/BCE. This can be done only on case-by-case basis. (However, I think it could be added as an alias automatically.) (And labels do not have to be referenced.) BrightRaven (talk) 10:54, 9 August 2017 (UTC)Reply[reply]
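The suggestion to keep the existing label and add the official name only as an alias could look roughly like this in Python. It is a minimal sketch, not code from the bot: `merge_official_name` is an invented helper, the case-insensitive duplicate check is an assumption, and the official name in the usage line is a placeholder rather than real KBO/BCE data.

```python
def merge_official_name(label, aliases, official_name):
    """Keep the existing label; add the official name as an alias if it is new.

    Comparison is case-insensitive so that an all-caps registry name does not
    create an alias that only differs in capitalisation.
    """
    known = {label.casefold()} | {a.casefold() for a in aliases}
    if official_name.casefold() in known:
        return label, list(aliases)
    return label, list(aliases) + [official_name]


# The common name stays searchable as the label; the registry name is only added.
print(merge_official_name("SONACA", [], "EXAMPLE OFFICIAL NAME NV"))
```

Because the label is never overwritten, a search for the common name keeps working, while the registry name still leads to the same item.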

Hey Andrawaag, can you please comment in Wikidata:Requests for deletions#Q28031601? Thanks, MisterSynergy (talk) 05:46, 17 August 2017 (UTC)

Property for

Hi, I think you might be interested in this topic, and I'd like to know your views. Thanks. --~~ Yayamamo (talk) 01:47, 26 August 2017 (UTC)

Description of excessive length

Hi, I am reverting an edit by your ProteinBoxBot for the umpteenth time. Can you please train it not to add excessively long descriptions? According to Help:Description these bits of information should be short; there is no need to describe the whole chemical process in this field. Thanks. Csigabi (talk) 12:28, 2 October 2017 (UTC)

It would be very polite if the bot refrained from replacing descriptions and labels. Sjoerd de Bruin (talk) 13:52, 2 October 2017 (UTC)
The bot has been adapted to prevent this in the future: [14] --Andrawaag (talk) 14:49, 3 October 2017 (UTC)

ProteinBoxBot uses of P794 (P794)

Hello there. Your ProteinBoxBot created large numbers of items with statements in the format "regulates (molecular biology) (P128): biological process (Q2996394), with qualifier P794: negative regulation of biological process (Q22260640)", which contains the qualifier P794 (P794) that we are trying to deprecate due to untranslatability. Your input will be most welcome at this discussion. Deryck Chan (talk) 18:07, 2 November 2017 (UTC)

Problematic creation

Hello Andrawaag, I see that you have created several items that contain hardly any information, e.g. Q56556174. This is not going well and is a problem. This item has far too little data to be able to exist. At a minimum it should state (besides the coordinates) what it is, the country in which it is located, and the administrative unit. See also: w:nl:Wikipedia:Wikidata#Geografische_objecten. Could you add at least this minimal data to all the items you have created? Thanks! Romaine (talk) 18:14, 16 September 2018 (UTC)

PS: I have filled in the example I just gave. Oh, and "monument" is really never a good value for instance of (P31). Romaine (talk) 18:16, 16 September 2018 (UTC)
Hello @Romaine, these items come in directly via Wikishootme, so if there is a problem somewhere, Wikishootme is the place to raise it. However, I do not agree with you that this is problematic; on the contrary. Wikishootme offers a very easy way to add monuments from the immovable heritage inventory (inventaris onroerend erfgoed). The map shows you the context at once, and items are created with a single click of a button. I agree that they could be more complete, and I do that in batches. But the point remains that, on balance, you create more items via Wikishootme than through direct input in Wikidata. So rest assured, it will all work out. I do take offence at the pedantic tone of your message: frequent use of "!", "problem", "far too" (véél), etc. Could it be a bit friendlier in future? I use "erfgoed" (heritage) as the P31 value. --Andrawaag (talk) 18:57, 16 September 2018 (UTC)
Hello Andrawaag, Sorry, no! You are free to use a tool, but you yourself are responsible for the edits, not the tool. That the tool is easy, that you can make a map, etc., is all well and good, but it does not take away from the problem, which is not an opinion but a fact. You are creating more items, virtually empty items. Empty items do not help Wikidata; they create a source of problems. I take offence at you dismissing my message as an "opinion". So once again my question: could you add at least the minimal data to all the items you have created? Thanks! Romaine (talk) 19:07, 16 September 2018 (UTC)
Eh @Romaine, where do I say that your message is an "opinion"? And while you were writing this message I added P31 to the items that still lacked it. So no more problem :D --Andrawaag (talk) 19:24, 16 September 2018 (UTC)
Here: "However, I do not agree with you that this is problematic; on the contrary."
The problem is not solved; it has been made worse. You have now added a wrong P31 to all those items. I even wrote in my earlier message that "monument" is not correct either. I am now running a tool to remove all those deteriorations.
Your use of the Wikishootme tool to create new items remains a problem. Other users get to clean up your mess. Do you find the word "mess" unkind? It is also unkind to make such a shambles of it. And sorry that I am not tactful; that is what happens with users who do not understand what a high-quality knowledge base requires and then cheerfully announce that they disagree with something. Romaine (talk) 20:18, 16 September 2018 (UTC)
And with all this I forgot to mention that we have a database import in preparation for all these monuments, but we can only continue with it once this mess has been cleaned up. Romaine (talk) 20:23, 16 September 2018 (UTC)
Which mess are you talking about?! The statements that were created simply conform. I would like to take the sting out of this discussion. That I have no understanding of quality is a characterisation I do not recognise myself in, and you are stating it simply to insult. But let us take the toxicity out of the discussion. Point me to the structural data model description and I will follow it. --Andrawaag (talk) 21:20, 16 September 2018 (UTC)
Hello Andrawaag, Apologies for the frustrations I vented. We have since spoken via another channel. We mainly exchanged some thoughts, but it may be useful to describe a few principles; perhaps that will make clearer what I meant to say.
  • If someone uses a tool, that person is responsible for what the tool outputs, not the tool builder.
  • Being careful with data is fine, but make sure you at least have the minimum required data; otherwise the creation of items cannot yet begin.
  • The creator of a new item is expected to make sure the minimal data is added, not someone else and not at a later time.
  • For heritage, minimal data is what the object is (e.g. a house), the country it is in, the municipality, the coordinates and the monument number. In addition, the description needs to state concisely what the subject is, so that a clear distinction can be made.
So items cannot be created without taking these things into account. If items are added virtually empty, data that is needed is missing. In those cases no new items should be created, because this saddles other users with extra work (and that is not the intention). Virtually empty items are a problem.
If this output is the result, creating new items with this tool is probably not suitable for demonstrating in introductions, because it sets the wrong example.
With regard to data import, the first step is gathering data, the second step is reconciling it with existing data, and the third step is importing into Wikidata. But as said, the third step can only take place once all the minimal data has been gathered and is ready to be added.
Looking at the items that were created, they were not yet fit to be created for lack of data. Obtaining that data should not be that difficult: based on the available data I have corrected all the items so that they now contain the minimal data an item should have. Based on the available data I have also revised the descriptions, so that they are no longer odd and match what is customary. So please do not create items with too little data again. Regards - Romaine (talk) 03:19, 17 September 2018 (UTC)
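
The minimal-data principle above could be checked before an item is created, for instance along these lines. This is only a sketch: `missing_minimal_data` and the simple dict of claims are assumptions for the example, and the monument number is left out because the property to use depends on the heritage register.

```python
from typing import Dict, List

# Wikidata properties for the minimal statements listed above.
REQUIRED_PROPERTIES = {
    "P31": "instance of",
    "P17": "country",
    "P131": "located in the administrative territorial entity",
    "P625": "coordinate location",
}


def missing_minimal_data(claims: Dict[str, list], description: str) -> List[str]:
    """Return the minimal fields that a candidate item still lacks.

    `claims` maps property IDs to lists of values (hypothetical shape);
    an empty result means the candidate passes this particular check.
    """
    missing = [
        f"{pid} ({name})"
        for pid, name in REQUIRED_PROPERTIES.items()
        if not claims.get(pid)
    ]
    if not description.strip():
        missing.append("description")
    return missing


# A candidate with coordinates only, as in the nearly empty items discussed.
print(missing_minimal_data({"P625": ["placeholder coordinates"]}, ""))
```

Creation would then be skipped, or postponed to a later batch, whenever the returned list is not empty.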

Prevalence -> Number of cases

I have changed some of your edits on Ebola hemorrhagic fever (Q51993) from "prevalence" to "number of cases". No need to take action. Regards!--Micru (talk) 11:47, 7 November 2018 (UTC)

I've disabled 'echo-subscriptions-email-article-linked'

Just a note to say that I've disabled the "Notify me when someone links to a page I created from another page." preference on your Wikidata account, as it was generating a lot of emails (thousands) that were being throttled by your email provider. This was due to User:GeneDBot creating/updating a lot of items that linked to Q20747295. I'm guessing you don't really want any emails about that.

You will also have a load of (echo) notifications too, I would guess. You might want to disable that too (I have not done so, as it's not causing any problems).

Reedy (talk) 19:41, 27 March 2019 (UTC)

Adding Commons compatible image available at URL (P4765) to items already illustrated

I'm not understanding the reason for adding Commons compatible image available at URL (P4765) to items that already have abundant illustrations on Commons. Jabiru mycteria (Q17970) currently has over 80 images on Commons: what is the utility of adding an external link to an apparently arbitrary photo? Why not just upload that image to Commons? -Animalparty (talk) 04:54, 7 April 2019 (UTC)

@Animalparty: I did this so as not to flood Commons. There are currently 1.2M photos of species in iNaturalist that have Commons-compatible licences. So these could indeed be uploaded. However, not all of these images are really of good quality. I know that similarly questionable content exists on Commons. So I thought of using the property Commons compatible image available at URL (P4765) as an in-between. By using this we can decide at a later stage whether or not an image behind a URL in these statements can be uploaded. However, I have stopped adding new URLs to this property at the request of @Multichill:. I think they are working on a different approach to leveraging the great content from iNaturalist. --Andrawaag (talk) 09:41, 7 April 2019 (UTC)

Airline destinations

Hi Andrawaag, please use scheduled service destination (P521) instead of connecting line (P81) for adjacent airports, e.g. on Amsterdam Airport Schiphol (Q9694).

Biodiversity bot Belgian Species List additions

Hi Andrawaag - could you set your bot to add Belgian Species List entries under existing entries, rather than create new ones, please? For example, at Greater Scaup, it added a new English name entry, rather than adding a new reference to the existing English name entry. I've changed it here to improve it. The problem is that, with two English name entries, it results in very annoying superfluous duplication of the English name in the {{VN}} lists on Commons. Thanks! - MPF (talk) 13:28, 13 June 2019 (UTC)

@MPF: The bot is actually designed this way on purpose, i.e. it reflects exactly how something is stored in the source of the statement. Different resources might have different conventions on how a species name is written. If this is an error where the convention is not followed, this should be fixed in the respective sources, not in Wikidata. I checked some random sources where vernacular names are printed, and it seems that the Belgian Species List is actually following a convention where they are all printed without capital letters. So the initial edit, where the common name is written in lower case, seems to be correct, and I would appreciate it if you could revert your edit. But again, I believe that this is a discussion that needs to happen with the curators of the primary sources, not at Wikidata. BTW, the specific plan is to create a feedback loop where these different variants of the same name are reported back to the primary data curators, so I would appreciate it if you could revert your edit so as not to spoil this. Andrawaag (talk) 14:36, 13 June 2019 (UTC)
BTW, I am guided by the observation that, as with gene and protein names, changing all names so that every first letter is upper-case has drastic effects on their meaning, which emphasizes the need for Wikidata to reflect exactly what is stated in the primary source. Andrawaag (talk) 14:41, 13 June 2019 (UTC)
I'd be happy to undo my edits, AFTER the Wikidata import into Commons can be fixed so that it doesn't duplicate variants like this into the {{VN}} lists on Commons, so they do not become massively cluttered with these senseless duplicates. Can that be done? Is there some way for individual entries in the Wikidata list to be tagged so that they are not exported? - MPF (talk) 17:50, 13 June 2019 (UTC)
If you have any influence with the people managing the Belgian Species List, could you tell them that English names are by standard capitalised, too? - MPF (talk) 17:50, 13 June 2019 (UTC)
@MPF: I have already shared this issue with them. Do you have a reference to an authoritative source that shows VN are indeed by standard capitalised in English? Andrawaag (talk) 18:07, 13 June 2019 (UTC)
Thanks! For e.g. birds, IOC, for mammals, MSW, for plants, BSBI - MPF (talk) 22:07, 13 June 2019 (UTC)
@MPF: I am afraid it is not as simple as it looks. The conventions you list for species names are just a few of many. With so many naming conventions applicable, don't you think that demanding that primary sources adapt their naming convention to the ones you listed is actually in violation of the NPOV principle? You see the same issue with capitalization within Dutch VNs. There the Dutch spelling rules state that species names (with some exceptions when geographical names are part of the VN) are written entirely in lower case. I have been told by many native English speakers that this is also the case with English spelling rules. I think we should respect these different writing systems, and I would propose introducing qualifiers indicating which writing system is applicable. As an example, I edited one entry this way, where there are two Dutch VNs (one with a capital, the other without). For the latter I added both the applicable writing system and a reference to these rules. --Andrawaag (talk) 12:26, 15 June 2019 (UTC)
Hi Andrawaag - not sure I understand you there? Primary sources are outside of Wikipedia, so are of course not subject to Wikipedia's NPOV policy. But I would say it is reasonable to ask a secondary source (like English names given in the Belgian bird list) to follow exactly the primary source it derives from (in this case, English names decided by the English bird authority; i.e., IOC). But also to come back to Commons; that too is not bound by Wikipedia's NPOV policy (Commons:Project scope/Neutral point of view); for successful taxon categorisation of images, it has to follow a single authority for that taxon, or else the categories become a muddle of duplication. Commons follows IOC for bird taxonomy and naming; it is not helpful to have auto-imports from Wikidata that disrupt this. - MPF (talk) 17:17, 15 June 2019 (UTC)
@MPF: No, it seems that the English bird authority is one authority and English grammar rules are another. Don't get me wrong, I have no preference for either one of them, but changing data in Wikidata because it fits your narrative better is not a good way forward. I understand the issue with importing into Commons from Wikidata; that is why I proposed using qualifiers to distinguish between different spelling rules, so that you can pick the convention that fits the preference of Commons, and I am happy to work with you on adding those qualifiers so you can choose which source to prefer in your automatic import. The question of VN seems to be a complex one, with multiple languages that follow multiple conventions and writing rules, and, as said, this holds even within English. --Andrawaag (talk) 20:52, 15 June 2019 (UTC)
Thanks; I see where you're coming from, though it does open the question of the definition of an 'authority': can a general dictionary really be considered an authority on bird names? Perhaps, but definitely a lower grade of authority than the gathered top expertise of the IOC. Thanks for the offer to help on the VN transfer to Commons issues - I have some ideas, but don't have the computer coding expertise to put them into effect; in rough order of importance, most important first: (1) create a new language en-us to use for names where American differs from English (Wikidata already has en-ca and en-gb, but oddly not en-us, even though that differs far more often than others), and perhaps also en-au [Australian], en-in [Indian], en-nz [New Zealand] and en-sa [South African] (and likewise for Spanish dialects; Portuguese already has pt-br for Brazilian); (2) means to tag names from a particular authority (e.g. an authority having official status in a country) as being more important than those from other sources; (3) conversely, to be able to tag and block harvest of vernacular names which are archaic (e.g. names imported from copyright-expired books and no longer in use), offensive (see e.g. Racist Relics: An Ugly Blight On Our Botanical Nomenclature), or taxonomically out-of-date; (4) create the ability to sort the order of names on Wikidata (currently strictly by date of addition); and (5) means to restrict the import into Commons to 1, or 2, or whatever, names per language, and to reject duplicates differing only negligibly in capitalisation or hyphenation. Hope this helps! - MPF (talk) 23:54, 15 June 2019 (UTC)

Andrawaag, it seems like you merged two different paintings into one. Those paintings have different locations and sizes and even look quite different (much less detail in the bright area in the larger one). Can you unmerge? --Jarekt (talk) 15:59, 21 June 2019 (UTC)

@Jarekt: I apologize and have reverted them. Someone asked me how to merge duplicates, and I demonstrated it using the example I was given. I should have checked better. --Andrawaag (talk) 07:52, 22 June 2019 (UTC)

Community Insights Survey

RMaung (WMF) 17:38, 10 September 2019 (UTC)

Reminder: Community Insights Survey

RMaung (WMF) 19:54, 20 September 2019 (UTC)

schema entities clean up

hello.-- Hakan·IST 15:09, 27 October 2019 (UTC)

WikidataIntegrator and maxlag

Currently WikidataIntegrator retries edits 10 times after hitting maxlag. This is too low after phab:T221774, as it may take up to one hour to get maxlag back to normal. See Topic:Vbgypuu9k0q1pvz5 and Topic:Vbgzje7sb25vd0b4 for more information. Also ping @Sebotic, Gstupp:.--GZWDer (talk) 11:40, 21 November 2019 (UTC)

@GZWDer: Thanks for reaching out. I have increased it to 25. I noticed the issue yesterday and took the running bots down. Did this number affect the issues, and if so, how? Is increasing the number to 25 sufficient? --Andrawaag (talk) 13:59, 21 November 2019 (UTC)
Not sufficient. "it may take up to one hour to get maxlag back to normal" - You may need to retry up to 60*60/5=720 times until a successful edit. Previously I suggested 1000, but this would be a better choice.--GZWDer (talk) 16:27, 21 November 2019 (UTC)
P.S. Pywikibot retries indefinitely.--GZWDer (talk) 16:28, 21 November 2019 (UTC)
@GZWDer: WDI already has that incremental wait in place. See for example this log. With each retry the wait time increases, e.g. "16:56:54 Backing off 3600.0 seconds afters 17 tries calling function with args". I am not sure we want infinite retries. I would rather abort and wait until the API or the WDQS stabilizes. --Andrawaag (talk) 16:37, 21 November 2019 (UTC)
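
The arithmetic in this thread (one retry every 5 seconds for up to an hour gives 60*60/5 = 720 retries) and the idea of an incremental wait with an eventual abort can be sketched as follows. This is a minimal illustration, not WikidataIntegrator's or Pywikibot's actual code: `MaxlagError`, `retry_on_maxlag` and all default values are assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class MaxlagError(Exception):
    """Hypothetical error standing in for a maxlag response from the API."""


def retry_on_maxlag(
    action: Callable[[], T],
    budget_seconds: float = 3600.0,  # assumed: give up after about an hour of waiting
    base_delay: float = 5.0,         # assumed: first wait of 5 seconds
    max_delay: float = 300.0,        # assumed: cap on a single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `action` with a doubling, capped wait until the budget is spent.

    The limit is expressed as total waiting time rather than a retry count,
    so it does not need retuning when the delay schedule changes.
    """
    waited = 0.0
    delay = base_delay
    while True:
        try:
            return action()
        except MaxlagError:
            if waited + delay > budget_seconds:
                raise  # abort instead of retrying forever
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay)
```

With `base_delay=5.0` and `max_delay=5.0` the wait stays fixed at 5 seconds, and the one-hour budget then allows exactly the 720 retries computed above before the error is re-raised.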

Gender and so on

I believe you have an opinion about Wikidata:Property proposal/sex. Now is the time to voice it there. Multichill (talk) 21:02, 11 December 2019 (UTC)

Your merge of Bradykinin and its InterPro family

In this edit you merged two different concepts, the specific peptide and a whole family. I'll revert this now but please refrain from such merges in the future. --SCIdude (talk) 07:11, 21 December 2019 (UTC)



I'm thinking about improving Lexemes in French. First, I'm doing some checks on existing Lexemes, and I stumbled upon Conard (L697) (as nouns are not supposed to start with an uppercase letter). Is this lexeme about Conard (Q36904494)? Or is it something else?

Cheers, VIGNERON (talk) 12:58, 12 February 2020 (UTC)

Would you like to contribute to a WikiProject COVID-19 ?

First, thank you very much for the slide deck you made available about making bio data FAIR via Wikidata. It clarified many things for me a few months ago. With this whole corona situation, and given your experience in Wikidata, I was wondering if you would be interested in helping to create a Wikidata WikiProject COVID-19.

The goals would be initially (of course, they can be changed):

  • create a data model for instances of disease outbreak (Q3241045).
  • monitor the quality of the pages about national outbreaks listed in 2019–20 COVID-19 outbreak by country and territory (Q83741704).
  • curate the Wikidata items relevant for describing the outbreaks and the virus itself.
  • curate and improve the information on Wikidata about scientific articles regarding the coronavirus (similar to the Wikidata:WikiProject_Zika_Corpus).
  • think and develop ways to process these items to improve access to information (for example, via automated articles in languages that currently do not have pages about country-specific outbreaks).

Would you like to participate in this effort?

I am trying to gather the Wikidata editors actively involved in the topic. I believe that if we act together, we can have a shot at aiding the global effort in containing the pandemic.

TiagoLubiana (talk) 00:29, 16 March 2020 (UTC)

@TiagoLubiana: Happy to contribute. How do I help and where do we start? --Andrawaag (talk) 08:26, 16 March 2020 (UTC)
@Andrawaag: Great! The Wikidata:WikiProject_COVID-19 is still being built, but I have limited experience with organizing such an effort. At this point, any intellectual input on how to organize the effort is more than welcome. For example, ideas on how to structure a subpart regarding the scholarly articles on the topic (as the Zika Corpus) would be super welcome.

Hi Andrawaag! Please check your current batch. All taxa on this list already have the reference ICTV Master Species List 2018b.v2 (Q62075759). --Succu (talk) 15:20, 9 April 2020 (UTC)

Hello! Please separate different names (separated by comma) into different values of property. Thanks. --Infovarius (talk) 22:15, 17 April 2020 (UTC)

Call Wikiproject COVID-19 tomorrow (Monday, May 4th)

Hello Andra,

How are you doing?

I can imagine you have been busy, but I am here to invite you to the Wikiproject COVID-19 call tomorrow (Monday, May 4th). If you are able to join, the call (at this link: ) will happen tomorrow at 15:00 UTC.

As a note, 15:00 UTC is:

  • 11:00 AM in New York, USA
  • 12:00 PM (noon) in São Paulo, Brazil
  • 4:00 PM in Tunisia
  • 6:00 PM Eastern European Summer Time (EEST)

If you want, also feel free to add topics at the meeting page.

Cheers, TiagoLubiana (talk) 22:46, 3 May 2020 (UTC)

Invitation for the WikiProject COVID-19 call tomorrow (Monday, 15 of June)

Hello Andra,

I would like to invite you to tomorrow's call (Monday, 15 June) of the Wikiproject COVID-19 at 15:00 UTC.

This is the link for the call:

This is the link for the etherpad:

The WikiProject seems to be losing momentum. Perception of the size of COVID-19 as an enormous problem is fading (at least in Brazil). That might be a sign of things getting better, but it is also very dangerous (as they might not be that much better yet).

In tomorrow's call, I would like to discuss with you two important and related questions:

- How can this WikiProject best serve the anti-COVID-19 effort?

- What can this project offer for the post-COVID-19 Wikidata world?

As usual, if you have any topics to add, you can do so either beforehand at the Project Meeting Page or at the meeting.

I hope you can make it!

All the best,


User Research - Looking for volunteers

Hey Andrawaag,

We talked very briefly at WikidataCon 2019 in Berlin. In fact, we are now starting a research project and I would appreciate your input, especially on Wikidata:

For a current research project we are looking for active community members in Wikipedia, Wikidata and Wikimedia Commons who are interested in filling out a survey and participating in an interview. The aim of the research is to identify opportunities for participation and access for new volunteers in the Wikimedia projects. For this purpose we would like to learn from the experiences of the already active community members and find out success factors for participation in the projects. Participation includes filling out a preliminary survey, a self-study and an interview.

If you are interested in participating, simply fill out this survey. The participants in the interviews will be selected on the basis of the answers in order to be able to consider as many different perspectives as possible.

All selected interview partners will receive a book or photo voucher of 25 euros as a thank you.

Feel free to forward this call to other people you think might share interesting experiences about Commons. If you have any questions you can of course contact me or check the project page :) On the Project Page you can also find more information.

Regards --Merle von Wittich (WMDE) (talk) 10:22, 18 June 2020 (UTC)

Pharmaceutical product

Hi Andrawaag,

I noticed ProteinBoxBot created a lot of pharmaceutical product (Q28885102) instances like this one: Lincocin (Q47521770). Most instances have an RxNorm ID that corresponds to an RxNorm Brand Name concept, which differs from a "pharmaceutical product".
As an example I created this instance: LINCOCINE 500 mg, capsule (Q100908567). I would like to add additional information such as its dosage form (capsule), the country in which it received an authorization (France), its brand name ("LINCOCINE")...
Is it OK with you if I create many French instances of pharmaceutical product (Q28885102) like this one?

Scossin (talk) 22:01, 26 October 2020 (UTC)

Mass-deletion of external ids in favor of exact match (P2888)

Hi, in Wikidata:Property proposal/exact match the property was approved as a non-disruptive addition to the existing system. But what is going on here then? Why did you delete almost all external identifiers in favor of exact match (P2888)? Was it some old unique constraint violation I don't see, or was it a code mistake? Or were these changes intended? There are so many issues with these edits, I don't even know where to start. It almost calls into question the necessity of the "exact match" property in Wikidata. --Lockal (talk) 12:10, 20 February 2021 (UTC)

@Lockal: Thanks for reaching out. There are three things going on here, but let me emphasize that there is no intentional effort going on to remove external ids in favour of exact match (P2888). External identifiers and exact match (P2888) serve different purposes. In the latter, mappings to other URIs that have the exact same meaning are stored, to allow fast and efficient federation between Wikidata and other linked data resources. Using external identifiers for this purpose is not always possible. This is because only the string of the identifier is stored and the location of the identifier is built by the MediaWiki extension, so it is not available when relying on the data alone. There is an exception where normalised values are rendered, but these are not stored for all external identifiers, nor is it possible to point to synonymous URIs for the same external identifier. So in my efforts I am adding both (external identifiers and exact match (P2888)). I have recently been removing external identifiers from Wikidata because the initial source no longer supports that claim. This has been the case for identifiers provided by the Disease Ontology and the Monarch Disease Ontology. Both resources have different levels of identifier mapping, and initially those mappings were added using mapping relation type (P4390) as a qualifier and then indicating the level of mapping (i.e. broad match, exact match, narrow match). This turned out to be rather problematic for two reasons. First, the semantics of the mappings were ambiguous in the source. It wasn't always clear whether a mapping was an actual mapping or more a reference to a different source. Recently the semantics of identifier mapping were made more explicit in, for example, the Disease Ontology. Secondly, using the mapping relation qualifier overcomplicated the use case in Wikidata. Using those qualifiers requires complex query patterns to extract identifiers while respecting the possible nuances.
With these changes to identifier mapping I have been updating the representation of data coming from the Disease Ontology and the MonDO ontology in Wikidata, i.e. only adding an identifier if there is an exact match with other identifiers in Wikidata, and also storing URIs of the same concepts if they exist. Regarding the example you provide, that is clearly an error. I need to investigate what went wrong there. I suspect some sloppiness on my side, where I used Fowler syndrome as a driving example. I will take responsibility and clean up. --Andrawaag (talk) 15:41, 22 February 2021 (UTC)
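
To illustrate the point above that an external identifier stores only a string: a consumer working from the data alone has to combine that string with a URL pattern (on Wikidata, the property's formatter URL, which uses `$1` as the placeholder) to obtain a resolvable URI, whereas exact match (P2888) stores the URI itself. The sketch below is an assumption-laden illustration; `identifier_to_uri`, the example.org pattern and the identifier value are hypothetical.

```python
def identifier_to_uri(formatter_url: str, identifier: str) -> str:
    """Expand a formatter URL pattern with a "$1" placeholder into a URI.

    Hypothetical helper: it performs a plain substitution and does not
    apply any URL-encoding or identifier normalisation.
    """
    if "$1" not in formatter_url:
        raise ValueError("formatter URL must contain the $1 placeholder")
    return formatter_url.replace("$1", identifier)


# Placeholder pattern and identifier, not real Disease Ontology values.
uri = identifier_to_uri("https://example.org/disease/$1", "EXAMPLE_0000001")
print(uri)
```

A synonymous URI for the same concept cannot be derived this way, which is the gap that storing URIs directly is meant to cover.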

See also Wikidata:Schema proposals.--GZWDer (talk) 15:14, 10 September 2021 (UTC)

Call for participation in a task-based online experiment

Dear Andrawaag,

I hope you are doing well.

I am Kholoud, a researcher at King's College London, and I work on a project as part of my PhD research, in which I have developed a personalised recommender system that suggests Wikidata items for the editors based on their past edits. I am collaborating on this project with Elena Simperl and Miaojing Shi.

I am inviting you to a task-based study that will ask you to provide your judgments about the relevance of the items suggested by our system based on your previous edits.

Participation is completely voluntary, and your cooperation will enable us to evaluate the accuracy of the recommender system in suggesting relevant items to you. We will analyse the results in anonymised form, and they will be published at a research venue.

The study will start in late January 2022 or early February 2022, and it should take no more than 30 minutes.

If you agree to participate in this study, please either contact me at or use this form

I will contact you with the link to start the study.

For more information about the study, please read this post:

In case you have further questions or require more information, don't hesitate to contact me through my mentioned email.

Thank you for considering taking part in this research.


Kholoudsaa (talk) 20:43, 24 January 2022 (UTC)Reply[reply]


Hallo Andra, Met plezier heb Tarsier ontdekt, dat je hebt ontwikkeld. Mooi en ik heb er een paar goede foto's mee binnen kunnen halen op Commons. Ik heb twee suggesties: het zou goed zijn om een Template:Tel toe te voegen op Commons, waarmee de vrije licentie kan worden gecontroleerd. Maar eigenlijk controleert het programma dat je hebt gemaakt dat al, dus misschien is het mogelijk om die check botwise te laten doen (zoals bij iNaturalistReviewBot dat doet bij uploads vanaf iNaturalist). Tweede ding is dat ik een upload kreeg waarin auteursnamen gescheiden werden door een pipe "|" en daar ging het "Information"-sjabloon niet goed mee om. Misschien kun je die automatisch laten vervangen door bijvoorbeeld een komma. Maar mooi werk! Dank! Lymantria (talk) 17:00, 1 February 2022 (UTC)Reply[reply]

@Lymantria: Glad that Tarsier is of value to you. I take your suggestion to heart and will add the Template:Tel in the next update. Because Tarsier retrieves its images from various sources, each with its own conventions, things are not always clear, and with licences you cannot be too careful. The same applies to how authors are named. Can you mention one of the images here where the "|" character went wrong? --Andrawaag (talk) 17:20, 1 February 2022 (UTC)
Certainly, it was c:File:Rhigognostis annulatella (Curtis, 1832) 3320304459.jpg. Lymantria (talk) 21:38, 1 February 2022 (UTC)
@Lymantria: The "|"s are now replaced by a ",". [15]. Thanks for raising this. --Andrawaag (talk) 12:54, 8 February 2022 (UTC)
Great, glad I could help. Lymantria (talk) 21:18, 8 February 2022 (UTC)
Perhaps one more small thing, which you may or may not have seen already: when adding Category:Reuse images with Tarsier, Tarsier uses {{}} instead of [[]]. Lymantria (talk) 21:44, 14 February 2022 (UTC)

Request for Mentorship

Hello Andrawaag, I would like to be mentored by you. I am from Ghana and I hope you will accept me as your mentee so I can contribute to the Ga Language Wikipedia. Heatrave (talk) 03:50, 28 April 2022 (UTC)

Hi Andrawaag, I would like to be mentored by you. I am from Ghana, West Africa, in the UTC+00 time zone. I hope you will accept me as your mentee. Alhassan Mohammed Awal (talk) 08:28, 15 April 2022 (UTC)

Hi @Alhassan Mohammed Awal:, in that case let's start. --Andrawaag (talk) 18:15, 22 April 2022 (UTC)
Alright. I'm ready Alhassan Mohammed Awal (talk) 18:22, 22 April 2022 (UTC)

Hi Andrawaag, I would like to be mentored by you. I am from Nigeria, West Africa, in the UTC+00 time zone. I hope you will accept me as your mentee. Samstringz (talk) 02:48, 7 May 2022 (UTC)

@Samstringz I am honoured to be your mentor. With the other mentees, we are currently at the stage of collecting biodiversity observations using the app iNaturalist. Can you join us there?

Hello Andrawaag, I would like to be mentored by you. My name is AgnesAbah, from Nigeria. I hope you will accept me as your mentee. Thank you. AgnesAbah (talk) 09:55, 26 May 2022 (UTC)

@AgnesAbah I noticed that you are already active on our iNaturalist project. I am honoured to be your mentor. With the other participants in that project, we are currently collecting (as you are) observations in nature. Soon we will progress to using that observational data on the different wikis. Andrawaag (talk) 13:05, 26 May 2022 (UTC)
@Andrawaag Thank you for accepting me as one of your mentees. I am highly honoured and looking forward to your mentorship. AgnesAbah (talk) 20:29, 26 May 2022 (UTC)

Removal of statement

Please note that in removing the MeSH descriptor ID (P486) statement here from CAMP (Q24739493), you did not move it to cathelicidin (Q110971613). You also did not sort out the description or aliases. Charles Matthews (talk) 06:08, 21 May 2022 (UTC)

Further, the only justification I see for making the item one about a gene rather than a protein family, as created, is the 2020 edit by ProteinBoxBot. This seems very clumsy editing, given the numerous incoming links to the page relating to the protein family. Charles Matthews (talk) 07:20, 21 May 2022 (UTC)

Wiki mentor Africa (WMA) - Requesting for mentorship

Hello @Andrawaag:, I am Accuratecy051, and I'm from Nigeria. I would like to be mentored by you. I hope you will accept me as your mentee. 07:32, 30 May 2022 (UTC)

Redundant items

Dear Andra,

please don't add duplicates like milk or milk based food product (Q112224246) – this is called dairy product (Q185217) in proper English.--Muselweib (talk) 13:08, 3 June 2022 (UTC)

In the FIDEO ontology there seems to be a distinction between the two (see: ). To maintain the ontological relationships, I created it as a subclass of dairy product. --Andrawaag (talk) 13:19, 3 June 2022 (UTC)

Mentorship Request

WMA - Request For Mentorship

Hello @mentor's username:, I would like to be mentored by you. I hope you will accept me to be your mentee 07:12, 21 June 2022 (UTC).

Bot watchlist size

Regarding ProteinBoxBot: Hey there. Looks like your bot's watchlist has grown to a very large size, and our website's database administrators are interested in wiping the watchlist to save space. 1) Do they have your permission to do this? You can post a quick statement of permission in this Phabricator thread, or ping me and I'll tell them. 2) If your bot doesn't need to use its watchlist, can you look into making your bot not watchlist pages? This may involve logging into your bot's account, going to preferences, going to the watchlist sub-tab, and unticking a setting such as "Add pages I create and files I upload to my watchlist". Thanks. Novem Linguae (talk) 23:51, 20 July 2022 (UTC)

Please see
Bot maintenance is no longer done. Also, the watchlist only comes from page creations and is almost certainly not used by the bot. SCIdude (talk) 15:05, 21 July 2022 (UTC)
@Novem Linguae I have followed your instructions and removed the settings for adding items to the watchlist. In the process I noticed the option to clear the full watchlist, and I selected that option. Either way, you have permission to wipe the watchlist if that did not take care of it. --Andrawaag (talk) 15:33, 21 July 2022 (UTC)

Not approved bot

Hi Andra, please apply for permission for user:AndrawaagBot at Wikidata:Requests for permissions/Bot. Multichill (talk) 11:28, 8 January 2023 (UTC)

Never mind, looks like the template is broken and doesn't link to Wikidata:Requests for permissions/Bot/AndrawaagBot 1. Multichill (talk) 11:32, 8 January 2023 (UTC)
+1 Andrawaag (talk) 11:36, 8 January 2023 (UTC)
Hi Multichill, I have already applied for permission for this bot, which was granted: Wikidata:Requests for permissions/Bot/AndrawaagBot 1 --Andrawaag (talk) 11:34, 8 January 2023 (UTC)

Wrong OBO link

Hi Andra,

the ontology link at Connects (Q16869054) doesn't work. Big bushlips (talk) 19:22, 25 January 2023 (UTC)