User talk:Daniel Mietchen

From Wikidata
On this page, old discussions are archived after 90 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2020/06.
The Echo notification system is usually maxed out for me across Wikimedia sites due to Wikidata-related notifications, which means I may not see it when you ping me; Special:Watchlist on Wikidata has similar issues.

Wikidata weekly summary #292

Wikidata weekly summary #293

Wikidata weekly summary #294

Wikidata weekly summary #295


I've made a version in Module:Cite/sandbox. Tonight I got familiar with what the code was doing and added the "display (and link to) the Wikidata ID" functionality. See User:Daniel Mietchen/Interesting publications/Part 1 #Paper collection. Obviously it can be formatted to your taste. I'll try to knock off your wish-list as time allows. --RexxS (talk) 19:47, 18 January 2018 (UTC)

@RexxS: That looks good — thanks! --Daniel Mietchen (talk) 01:53, 19 January 2018 (UTC)

Wikidata weekly summary #296

Item with P106=researcher only

Hi Daniel Mietchen,

Is there a way to add more statements to items like [1]? Their state makes it somewhat hard to match them with wp articles (sample).
--- Jura 10:18, 24 January 2018 (UTC)

The best tool we have for such cases is probably Scholia, which often identifies a more concrete field of work and through which papers can be found that might have affiliation information. I am keeping an eye on such items and adding additional information when I can. --Daniel Mietchen (talk) 22:42, 24 January 2018 (UTC)

Wikidata weekly summary #297

Wikidata weekly summary #298

Wikidata weekly summary #299

Wikidata weekly summary #300

Wikidata weekly summary #301

Wikidata weekly summary #302

Wikidata weekly summary #303

Please slow down

@Daniel Mietchen: Hi, you're currently making about 70 (rather large) edits per minute. This is stressing the change dispatch infrastructure, causing dispatch lag. Could you please slow down to under 30 edits per minute? Cheers, Hoo man (talk) 20:37, 14 March 2018 (UTC)

Thanks for the ping. My edit rates are fluctuating, but I have adjusted the sleep periods such that the maximum should not go above 30 for long. --Daniel Mietchen (talk) 21:52, 14 March 2018 (UTC)
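
The sleep-based throttling mentioned above can be sketched as a simple fixed-interval limiter. This is only an illustration, not the script actually used for these edits: the class name `EditThrottle`, the `max_per_minute` parameter, and the injectable `clock` and `sleep` functions are all assumptions made for the example.

```python
import time


class EditThrottle:
    """Space out calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute=30, clock=time.monotonic, sleep=time.sleep):
        # Minimum number of seconds that must separate two consecutive edits.
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last_edit = None

    def wait(self):
        """Block until enough time has passed since the previous edit."""
        now = self.clock()
        if self.last_edit is not None:
            remaining = self.min_interval - (now - self.last_edit)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_edit = now
```

Calling `wait()` before each edit keeps the rate at or below the limit even when individual edits complete quickly. A fixed interval is the most conservative choice; a sliding window would permit short bursts, which is what the dispatch-lag concern argues against.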

Wikidata weekly summary #304

Wikidata weekly summary #305

Wikidata weekly summary #306

Academic article retractions

There is a property you proposed for errata to academic papers, but retraction has two items, Q7316896 and Q45203135. How would one identify an article as retracted? I recall some software from four years ago to flag retracted papers when one goes to cite them. HLHJ (talk) 23:42, 6 April 2018 (UTC)

@HLHJ: Not sure corrigendum / erratum (P2507) is the best way to go here in the long run, but it can certainly be used for experimentation, perhaps with some qualifier. In terms of which retraction item to use, I would go for the document (retraction notice (Q7316896)) rather than the process (retraction (Q45203135)). To identify retractions, the best source is probably Retraction database (Q51603165) (with currently about 17k entries). --Daniel Mietchen (talk) 00:47, 8 April 2018 (UTC)
Thank you for the database link. Forgive me, I'm ignorant here. How would one best attach the retraction notice (Q7316896) to the scholarly article (Q13442814) that it's about?
I've found only one article labelled as retracted, RETRACTED: A light-emitting field-effect transistor. (Q30955338). It uses significant event (P793) with value retraction notice (Q7316896). Are you saying that corrigendum / erratum (P2507) would be better than significant event (P793) here? For comparison, the article Evolution in Mendelian Populations (Q5418627) has a statement with the property corrigendum / erratum (P2507) holding the value Evolution in Mendelian Populations (Q20746731) (confusingly not titled "Correction to 'Evolution in Mendelian Populations'").
Separately, I've noticed that automated OA-signalling seems to be running for some cites on Wikipedia, hurrah and congratulations! Where could I read about how it works? HLHJ (talk) 01:57, 8 April 2018 (UTC)
I gave that paper a try and think the significant event (P793) model works better. It should be using retraction (Q45203135), though, i.e. the action, not the document. Re OA Signalling, most of that automation is actually about papers that are free to read, not open in the sense of Open Definition (Q21605525) and thus not reusable on Wikimedia projects. In any case, you can read about the template approach (signalling the accessibility of the paper) here and the bot approach (linking to free-to-read versions) here. I welcome these efforts but do not contribute much, since my efforts in this space are focused on openly licensed stuff that can actually be (re)used on Wikimedia projects. Further, both approaches are focused on the English Wikipedia (though the templates are finding their way into other languages too), and I think information about accessibility should be curated on Wikidata and reused from there across Wikimedia projects, which is one of my motivations behind engaging with Wikidata:WikiProject Source MetaData and the closely related WikiCite. --Daniel Mietchen (talk) 03:22, 8 April 2018 (UTC)

Wikidata weekly summary #307

Wikidata weekly summary #308

Wikidata weekly summary #309

Wikidata weekly summary #310

Wikidata weekly summary #311

Wikidata weekly summary #312

Invalid DOIs imported from PubMed


I just noticed that Research Bot imports invalid DOIs from PubMed - I think it would make sense to make sure they start with "10." before adding them:

Pintoch (talk) 16:36, 18 May 2018 (UTC)

  • Thanks for checking. This is a known issue, and I have pinged PubMed and PMC about it. Agree it would be useful to fix on our end. --Daniel Mietchen (talk) 22:54, 18 May 2018 (UTC)
Nearing two years now, and I am still seeing some of the old IDs. Has there been a response from PubMed and PMC? I saw there was some cleanup, but I am still seeing more like Q52592126 (not all from Research Bot). Is there a plan or process to clean these up, or should they just be removed and rerun? Wolfgang8741 (talk) 10:34, 19 April 2020 (UTC)
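
The check suggested at the top of this thread (make sure a value starts with "10." before adding it) could look roughly like the sketch below. The pattern is a heuristic assumption rather than the full DOI syntax, and the name `looks_like_doi` is made up for this example; it is not taken from Research Bot.

```python
import re

# Assumption: a plausible DOI is "10.", a registrant code of 4 to 9 digits,
# a slash, and a non-empty suffix without whitespace. This is a heuristic
# and does not cover every DOI permitted by the DOI syntax.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def looks_like_doi(value):
    """Return True if `value` has the basic shape of a DOI."""
    return DOI_PATTERN.fullmatch(value.strip()) is not None
```

Values that fail the check can be skipped or logged for manual review instead of being written to the item.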

Wikidata weekly summary #313

Wikidata weekly summary #314

Wikidata weekly summary #315

Wikidata weekly summary #316

Parsed citations of the English wikipedia

Hi Daniel, Here is the dataset I was talking about: It does include author links in a parsed format. For instance:

{"PublisherName": "International Group of San Francisco", "Title": "Towards Anarchism", "URL": "", "Authors": [{"link": "Errico Malatesta", "last": "Malatesta", "first": "Errico"}], "ID_list": {"OCLC": "3930443"}, "Periodical": "MAN!", "PublicationPlace": "Los Angeles"}

Pintoch (talk) 08:16, 18 June 2018 (UTC)
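
As a rough illustration of how the author links in such a record could be read, here is a minimal sketch that parses the sample above and lists the authors that carry a link. The function name `linked_authors` is hypothetical, and the field names are taken from this one sample only.

```python
import json

# The sample record from above, kept as a single JSON string.
RECORD_LINE = (
    '{"PublisherName": "International Group of San Francisco", '
    '"Title": "Towards Anarchism", "URL": "", '
    '"Authors": [{"link": "Errico Malatesta", "last": "Malatesta", "first": "Errico"}], '
    '"ID_list": {"OCLC": "3930443"}, "Periodical": "MAN!", '
    '"PublicationPlace": "Los Angeles"}'
)


def linked_authors(record):
    """Return (display name, linked article title) pairs for authors with a link."""
    pairs = []
    for author in record.get("Authors", []):
        link = author.get("link")
        if link:
            parts = (author.get("first"), author.get("last"))
            name = " ".join(p for p in parts if p)
            pairs.append((name, link))
    return pairs


record = json.loads(RECORD_LINE)
print(linked_authors(record))
```

The linked article title is what one would then try to match against a Wikidata sitelink.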

Representation of Wikidata at the Wikimedia movement strategy process

Hi Daniel, I'm contacting you because I would like your support and your comments on my proposal to represent the Wikidata community at the Wikimedia movement strategy process. I'm contacting you in private because you are a member of the Wikidata Community User Group and I thought that this could be relevant for you.--Micru (talk) 18:47, 18 June 2018 (UTC)

Hi Micru. Saw this only now. Will take a look. --Daniel Mietchen (talk) 06:04, 26 June 2018 (UTC)

Wikidata weekly summary #317

Wikidata weekly summary #318

Wikidata weekly summary #319

Wikidata weekly summary #320

Wikidata weekly summary #321

Wikidata weekly summary #322

Wikidata weekly summary #323

Wikidata weekly summary #324

Wikidata weekly summary #325

Wikidata weekly summary #326

Translated titles from pubmed being incorrectly put as titles?

Hi, I noticed that translated titles of works originally in a different language, such as Studies on human physical capability and gross energy transfer during industrial and traditional work in tropical climate (Q52332146), added by User:Research Bot, are being added verbatim both as the title and the item title, including the brackets. Wondering if there's a way to fix this - at the very least we shouldn't have the brackets even if we don't have the original language title? Mvolz (talk) 16:36, 26 August 2018 (UTC)

@Mvolz: Thanks for checking this. It is a known problem that I do not know how to fix. The best I currently have in this regard is a query that will catch such cases. --Daniel Mietchen (talk) 19:32, 26 August 2018 (UTC)
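
One possible way to handle the brackets at import time is sketched below. It assumes that a translated title arrives wrapped in a single pair of square brackets, optionally followed by a period; that assumption and the function name `strip_translation_brackets` are illustrative and not part of Research Bot.

```python
def strip_translation_brackets(title):
    """Remove the square brackets wrapping a translated title, if present.

    Returns a (cleaned_title, was_bracketed) tuple so that callers can
    record that the title is a translation rather than the original.
    """
    text = title.strip()
    core = text.rstrip(".")  # tolerate a trailing period after the bracket
    if core.startswith("[") and core.endswith("]"):
        return core[1:-1].strip(), True
    return text, False
```

The flag in the return value matters because a de-bracketed translation is still not the original-language title, so it may be better stored as a label than as the title statement.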

Wikidata weekly summary #327

Wikidata weekly summary #328

Academic article conflicts of interest

I've just added Sugars, obesity, and cardiovascular disease: results from recent randomized control trials. (Q56479527) as an example for articles with conflicts of interest, and I'd be glad of your comments on what I got wrong. I wasn't sure how to name the supplements; when I made a list of them, I made up my own numbering, whereas here I expanded from the PubMed metadata. The editor, publication funder, and lead author are the same person, and are paid by industry groups with a financial interest in the paper topic. HLHJ (talk) 18:58, 5 September 2018 (UTC)

@HLHJ: The article already has a Wikidata item at Sugars, obesity, and cardiovascular disease: results from recent randomized control trials. (Q37521442), and I think the modelling there is OK. I am open to the idea that volumes and issues of series might get their own items, but haven't looked at that in detail. As for modelling the conflicts of interest, I think we need some new properties first, and I'm not sure we have an appropriate WikiProject to collaborate around such matters. --Daniel Mietchen (talk) 08:30, 6 September 2018 (UTC)
I'm sorry, I thought I had searched for that. Thank you for the information. I merged the two, resulting in a web and a paper publication date, two representations of the author, and two of the supplement, which I will try and figure out how to fix. If supplements do not get their own items, would each article need tagging with its sponsor? The motivation is a project by Headbomb to make a list of unreliable sources, called en:Wikipedia:CRAPWATCH; see en:Wikipedia talk:WikiProject Medicine#WP:CRAPWATCH: Early version for details. HLHJ (talk) 00:59, 9 September 2018 (UTC)
Yes, we tend to express sponsorship/ funding on a per-article level. --Daniel Mietchen (talk) 04:18, 9 September 2018 (UTC)
Reply at Wikidata talk:WikiProject Source MetaData#Conflict-of-interest metadata. I had previously written about COI metadata there, as it seemed the most relevant WikiProject, but got no response, and as I obviously don't know what I'm doing, I was hesitant to do much unadvised. Lest you are wondering why I am bothering you, en:Conflicts of interest in academic publishing has a photo of you discussing this unphotogenic topic, so I knew you took an interest in it. Thank you for your patience. HLHJ (talk) 19:36, 9 September 2018 (UTC)

Wikidata weekly summary #329

Wikidata weekly summary #330

Wikidata weekly summary #331

Wikidata weekly summary #332

Suggestion for Research Bot

I have a suggestion regarding your bot when creating items like Q38917924: is there a way for you to automatically include the Google Scholar paper ID (Property:P4028)? When you search for the article title on Google Scholar, it can be found hidden in the "Cited by..." link:,50&sciodt=0,50&hl=en. Maybe you can think of a way to automatically extract it and include it during entry creation. --Bender235 (talk) 22:06, 3 October 2018 (UTC)

I can think of such ways, but Google does not welcome their implementation. --Daniel Mietchen (talk) 05:49, 4 October 2018 (UTC)
You mean when the scraping is done at high frequency, or in general? If so, what shall we do about it? Google Scholar is certainly a popular and helpful tool. We should include this identifier (somehow) in my opinion. --Bender235 (talk) 14:21, 4 October 2018 (UTC)
They basically block anyone who tries to do such things at scale. --Daniel Mietchen (talk) 21:21, 4 October 2018 (UTC)
That's unfortunate. I posted a question on WikiProject Google about this. Maybe somebody is aware of a way to circumvent this. --Bender235 (talk) 13:53, 5 October 2018 (UTC)

Wikidata weekly summary #333

Unused researcher items

Any reason not to delete Q56960351 and Q56960986, which were part of one of your QS batches? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:34, 12 October 2018 (UTC)

No. Thanks for checking. --Daniel Mietchen (talk) 16:06, 13 October 2018 (UTC)

Karel Berka

Karel Berka (Q43370830) has an ORCID iD, so cannot be Karel Berka (Q884500), who died in 2004. I have demerged them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:29, 13 October 2018 (UTC)

For a similar reason, I reverted you here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:35, 13 October 2018 (UTC)
@Pigsonthewing: Thanks for checking, in both cases. --Daniel Mietchen (talk) 16:07, 13 October 2018 (UTC)
And again, twice more, on Q884500. Please can you find a way to exclude it from whatever is triggering these edits? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:37, 18 October 2018 (UTC)
These are ORCIDator batches that I started weeks ago, so I cannot change them now. The problem seems to be that some papers have been assigned to the wrong person here, and the tool then associates the Wikidata entry for the author with the ORCID where the DOI of the paper is listed. So some 70 paper items need to be cleaned regarding authorship; I put this on my to-do list. --Daniel Mietchen (talk) 20:01, 19 October 2018 (UTC)
I have now fixed the authorship for all 77 papers currently indexed in Wikidata. --Daniel Mietchen (talk) 23:51, 20 October 2018 (UTC)
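
The sanity check behind the demerge (a person who died in 2004 cannot have an ORCID iD) could be automated along the lines of the sketch below. It relies on the ORCID registry having opened in 2012; the function name, the argument names, and the placeholder iD in the usage note are assumptions for illustration, not part of ORCIDator.

```python
# The ORCID registry opened in October 2012, so nobody who died in an
# earlier year can have registered an iD.
ORCID_LAUNCH_YEAR = 2012


def orcid_conflicts_with_death(orcid_id, death_year):
    """Return True if an item has an ORCID iD and a death year before ORCID existed."""
    return (
        bool(orcid_id)
        and death_year is not None
        and death_year < ORCID_LAUNCH_YEAR
    )
```

For example, `orcid_conflicts_with_death("0000-0000-0000-0000", 2004)` flags the conflict (the iD here is a placeholder), which points to the papers' author statements as the thing that needs fixing.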

There's a similar issue on Alessandro Zorzi (Q2832828). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:40, 8 November 2018 (UTC)

And again on Q2832828. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:12, 13 November 2018 (UTC)
This seems to stem from multiple papers having been linked to this person: Special:WhatLinksHere/Q2832828. On that basis, ORCIDator then infers that this person must be the one on whose ORCID profile the papers were listed. The P50 statements on the papers need to be fixed. --Daniel Mietchen (talk) 19:44, 13 November 2018 (UTC)

Wikidata weekly summary #334

Wikidata weekly summary #335

Wikidata weekly summary #336