User talk:Daniel Mietchen

From Wikidata


On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2017.

Understanding how the Wikidata community is editing items

Dear Daniel,

Together with some colleagues, I am running a survey to understand the way Wikidata editors edit items over time. We would like to know the extent to which you choose the items you want to edit, the criteria that you use to decide what to edit, the situations that trigger your edits, and the way these decisions change over time.

Before we conduct the survey, we would like to be sure that the questions are clear, and we would like to get some feedback from Wikidatans. Given your expertise, I am writing to you directly to ask if you could answer the survey and give us feedback.

You will need around 10 minutes to complete the survey.

We are not interested in the edits of particular users, but rather in the editing strategies being followed by the community of editors. That’s why the responses to this survey are anonymous.

We plan to publish the anonymous results openly. We will share the results with the Wikidata community.

Thanks a lot in advance for your collaboration!

Link to the survey: https://docs.google.com/forms/d/e/1FAIpQLScmxdvsyupNDjhzV-JodQgiscXQShksczns0PGmQbLXWpb3cw/viewform

Cristina Sarasua <csarasua@uni-koblenz.de> --criscod (talk) Institute for Web Science and Technologies, University of Koblenz-Landau, Germany Member of Wikimedia Deutschland

Gianluca Demartini <g.demartini@sheffield.ac.uk> Information School, University of Sheffield

Just for the record, and to get the archiving working properly: the above entry was posted by User:Criscod on 28 August 2016. --Daniel Mietchen (talk) 15:13, 20 October 2017 (UTC)
Thanks. I should have added that in the signature back then, you are right. :) criscod (talk) 12:21, 22 October 2017 (UTC)

Wikidata weekly summary #280

Missing authors

Hi. I noticed that in Q25909434 there are many authors missing (you can verify this e.g. on NCBI). I don't know if that type of omission is common, but I am hoping you know some effortless way of fixing that. --22merlin (talk) 14:28, 6 October 2017 (UTC)

Thanks for the notification. It's not the only such case, but they are very rare. I have tried to fix it for this article, but the tools we have could not do it, since the author list provided by the PubMed API ends with "Abecasis, Gonçalo R", listing everyone else as investigator instead. The CrossRef API handles this differently, but I don't have a way to harvest that in a simple fashion. --Daniel Mietchen (talk) 23:11, 6 October 2017 (UTC)
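
For illustration only, here is a minimal Python sketch of the comparison discussed above: spotting authors that a Crossref record lists but the PubMed author list lacks. It assumes the Crossref work record has already been fetched and that its `author` array carries `given` and `family` keys. The function names, the naive name normalisation and the sample data (including "Doe, Jane") are hypothetical and are not part of any existing tool.

```python
import unicodedata


def normalize(name):
    """Strip accents, commas and periods, collapse whitespace, lowercase."""
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(plain.replace(",", " ").replace(".", " ").lower().split())


def crossref_author_names(message):
    """Turn a Crossref-style work record into 'Family, Given' strings."""
    names = []
    for author in message.get("author", []):
        if "family" in author:
            given = author.get("given", "")
            names.append(f"{author['family']}, {given}" if given else author["family"])
        elif "name" in author:
            # Organisational authors carry a single 'name' field instead.
            names.append(author["name"])
    return names


def missing_from_pubmed(pubmed_names, message):
    """List Crossref authors whose normalized name is absent from pubmed_names."""
    known = {normalize(n) for n in pubmed_names}
    return [n for n in crossref_author_names(message) if normalize(n) not in known]


# Hypothetical sample data, not taken from the real record for Q25909434.
sample_message = {
    "author": [
        {"given": "Gonçalo R", "family": "Abecasis"},
        {"given": "Jane", "family": "Doe"},
    ]
}
print(missing_from_pubmed(["Abecasis, Gonçalo R"], sample_message))
```

Exact string matching after normalisation is a deliberate simplification: real records often differ in initials or name order, so any candidates found this way would still need manual review.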

Wikidata weekly summary #281

Taxon items with audio files

Hello, a volunteer and I have been working to share audio files from the Natural History Museum in London. As a result there are some 1,750 new audio files on Commons. Magnus Manske came up with this tool showing where file names correspond to items on Wikidata. As you've got a list of taxon items with audio files I thought this collection of files might be relevant. Thinking about adding the audio files to items, do you know of a relevant community who might be interested in that? Richard Nevell (WMUK) (talk) 12:34, 10 October 2017 (UTC)

Great work, and thanks for the ping! Yes, I'm definitely interested, and so is User:Pigsonthewing. It would be great to have a set of queries to identify accounts that have uploaded the audio files on that Wikidata list, or added them to items or other wiki pages. Worth a try might also be the various biology or audio WikiProjects. Great stuff for a hackathon/ editathon too. --Daniel Mietchen (talk) 12:53, 10 October 2017 (UTC)
PS: We could also think of opening a new channel in WikiRadio. --Daniel Mietchen (talk) 12:55, 10 October 2017 (UTC)
Good thinking about WikiRadio, that would be a nice way of presenting some of the audio.
Magnus has also created this tool which shows when an item on a taxon has an audio file but the corresponding Wikipedia page doesn't, which should help with filtering the files through to various wikis. Richard Nevell (WMUK) (talk) 13:42, 10 October 2017 (UTC)
Cool. I've done a few already and noticed that the metadata is very often in the audio track, which does not make it suitable for WikiRadio, nor for non-English wikis. --Daniel Mietchen (talk) 13:48, 10 October 2017 (UTC)
ah, you're right - most of them would need trimming for use in other wikis. Thank you for making those edits! Richard Nevell (WMUK) (talk) 14:06, 10 October 2017 (UTC)
@Richard Nevell (WMUK): They should be trimmed for use in Wikidata too; and the introductions transcribed into the text descriptions on Commons. The original files should be retained, and the trimmed versions uploaded as derivatives, under a new name. Do we have any idea of how many have spoken introductions; and which files are affected? If the answer to the latter is no, the first job would be to check, and add them to a specific category. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:38, 11 October 2017 (UTC)
It should be pretty much all of them, though there are a small proportion where there's already been some trimming so it's just the animal sound. Richard Nevell (WMUK) (talk) 16:01, 11 October 2017 (UTC)
Looks like the seed for an audio trim-athon... --Daniel Mietchen (talk) 16:12, 11 October 2017 (UTC)

Wikidata weekly summary #282

Data about citations

When it comes to creating items of new papers, there are seldom statements about citations. Is that because your data source doesn't contain information about citations? ChristianKl (talk) 16:00, 18 October 2017 (UTC)

Yes. Also, while the bibliographic data is stable, citation data is less so (i.e. the cited references may or may not have a Wikidata item initially, and that can change over time, and due to things like Initiative for Open Citations (Q29188397), even the licensing of citation information may change), so it makes some sense to have different tooling for bibliographic and citation data. Most of the latter is being brought here by User:Harej in case you'd like to dig deeper into that. --Daniel Mietchen (talk) 18:15, 18 October 2017 (UTC)

stop for a while please

Hey :) We currently have a very high dispatch lag. This causes changes to show up on Wikipedia only very late. This is not ok. Can you please slow down for a while until it is down? You can check it here: Special:DispatchStats. --LydiaPintscher (talk) 18:59, 19 October 2017 (UTC)

OK, will do, though I don't think my bot's edits have an effect on the dispatch lag. --Daniel Mietchen (talk) 19:34, 19 October 2017 (UTC)
They should not, yeah but something is fishy and I am trying to narrow it down. Thanks a lot! --LydiaPintscher (talk)
Makes sense. I'll keep an eye on the edits and the job queue. --Daniel Mietchen (talk) 20:47, 19 October 2017 (UTC)
Cool! There is also https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch?refresh=1m&orgId=1&panelId=17&fullscreen The graph at the bottom should be green for the lag to go down. --LydiaPintscher (talk) 20:54, 19 October 2017 (UTC)

Wikidata weekly summary #283

Wikidata weekly summary #284

Wikidata weekly summary #285

Wikidata weekly summary #286

New items

Hi Daniel Mietchen. As I'm trying to create 100 items in sequence, I was wondering when your bot will not be running. Currently it fills Special:NewPages. It can wait a couple of days.
--- Jura 10:54, 17 November 2017 (UTC)

Interesting. What would you need 100 consecutive items for? Have you checked whether any user has created 100 (still existing) items in a row already? I guess a good number of bots would meet this criterion, probably including mine. It's currently running continuously, but I can of course stop it any time. A convenient time for me to do that would be on Tuesday Nov 28 after 9pm CET. Would that be still OK or too late for your purposes? --Daniel Mietchen (talk) 23:42, 17 November 2017 (UTC)
100 Decameron stories .. ordinals have them .. not that it's really important, but as there are 100, why not. Ok for Tue.
--- Jura 23:47, 17 November 2017 (UTC)
Just saw that is Nov 28. Let's coordinate here then. --Daniel Mietchen (talk) 02:42, 18 November 2017 (UTC)
Oh in 10 days? Could we do this weekend instead?
--- Jura 07:28, 18 November 2017 (UTC)
Not sure - running an event in a different time zone. Will try tonight though, and ping you once I've stopped the bot. What time do you plan to be around tomorrow (Sunday) UTC? --Daniel Mietchen (talk) 21:22, 18 November 2017 (UTC)
@Jura1: — I just stopped it. Will resume tomorrow. --Daniel Mietchen (talk) 21:30, 18 November 2017 (UTC)
Thanks. It worked out.
--- Jura 15:30, 19 November 2017 (UTC)

comment on Pre-clustering of the B cell antigen receptor demonstrated by mathematically extended electron microscopy. (Q42128282)

Hi, I randomly (literally) arrived at Pre-clustering of the B cell antigen receptor demonstrated by mathematically extended electron microscopy. (Q42128282), which your academic bot created. I quickly found that 3 of the 6 authors have a Wikidata item (naturally, the last 3 named authors) and manually migrated them from author name string (P2093) to author (P50). I don't know how often this happens, but I thought you had better know about it, and maybe even find a way to minimize the loss of data. Good luck and thanks, DGtal (talk) 10:28, 20 November 2017 (UTC)

Thanks for checking. The bot takes the information from PubMed (Q180686) or PubMed Central (Q229883), and if author identification by way of ORCID iD (P496) is provided there for an author that has a Wikidata entry, the item about the paper will link to the item about the author by way of P50. Otherwise, just the string of the author's name will be recorded by way of P2093. No need to do the conversion in an entirely manual fashion — we have a tool that can help considerably, though manual oversight is still warranted. I am using it all the time, and I recommend that you give it a try. --Daniel Mietchen (talk) 03:23, 21 November 2017 (UTC)
Thanks for the info. Unfortunately ORCID is still much less common than it should be, so there are probably thousands of misses by now. DGtal (talk) 12:30, 21 November 2017 (UTC)
Yes, we have over 1 million P50 statements versus 43 million P2093 statements, and we're actively reaching out to institutions and libraries to share with Wikidata the author disambiguation and related curation work they are already doing. If you can think of potential partners in Israel or elsewhere who would be interested in this, I'd be happy to dig deeper. --Daniel Mietchen (talk) 21:31, 21 November 2017 (UTC)
I can't elaborate too much but the current infrastructure in Israeli academia doesn't have the relevant data yet, but should have it in a few years, so we need to wait a while. DGtal (talk) 09:19, 23 November 2017 (UTC)
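
As a rough illustration of the rule described in this thread (link via P50 when the source gives an ORCID iD that matches an existing author item, otherwise record the name via P2093), a minimal Python sketch might look like the following. It is not the bot's actual code: the function names, the `orcid_to_qid` lookup table and the sample values are all hypothetical placeholders.

```python
def author_statement(name, orcid, orcid_to_qid):
    """Pick the property and value for one author of a paper.

    orcid_to_qid is an assumed lookup from ORCID iD to the item ID of an
    existing author item; how it is built is out of scope here.
    """
    qid = orcid_to_qid.get(orcid) if orcid else None
    if qid:
        return ("P50", qid)    # author: link to the existing item
    return ("P2093", name)     # author name string: keep the plain name


def author_statements(authors, orcid_to_qid):
    """Apply the rule to every author, preserving the source order."""
    return [
        author_statement(a["name"], a.get("orcid"), orcid_to_qid)
        for a in authors
    ]


# Placeholder data for illustration; neither the iD nor the item ID is real.
lookup = {"0000-0000-0000-0000": "Q_PLACEHOLDER"}
authors = [
    {"name": "A. Example", "orcid": "0000-0000-0000-0000"},
    {"name": "B. Example"},
]
print(author_statements(authors, lookup))
```

The sketch makes the gap discussed above visible: any author without an ORCID iD in the source falls through to P2093, even if an item for that person already exists.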

Wikidata weekly summary #287