User talk:Multichill/Archives/2014/December


Addition of P641 to players' entries

Why does the bot add sport (P641) to an instance of a human being? This is an example at Robert Huth (Q155461). Wouldn't it be better to add occupation (P106) and association football player (Q937857) to football players' entries where it doesn't (yet) exist? Jared Preston (talk) 16:29, 10 November 2014 (UTC)

Sorry Jared, looks like I didn't reply to you. I was hunting down items without claims. Turned out we had thousands and thousands of sport related items that didn't have any claims. I added the sport to those items and some other items that happened to be in the same category tree. With a sport I don't know what someone's role is. Is it a player? A coach? Etc. You should also add occupation to people. Hmm, looks like someone is replacing it. That's not good...... Multichill (talk) 09:56, 6 December 2014 (UTC)

Range maps imported as images

We have noticed that some items about animals have a range map set as image (P18) instead of P181. This is causing some concern because cawiki taxoboxes use Wikidata P18 as the image by default.

The few that I've checked were imported by BotMultichill from nlwiki (example: https://www.wikidata.org/w/index.php?title=Q2571127&diff=121810874&oldid=100179965 ), where, I suppose, somebody used the image field in taxoboxes to place range maps. I don't know how many wrong images have been imported, but in http://wdq.wmflabs.org/api?q=tree[729][150][171,273,75,76,77,70,71,74,89]&props=18 I can see a few files with "distribution" or "range map" in their names.

I'm not sure how to fix the problem or how to prevent it from happening again.--Pere prlpz (talk) 11:26, 30 November 2014 (UTC)

Hi Pere prlpz, image (P18) says "relevant illustration of the subject; if available, use more specific properties". So it's very generic and might contain anything. I think in this case you should just replace it with taxon range map image (P181), I'm pretty sure the bot won't import it again. Multichill (talk) 10:08, 6 December 2014 (UTC)
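To find candidates for moving from image (P18) to taxon range map image (P181), one option is to filter the P18 file names on the keywords mentioned above. The following is only a rough sketch: the function name and the exact pattern are assumptions, and a file name match would still need a human check before anything is moved.

```python
import re

# Only "distribution" and "range map" come from the discussion above;
# the tolerance for "_" or no separator is an assumption for illustration.
RANGE_MAP_PATTERN = re.compile(r"distribution|range[ _]?map", re.IGNORECASE)


def looks_like_range_map(filename):
    """Return True if the file name hints at a range map rather than a photo."""
    return RANGE_MAP_PATTERN.search(filename) is not None


def range_map_candidates(p18_values):
    """Given a mapping of item id -> P18 file name, keep the suspicious ones."""
    return {
        qid: filename
        for qid, filename in p18_values.items()
        if looks_like_range_map(filename)
    }
```

The result is a list of items to review, not a list of confirmed errors, since a photo may well have "distribution" in its title for other reasons.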

Paintings with incorrectly added creators

I've found some cases where BotMultichill added the wrong creator (P170), apparently when there wasn't yet an item for the true creator. (The cases below are now fixed, but you can see the previous state in the history.)

It might be worth checking to see whether any more are similarly affected.

(Incidentally, are the last two pairs of different photographs of the same paintings, or original duplicates of the paintings themselves?)

All best, Jheald (talk) 01:01, 5 December 2014 (UTC)

Hi James, I expected some of those, but not too many. The bot just checks whether the name is the same and whether the item appears to be a person. My idea is to apply somewhat more advanced heuristics afterwards to hunt down paintings whose creator lived in a different period or doesn't have painter as occupation. I haven't gotten to that step yet.
The Amsterdam Museum (Q1820897) and Rijksmuseum (Q190804) pair is an interesting one. The Amsterdam Museum manages the collection of the city of Amsterdam. Some of these works are on long-term loan to the Rijksmuseum, so one painting will show up in both collections and the two items need to be merged. We already found quite a few of them. I'll merge these items. Multichill (talk) 09:21, 6 December 2014 (UTC)
@Multichill: These weren't near misses -- they were wildly different names. And I've seen three or four more since, that I didn't note down. The characteristic is that a correct creator is given as free text in the description field, but the Q-number given for the creator property is quite wrong -- see eg diff.
It looks as if the bot failed to find a correct Q-number, but added one anyway, perhaps using whichever Q-number it had previously been adding. Jheald (talk) 11:03, 6 December 2014 (UTC)
James, look at the history. My bot didn't add these claims, Jane did. Wrong talk page ;-). She'll see the ping. Multichill (talk) 11:11, 6 December 2014 (UTC)
My apologies. You're quite right. Jheald (talk) 11:15, 6 December 2014 (UTC)
Thanks for the alert! I'll take a look. Hopefully I didn't make too many mistakes!!! Jane023 (talk) 11:30, 6 December 2014 (UTC)
Hmm I thought I may have screwed up a run of about 50 but I could only find about 6 or 7. Let me know if you find any more, because I am still not sure how these happened. Jane023 (talk) 14:11, 6 December 2014 (UTC)

Update: I think I found the spot where I did this, and I think I have fixed it! I corrected a few more. Jane023 (talk) 09:39, 7 December 2014 (UTC)
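The follow-up check Multichill describes (flag a creator who lived in a different period or lacks painter as occupation) could look roughly like the sketch below. It is not the bot's actual code: the `Person` class, the function name and the use of plain years are assumptions for illustration, and the Q-id used for "painter" is likewise an assumption to be verified.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed Q-id for the occupation "painter"; verify before using it for real.
PAINTER = "Q1028181"


@dataclass
class Person:
    """Hypothetical, simplified view of a creator item."""
    qid: str
    occupations: List[str] = field(default_factory=list)
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def creator_issues(person, painting_year=None):
    """Return reasons why this person is a doubtful creator (empty if none)."""
    issues = []
    if PAINTER not in person.occupations:
        issues.append("no painter occupation")
    if painting_year is not None:
        if person.birth_year is not None and painting_year < person.birth_year:
            issues.append("painted before birth")
        if person.death_year is not None and painting_year > person.death_year:
            issues.append("painted after death")
    return issues
```

An empty list only means that nothing looked suspicious. Unknown dates are skipped rather than treated as a mismatch.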

Merge.js

Can you check whether it is working now? I removed the delete function, ran 10+ tests, and it works. --DangSunM (talk) 04:38, 5 December 2014 (UTC)

Yes, DangSunM. Just merged two items. It works! Multichill (talk) 09:38, 6 December 2014 (UTC)
Thanks, now all done!:)--DangSunM (talk) 03:28, 7 December 2014 (UTC)

Duplicate painters

Your bot seems to be creating a fair number of duplicate items for Rijksmuseum painters, eg

just in the first couple that I looked at, that I went ahead and merged. Do you have a strategy for de-duplicating these (ie are you running an intentional policy to create first then search & merge, which can make some sense as an efficient strategy), or is it just that too many merge candidates are being missed?

One problem seems to be that creator death dates may not in the past have been systematically marked as 'approximate' or 'after this date' when they have been imported, which may mean that filters requiring exact matches are too strict, and creating too many inappropriate new items. Jheald (talk) 19:47, 5 December 2014 (UTC)

I got a list of Rijksmuseum painters from the Rijksmuseum. I keep track of them at User:Multichill/Rijksmuseum creators. It's bot created and bot edited so it has strict rules for humans (otherwise the bot just ignores it). This list gets updated based on the name of the painter. Updates need to be in the form of the regex ^\*\s(.+)\s->\s\[\[(Q\d+)\]\]$. I already matched a lot of them before I started creating the new items.
I figured the duplicate rate would probably be a bit higher (10%?), but the items would be pretty complete and merging is easy. Let's see if that holds up. This bug frustrates the process a bit because the merge might remove the label which is used by the bot to find the painter in the first place.
I'm probably first going to try to get all the paintings matched. Help appreciated. Probably next is to make a list of painters who have a work in the Rijksmuseum, but don't have a RKDartists ID (P650) statement yet. Multichill (talk) 09:49, 6 December 2014 (UTC)
That's smart. The Rijksmuseum seems to use the same standard name-forms as RKD-artists, so that may be quick to auto-match, and then as the RKD artists coverage continues to improve for the rest of the painters on Wikidata, you will be able to identify duplicates through the constraint violations. Clever. Jheald (talk) 11:31, 6 December 2014 (UTC)
James / Jane: Feel like a puzzle? Could use some help at User:Multichill/Rijksmuseum creators RKD. Multichill (talk) 23:54, 6 December 2014 (UTC)
Is this an output list? I added the RKDartists 85395 number to Jan Baptist Wolfaerts (Q18608189) - is this what you want? Or do you just want the numbers of the artist pages (it seems like you have made a backwards Mix-n-Match) Jane023 (talk) 09:56, 7 December 2014 (UTC)
Yes, Jane, I want to add RKDartists to these items. This is quite easy because almost every link is a direct hit. Yes, it's similar to mix-n-match, but it's only a small targeted subset. We could do the same for Frans Hals or Teylers Museum. Multichill (talk) 14:46, 7 December 2014 (UTC)
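For anyone editing User:Multichill/Rijksmuseum creators by hand, the regex Multichill quotes above can be tried out as follows. This is a minimal sketch and not the bot's own code: the function name is made up, and the sample text reuses Jan Baptist Wolfaerts (Q18608189) from this thread only to show the line format.

```python
import re

# The regex exactly as quoted in the discussion: "* <name> -> [[Q123]]".
LINE_RE = re.compile(r"^\*\s(.+)\s->\s\[\[(Q\d+)\]\]$")


def parse_creator_matches(wikitext):
    """Map painter name -> Q-id for every line that fits the strict format."""
    matches = {}
    for line in wikitext.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines that do not fit are silently ignored, as described above
            matches[m.group(1)] = m.group(2)
    return matches


sample = "\n".join([
    "* Jan Baptist Wolfaerts -> [[Q18608189]]",
    "* Jan Baptist Wolfaerts -> Q18608189",       # missing [[ ]]: ignored
    "* Jan Baptist Wolfaerts -> [[Q18608189]] ",  # trailing space: ignored
])
print(parse_creator_matches(sample))
```

Because the pattern is anchored with ^ and $, even a trailing space makes a line fail to match, which is why the format has to be followed exactly.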

DBNL

It looks like you only imported the infobox parameter back then. Could you also import the parameter from the standalone template link dbnl auteur? Thanks in advance, Sjoerd de Bruin (talk) 11:40, 9 December 2014 (UTC)

As discussed on IRC: I'm now importing Digitale Bibliotheek voor de Nederlandse Letteren author ID (P723). Quite a lot of hits. I've put the constraint report on my watchlist. I'm curious whether anything odd will turn up there. Multichill (talk) 16:24, 17 December 2014 (UTC)

Duplicate Italian communes?

I will take care of it in the next couple of weeks. --Dcirovic (talk) 14:39, 11 December 2014 (UTC)

Answered at User talk:Dcirovic#Duplicate Italian communes? to keep the conversation in one location. Multichill (talk) 16:29, 17 December 2014 (UTC)

Bot

Why has it stopped? 😊 Sjoerd de Bruin (talk) 16:05, 28 December 2014 (UTC)

The bot is done. It has ploughed through the whole file and, by my rough count, added about 140,000 new claims. Multichill (talk) 16:51, 28 December 2014 (UTC)
Nice. Unfortunately there's no new report today, but from now on it will only get smaller! Sjoerd de Bruin (talk) 18:17, 28 December 2014 (UTC)