User talk:Magnus Manske



all talk archives and subpages user pages · m:special:CentralAuth · JIRA


Welcome to Wikidata, Magnus Manske!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed JavaScript tools to allow for easier completion of some tasks.

If you have any questions, please ask me on my talk page. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards! --Guerillero | Talk 21:59, 2 January 2013 (UTC)

Proposed improved/alternative version of Sourcery

Hi,

I like the intention of the Sourcery tool, since adding reference statements on WD is very tedious to do by hand and very difficult to automate. However, when trying to source from the references in Wikipedia articles, the signal-to-noise ratio is very low. IMHO, there is a much better source to work from: authority controls.

Think of it as a mashup between Mix'n'match and the existing Sourcery: same principle as the current tool, but presenting the page on the AC website according to the relevant property on the item, and sourcing to that AC.

Not all ACs would be suited to that, but some are very good. For instance, the BNF can have gender, nationality, profession, and date/location of birth and death. Adding these by hand to a complete item takes me a few minutes; that could be replaced by a few clicks.

What do you think? Feasible?

(and thanks again for all the tools)

--LBE (talk) 18:29, 11 June 2015 (UTC)

Sounds good. For items that have e.g. BNF ID but no source for gender, sources could even be added by a bot if matching Wikidata. Traveling now, will have a look later. --Magnus Manske (talk) 10:26, 12 June 2015 (UTC)
Strong support - BNF is a very good source for gender, work language, nationality, date/location of birth and death, and also pseudonyms/birth names. Sometimes also for family members and profession. :) --Hsarrazin (talk) 20:26, 29 June 2015 (UTC)
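
The bot idea mentioned above (add a source only where the authority record agrees with Wikidata) could look roughly like the following. This is a minimal sketch, not how Sourcery or any existing bot works: the function name and the flat dictionaries are hypothetical, and it assumes the Wikidata and authority-control values have already been fetched and normalised to comparable strings.

```python
from typing import Dict, List, Set


def properties_to_source(
    wikidata_values: Dict[str, str],
    authority_values: Dict[str, str],
    already_sourced: Set[str],
) -> List[str]:
    """Return the properties whose Wikidata value matches the authority
    record and which do not have a reference yet (hypothetical helper)."""
    matches = []
    for prop, value in wikidata_values.items():
        if prop in already_sourced:
            continue  # the statement already carries a reference
        if authority_values.get(prop) == value:
            matches.append(prop)  # values agree, so the record can be cited
    return sorted(matches)


# Illustrative, made-up values for one item.
wikidata = {"P21": "Q6581097", "P569": "1850-01-01", "P27": "Q142"}
authority = {"P21": "Q6581097", "P569": "1851-01-01", "P27": "Q142"}
print(properties_to_source(wikidata, authority, already_sourced={"P27"}))
```

Only statements that agree and are still unsourced are returned; mismatches such as the date of birth in the example are left for a human to review.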

Reasonator

Hi Magnus,

"Birth name" with (OBSOLETE) birth name (use P1477) (P513) is being used by Reasonator. It's now being replaced by birth name (P1477). Would you kindly include that as well?

BTW, maybe you want to extend the layout for P31:Q5 to fictional human (Q15632617). --- Jura 17:59, 20 June 2015 (UTC)

Asked for that like a half year ago. Seems like the data type requires a lot of work. Sjoerd de Bruin (talk) 21:16, 20 June 2015 (UTC)
In this specific case, it could be included as the current string value. --- Jura 21:56, 21 June 2015 (UTC)

Had some RL issues, winding down now. Will look into both issues soon-ish. --Magnus Manske (talk) 12:29, 23 June 2015 (UTC)

Would be great! I've added some others to BitBucket. Sjoerd de Bruin (talk) 09:51, 24 June 2015 (UTC)
Did some work. --Magnus Manske (talk) 19:04, 24 June 2015 (UTC)
Nice. Thanks.
How can we make Google+ and reddit link as well?
There is a problem with youtube: some addresses are youtube.com/user/<name> others are youtube.com/<name> or even just youtube.com/channel/dasfsdfs . I added the url as "reference URL". --- Jura 07:50, 29 June 2015 (UTC)
The social networks are different, as they are used as qualifiers and do not seem to have a URL schema. I'll have to add them manually to the code right now. --Magnus Manske (talk) 08:28, 29 June 2015 (UTC)
Do you think it would make sense to have separate properties (at least for the more common ones)? It would let us add constraints and URL patterns and would make them easier to enter (since you would only need to search for one property, not two properties and an item) and they're all separate sites with separate APIs so it doesn't really make sense to me to treat them as a single thing anyway. (@Hoo man: You mentioned wanting to split these on IRC, IIRC) - Nikki (talk) 09:09, 29 June 2015 (UTC)
I haven't found the time yet (and probably won't anytime soon) to start that large discussion… but I definitely think that we should split P553 sooner rather than later. Working with that is just awful. - Hoo man (talk) 01:32, 30 June 2015 (UTC)
Just adding the URLs directly as qualifier would probably be easier, but do we really want to encourage this type of content? --- Jura 09:54, 30 June 2015 (UTC)
I don't see why we would want to discourage it. It's factual data, sources can be found which say that it's the item's account. It's relevant data, millions of people use social networks. It's useful to other Wikimedia projects, e.g. to support templates like Template:Twitter (Q6741634) and Template:Facebook (Q5624646). It's useful to other users of the data, most of these sites have APIs.
I don't think adding the URLs themselves is a good solution. As I've said, these sites usually have APIs. Just like Magnus had to manually add URL formatting because we don't currently provide it, storing the URLs produces the opposite problem, people who need just the ID or username to query an API will have to manually extract it from the URL each time. It's best to store the data in a structured form. It also makes it much easier to maintain when sites change their URLs. - Nikki (talk) 11:40, 30 June 2015 (UTC)
I don't see how that solves the current youtube issue. Not sure if the current approach works for linkedin either. --- Jura 11:57, 30 June 2015 (UTC)
If the current situation doesn't work for some sites, isn't that just another reason to split this property? :) - Nikki (talk) 12:48, 30 June 2015 (UTC)
Into three identifiers for YouTube? :) :) Note: pages at "youtube.com/user/<name>" can be different from "youtube.com/<name>". There are even some at "youtube.com/user/channel/dasfsdfs" that are different from "youtube.com/channel/dasfsdfs". --- Jura 15:57, 4 July 2015 (UTC)
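
To illustrate why the YouTube address forms mentioned in this thread cannot be treated as a single identifier, here is a minimal sketch that splits a URL into a kind and a value. The function name and the "custom" label are hypothetical, and only the URL shapes quoted above are handled.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def classify_youtube_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (kind, value) for the URL shapes discussed above, else None."""
    # Allow scheme-less input such as "youtube.com/user/foo".
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "youtube.com":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) == 2 and parts[0] in ("user", "channel"):
        return (parts[0], parts[1])  # youtube.com/user/<name> or /channel/<id>
    if len(parts) == 1:
        return ("custom", parts[0])  # youtube.com/<name>
    return None  # e.g. youtube.com/user/channel/<id> is left unclassified


print(classify_youtube_url("youtube.com/user/example"))
print(classify_youtube_url("youtube.com/example"))
```

Because "user" and "custom" names can point to different pages, the kind has to be stored alongside the value; a bare name is ambiguous.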

The somewhat odd links for Freebase identifier (P646) don't seem to work on Reasonator. --- Jura 15:57, 4 July 2015 (UTC)

Listeriabot

I agree with Multichill, above. Love this bot. I also have two extra things I would like to be able to do with the lists:

  • Possibility to number the rows of the list.
  • Possibility to sort names in a column by last name instead of first name.

See for example this list, where I would like to have a list sorted by last name and numbered, so that I can see the number of Filipino senators in the blink of an eye, instead of having to count the rows myself. Hope you can help out. Magalhães (talk) 06:00, 28 June 2015 (UTC)
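
The two requests above (numbered rows, sorting by last name) can be sketched as follows. This is only an illustration with made-up names, not Listeria code; it assumes, naively, that the last whitespace-separated word of a label is the family name, which will not hold for every name.

```python
from typing import List, Tuple


def sort_and_number(names: List[str]) -> List[Tuple[int, str]]:
    """Sort labels by their last word and number the rows from 1."""

    def last_name_key(name: str) -> Tuple[str, str]:
        words = name.split()
        last = words[-1] if words else ""
        # Fall back to the full label to keep ties in a stable order.
        return (last.casefold(), name.casefold())

    return list(enumerate(sorted(names, key=last_name_key), start=1))


# Hypothetical labels, for illustration only.
for number, name in sort_and_number(["Ana Cruz", "Ben Abad", "Carla Bautista"]):
    print(number, name)
```

The number on the last row is then the total count of entries in the list.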

QuickStatements : quantity

Hi Magnus,

I tried to add quantities with the following formats:

Adele Arakawa   P1971   +1,     S143    Q328
Anna Kinberg Batra      P1971   1       S143    Q328

Both work fine except that the "S143 Q328" part gets skipped. --- Jura 07:50, 28 June 2015 (UTC)

This was also reported by someone on Bitbucket [1]. It looks like a similar problem to the one in #URL in QuickStatements above, except here it seems to be because that part of the code doesn't know how to compare quantities. - Nikki (talk) 15:48, 28 June 2015 (UTC)
It already fails on qualifiers, making it impossible to add population data with the tool. --- Jura 16:24, 4 July 2015 (UTC)
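
For readers unfamiliar with the row format used above, here is a minimal sketch of how such a tab-separated line could be split into its parts. It is not the QuickStatements implementation: the function name is hypothetical, and it assumes that the first three columns are item, property and value, and that trailing columns come in pairs where an "S" prefix marks a source property (so S143 stands for P143).

```python
from typing import Dict, List, Tuple


def parse_statement(line: str) -> Dict[str, object]:
    """Split one tab-separated row into item, property, value and extras."""
    fields = [f.strip() for f in line.split("\t") if f.strip()]
    if len(fields) < 3:
        raise ValueError("expected at least item, property and value")
    item, prop, value = fields[:3]
    rest = fields[3:]
    sources: List[Tuple[str, str]] = []
    others: List[Tuple[str, str]] = []
    # Trailing columns are read as (key, value) pairs.
    for key, val in zip(rest[0::2], rest[1::2]):
        if key.startswith("S"):
            sources.append(("P" + key[1:], val))  # assumed source marker
        else:
            others.append((key, val))  # e.g. qualifiers
    return {"item": item, "property": prop, "value": value,
            "sources": sources, "others": others}


row = "Anna Kinberg Batra\tP1971\t1\tS143\tQ328"
print(parse_statement(row))
```

The bug described in this thread concerns the source pair being dropped for quantity values, so in a correct run the "sources" part above would end up as a reference on the new statement.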

P1959 for mix-n-match

Could Dutch Senate person ID (P1959) be added to mix-n-match? The persons are listed here, 50 per page. Thanks! Sjoerd de Bruin (talk) 09:04, 28 June 2015 (UTC)

On it, but it stops at 500 entries (last page I get). --Magnus Manske (talk) 11:18, 28 June 2015 (UTC)
Yikes. Adding a letter in the search box seems to help. Am I asking too much of you to do that 26± times? :) Sjoerd de Bruin (talk) 12:10, 28 June 2015 (UTC)
I got 1126 entries. How many should there be? --Magnus Manske (talk) 13:18, 28 June 2015 (UTC)
A quick calculation based on en:Historic composition of the Senate of the Netherlands tells me that 2175 seats have been occupied until now. Some people sat for more than one period, so I think we've got most of the people. Sjoerd de Bruin (talk) 13:26, 28 June 2015 (UTC)
They don't auto-match very well, but there you go... --Magnus Manske (talk) 13:33, 28 June 2015 (UTC)

ListeriaBot

Probably it's time to request an approval for this bot.--GZWDer (talk) 16:55, 30 June 2015 (UTC)

WDQ lag

Anything you could do to combat the 6 hour lag in WDQ? Sjoerd de Bruin (talk) 17:25, 30 June 2015 (UTC)

Restarted, slowly catching up. --Magnus Manske (talk) 15:06, 1 July 2015 (UTC)
Is it actually catching up? It appears to be updating, but it seems to be lagging more and more. :/ It was about 16 hours behind when I checked [2] around this time yesterday, now it's closer to 36. - Nikki (talk) 01:43, 3 July 2015 (UTC)
Not sure why updating is slow. Didn't change anything, so probably external. --Magnus Manske (talk) 15:29, 3 July 2015 (UTC)
Did it just regress to June 30 and stop? --- Jura 11:29, 6 July 2015 (UTC)
No, it's just slow. Waiting for new dump to appear. --Magnus Manske (talk) 12:10, 6 July 2015 (UTC)
Yesterday we had data from July 4, now it's much older. Several listeria lists regressed. --- Jura 16:08, 6 July 2015 (UTC)
Updated yesterday to latest dump. Still updating slower than real time. No idea why. --Magnus Manske (talk) 09:07, 7 July 2015 (UTC)

Bug resolving redirects

Reproduce: Type [[ISO 3166-1:ZW]] into the "Wikitext" box of [3]

Intended result: Q954|Zimbabwe

Actual result: Q836|Burma

Reason: The code that resolves redirects only finds the article with the lowest page ID in the main namespace. en:ISO 3166-1:ZW contains a link to en:Burma, whose page ID is lower than that of en:Zimbabwe.

Solution: Rewrite the redirect-resolving mechanism to find the first link on a redirect page (not necessarily in the main namespace). quick_statements.php also needs to be fixed.

--GZWDer (talk) 18:45, 1 July 2015 (UTC)
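
The proposed fix (take the first link of the redirect page rather than the linked page with the lowest page ID) can be sketched as below. This is not the tool's code, which is PHP; the function name and the sample wikitext are hypothetical, and the sketch assumes the English "#REDIRECT" keyword and that the page text is already available as a string.

```python
import re
from typing import Optional

# Matches the redirect directive at the start of a page and captures the
# target title, stopping before any "|label", "#section" or closing bracket.
_REDIRECT = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def redirect_target(wikitext: str) -> Optional[str]:
    """Return the title a redirect page points to, or None if not a redirect."""
    match = _REDIRECT.match(wikitext)
    if match is None:
        return None
    return match.group(1).strip()


# Hypothetical page text: the redirect target comes first, another link later.
page = "#REDIRECT [[Zimbabwe]]\n\nSee also [[Burma]]"
print(redirect_target(page))
```

Because only the redirect directive is inspected, other links on the page cannot win, whatever their page IDs are.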

WD-FIST

Thanks a lot, Magnus Manske, for the WikiData Free Image Search Tool! Many visual artwork items have been completed with it. So useful and so pleasant for discovering and sharing. I can't stop :D Shonagon (talk) 11:04, 5 July 2015 (UTC)

Adding the big ones like IMDB to Mix'n'match?

Hi Magnus, I was looking at IMDb identifier (P345) and I wondered if you ever considered adding it to Mix'n'match? Other big identifiers would be VIAF identifier (P214) or GeoNames ID (P1566). Multichill (talk) 17:28, 5 July 2015 (UTC)

I'd love to see Geonames! I've not used Mix'n'match so far because there's nothing on the list that I'm really interested in or know anything about, but linking to Geonames sounds like fun. :) Their data is already available in a tab-separated format (see allCountries.zip on http://download.geonames.org/export/dump/, the columns are described at the bottom of the page). I imagine we could generate a lot of suggested matches by looking for items with similar coordinates and a similar name, and preferably also a similar type (feature class, feature code). I've been considering mapping http://www.geonames.org/export/codes.html to Wikidata items but I haven't really had a reason to start yet. If it would be useful for that, do let me know. - Nikki (talk) 23:32, 5 July 2015 (UTC)
@Multichill, Nikki: See Wikidata:Bot requests/Archive/2014/10#Done. These databases are too large.--GZWDer (talk) 18:42, 16 July 2015 (UTC)
So how big is acceptable? TGN has 1.2 million entries and several others have over 100,000. If the entire dataset is too big on its own, it wouldn't be difficult to split it into smaller datasets and I'm quite happy to do that myself if someone tells me how small they need to be. - Nikki (talk) 21:05, 16 July 2015 (UTC)
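
The matching idea floated in this thread (suggest a GeoNames entry for an item when the coordinates are close and the names are similar) could be sketched as follows. This is an illustration only, not Mix'n'match code: the function names and both thresholds are assumptions, and the feature class/code comparison mentioned above is left out.

```python
import math
from difflib import SequenceMatcher
from typing import Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def is_candidate(name_a: str, coord_a: Coord, name_b: str, coord_b: Coord,
                 max_km: float = 5.0, min_similarity: float = 0.8) -> bool:
    """Suggest a match when both the distance and the name test pass."""
    close = haversine_km(*coord_a, *coord_b) <= max_km
    ratio = SequenceMatcher(None, name_a.casefold(), name_b.casefold()).ratio()
    return close and ratio >= min_similarity


# Made-up coordinates, roughly 1.5 km apart.
print(is_candidate("Harare", (-17.83, 31.05), "Harare", (-17.82, 31.04)))
```

Candidates found this way would still be suggestions for a human to confirm, as with other auto-matches.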

RKDartists bulk removal

Hi Magnus, the Netherlands Institute for Art History (Q758610) removed about 50,000 records from RKDartists (P650). The quality of these records wasn't very good and the persons were not really in scope of the database. Can you rescan RKDartists (P650) in Mix'n'Match and remove items that are now a dead link? Take for example https://rkd.nl/nl/explore/artists/323299 . Would this also remove the links on Wikidata or do we need a separate bot run for that? Multichill (talk) 17:46, 7 July 2015 (UTC)

Unless the IDs are to be reissued (surely not!) the data is valid and a useful matter of record; and should be kept. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:58, 8 July 2015 (UTC)
What's the purpose of a dead link? Sjoerd de Bruin (talk) 09:08, 8 July 2015 (UTC)
No Andy, the data isn't valid and should be removed. Multichill (talk) 16:51, 8 July 2015 (UTC)
It is a valid statement that "this is the identifier that was used by RKD". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:58, 9 July 2015 (UTC)
The link is incidental. We can always submit a ticket to change the code so that formatter URLs are not applied if there is an end-date. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:58, 9 July 2015 (UTC)

AFAICT, these entries have just been removed; no new ID was assigned to these people. If that were the case, I'd see a point in replacing the old with the new IDs. As it stands, "had this ID" may be more useful than removing the ID. --Magnus Manske (talk) 09:03, 9 July 2015 (UTC)

d:Q4475073

Hello! It (commons:File:Ulyanov 1.jpg) is a portrait of Konstantin Balmont. --Sabunero (talk) 08:32, 9 July 2015 (UTC)

Reasonator ISNI

Hey, ISNI link is fine from Wikidata, but it is wrong from Reasonator. --JulesWinnfield-hu (talk) 11:10, 11 July 2015 (UTC)

Some values are preprocessed [4]. --JulesWinnfield-hu (talk) 22:32, 13 July 2015 (UTC)

Duplicity

Hi :) Duplicity is really cool! I've been using Articles without item but always found it quite slow to use, because it wasn't very easy to compare things and Duplicity makes that so much better. I do have a few requests and questions though:

  • Could you make the buttons always appear in the same place? Since they're displayed after the Wikipedia page title, they move about depending on how long the page title is, which is really bad for muscle memory. Fixed before I could even finish my comment. :)
  • Could you exclude English pages with en:Template:Wiktionary redirect? Those are soft redirects to Wiktionary and aren't notable in Wikidata, so shouldn't be linked to items. Similarly, the German templates de:Vorlage:Obsolete Schreibung and de:Vorlage:Falschschreibung are soft redirects to other items that shouldn't be linked to Wikidata items.
I am already excluding several of these, at update time (once per hour), though through categories rather than templates. Added "Redirects to Wiktionary".
  • An option to automatically skip pages with no suggestions would be useful. Adding new pages is more work than confirming that something matches, so if you're feeling lazy and are in the mood for something easy, only looking at things with suggestions would be nice.
No suggestions can come easily for non-English words when it's not a biography (title=name). I did put in the search box for a reason, so one could try more likely terms. Also, getting suggestions is "expensive", as it is using Wikidata search, so skipping could be slow.
  • On a similar note, perhaps there should be a way to say that the suggestion is not a match, without having to find the right match or create a new item? While trying it out I skipped several where I could tell that the suggestion was wrong but didn't feel like I knew enough to say what would be correct.
Just skip it; action is not mandatory ;-)
  • An easier way to say that something is a disambiguation page would be nice, e.g. a button which would create an item with "Wikimedia disambiguation page" as the description and an instance of (P31) Wikimedia disambiguation page (Q4167410) statement. Disambiguation pages seem to show up fairly often in Articles without item and I already got one in Duplicity too. Also fixed before I could finish my comment. Works nicely!
  • Sometimes the interface goes weird and really narrow, and then when I press skip, it goes back to https://tools.wmflabs.org/wikidata-todo/duplicity.php? Looking at the console, it shows the error duplicity.php?wiki=enwiki:53 Uncaught SyntaxError: Unexpected identifier. The line at that point is `var title = '2005 Greenlandic Men's Football Championship' ;` - it looks like the code needs to escape apostrophes.
Should be fixed now.
  • Are you planning to add support for more languages? How about namespaces? (templates and modules in particular are often copied between projects)

- Nikki (talk) 11:28, 12 July 2015 (UTC)

Will add more languages on demand. Will look into templates/modules/categories, but I feel they'd "spam" the list I have now. --Magnus Manske (talk) 11:41, 12 July 2015 (UTC)
Oh, Magnus, please add ruwiki. --Infovarius (talk) 21:18, 13 July 2015 (UTC)
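
On the apostrophe error reported in this thread: a title containing ' ends a single-quoted JavaScript string early. One common way to avoid this is to let a JSON encoder produce the string literal. The sketch below shows the idea in Python; it is not the actual fix applied to duplicity.php, and the function names are hypothetical.

```python
import json


def js_string_literal(text: str) -> str:
    """Encode text as a double-quoted string literal that JavaScript accepts."""
    return json.dumps(text)


def render_title_assignment(title: str) -> str:
    """Build the 'var title = ...' line without hand-written quoting."""
    return "var title = " + js_string_literal(title) + " ;"


print(render_title_assignment("2005 Greenlandic Men's Football Championship"))
```

The encoder takes care of quotes and backslashes. If the literal is placed inside an HTML script element, sequences such as a closing script tag would need extra handling, which this sketch does not do.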

Quickstate / unk

Is there a way to set "point in time" to "unknown" when adding numbers with QuickState? --- Jura 15:51, 13 July 2015 (UTC)

No. --Magnus Manske (talk) 15:53, 13 July 2015 (UTC)
Ok. Just realized that I can't add actual dates either (#QuickStatements_:_quantity). --- Jura 16:11, 13 July 2015 (UTC)


Multi-beacon

Hi Magnus,

I tried to do a look-up on P646 for "/m/015f0p" (and a series of others). Any idea why it doesn't map to Cincinnati Zoo and Botanical Garden (Q623333) ? --- Jura 21:39, 13 July 2015 (UTC)

/m/021c5h /m/02l8p9 /m/0gd1z /m/01djxm /m/0cd4d

A few more above. --- Jura 06:55, 14 July 2015 (UTC)

I used the other Beacon module instead. --- Jura 19:01, 14 July 2015 (UTC)

Location in Reasonator

For information, there seems to be an issue with locations displayed in Reasonator: a public artwork in Washington, D.C. can be shown as located in Australia https://tools.wmflabs.org/reasonator/?q=Q16828178 , even though the data don't seem to be wrong. The issue in Reasonator is certainly known and probably harder to resolve than I imagine; there are so many different cases. I hope a solution can be found. Best regards --Shonagon (talk) 17:49, 15 July 2015 (UTC)

Arabic terminator

Hi Magnus, I was playing around with Terminator to show Emnamizouni around and we noticed Arabic is missing. Could you please add it? Multichill (talk) 19:53, 18 July 2015 (UTC)

Added language code "ar", should show after next update. --Magnus Manske (talk) 23:18, 19 July 2015 (UTC)

Catscan and autolist

Hi, sorry to pester you with low-content remarks, but I must say that to me the ultimate maintenance tool would be an autolist option in catscan (I mean: the corresponding item matches some query). --Zolo (talk) 17:08, 19 July 2015 (UTC)

As "Get the corresponding Wikidata item for the individual page" on http://tools.wmflabs.org/quick-intersection/index.php ? --- Jura 17:39, 19 July 2015 (UTC)
Seconded. And it would be better if this could also create missing items. However, Catscan2 is not finished ([5]) --GZWDer (talk) 05:30, 20 July 2015 (UTC)
FWIW, I am working on something that would solve these issues and more. Will take some time though. --Magnus Manske (talk) 11:43, 20 July 2015 (UTC)

SourcererBot

I think your bot is not working correctly here. It says "International Standard Name Identifier", but is using GND identifier (P227). Sjoerd de Bruin (talk) 12:52, 21 July 2015 (UTC)

Thanks, fixed now. Running again to add proper values, but not sure how to remove the bad ones. Looks complicated... --Magnus Manske (talk) 13:44, 21 July 2015 (UTC)
I see the bot still adding strange qualifiers (already left a message on the talk page of the bot). Mbch331 (talk) 14:28, 21 July 2015 (UTC)
I think you still got the broken one (edit was before my reply here). Seems to work (here too). --Magnus Manske (talk) 15:18, 21 July 2015 (UTC)

problem with Sourcerer

Hello Magnus,

I can't get Sourcerer to work properly on items. When I click on the link, I get "Checking external links... to go", then nothing more...

Can you fix it, or is this tool dead?

Thanks for your help --Hsarrazin (talk) 18:26, 21 July 2015 (UTC)

Try it now. --Magnus Manske (talk) 19:55, 21 July 2015 (UTC)
sorry, no change :( --Hsarrazin (talk) 17:37, 22 July 2015 (UTC)

Listeria question

Concerning Listeria, which I am beginning to test more seriously, I'd like to know how to filter out items where the value of a claim has been deprecated, so that they no longer appear in the list.

Is there a way to do it? --Hsarrazin (talk) 17:39, 22 July 2015 (UTC)

Not yet. There are not many deprecated statements. I'll refactor listeria sometime. --Magnus Manske (talk) 19:23, 22 July 2015 (UTC)
there will be more and more, as I use it to add precise dates for French writers see Abad, top of list here ;)
Alphabetic sorting is my second problem, but can't be solved from wikidata, afaik :)
thanks for this very useful tool. --Hsarrazin (talk) 20:04, 22 July 2015 (UTC)
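
As a possible workaround outside Listeria (which, per the reply above, does not support this yet), deprecated statements can be filtered from an item's JSON by their rank. The sketch below assumes the "claims" mapping of a Wikidata entity, where each statement carries a "rank" of "preferred", "normal" or "deprecated"; the function name and the sample data are hypothetical.

```python
from typing import Any, Dict, List

Claims = Dict[str, List[Dict[str, Any]]]


def drop_deprecated(claims: Claims) -> Claims:
    """Return a copy of the claims without statements ranked 'deprecated'."""
    kept: Claims = {}
    for prop, statements in claims.items():
        remaining = [s for s in statements if s.get("rank") != "deprecated"]
        if remaining:  # omit properties whose statements were all deprecated
            kept[prop] = remaining
    return kept


# Trimmed, made-up example: an imprecise date deprecated in favour of a precise one.
claims = {
    "P569": [
        {"id": "imprecise", "rank": "deprecated"},
        {"id": "precise", "rank": "normal"},
    ]
}
print(drop_deprecated(claims))
```

Items whose only value for a property is deprecated then lose that property entirely, so they would drop out of a list that requires it.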

Duplicity tool

According to the intro of the tool, articles in bad categories are skipped. However, I regularly come across articles that have been nominated for deletion on nlwiki. For example, nl:Léon van den Haute was nominated for deletion on July 8th 2015 and it was suggested to me just now. Why was the article suggested? Mbch331 (talk) 09:38, 25 July 2015 (UTC)

Because I currently exclude "bad categories" from de and en? --Magnus Manske (talk) 12:47, 25 July 2015 (UTC)
That explains a lot. I didn't know that part, because the tool doesn't say you only exclude bad categories from de and en. The way it's described in the tool, it looks like you exclude bad categories from all wikis. Mbch331 (talk) 12:52, 25 July 2015 (UTC)

Catalogus Professorum Halensis

Hello Magnus! Catalogus Professorum Halensis (P2005) now exists. Can you transfer the matches to Wikidata? Jonathan Groß (talk) 06:43, 27 July 2015 (UTC)

Running! --Magnus Manske (talk) 10:22, 27 July 2015 (UTC)
I'll also quickly create "sub items" for everyone who doesn't have an item yet. --Magnus Manske (talk) 10:25, 27 July 2015 (UTC)
Does that mean items without a Wikipedia article? Can you then also create a red-link list for dewiki right away? Jonathan Groß (talk) 10:27, 27 July 2015 (UTC)
No red-link list (I still have to write a tool for that!), but I can offer the complete list, with red links in italics. --Magnus Manske (talk) 11:32, 27 July 2015 (UTC)

Members of the Heidelberg Academy of Sciences and Humanities

Hello Magnus, I have something else cooking: a list of the members of the Heidelberg Academy for Sciences and Humanities (Q833738), including details on the type and start date of membership, class affiliation, and dates of death. Here is the original database, but I have already extracted everything, see here. Would that be something for Mix'n'Match too? Jonathan Groß (talk) 10:43, 27 July 2015 (UTC)

Sure! Waiting for your approval on Google Docs. Man, are we German! ;-) --Magnus Manske (talk) 11:54, 27 July 2015 (UTC)
I thought the link would be enough. Approval has now been granted :) Jonathan Groß (talk) 14:15, 27 July 2015 (UTC)
It's imported. I didn't use your calendar dates; they don't seem to have anything to do with the people... --Magnus Manske (talk) 14:38, 27 July 2015 (UTC)
The calendar dates were dates of death. Where did you import the data? Directly onto the Wikidata items? In that case further qualifiers are still missing. Jonathan Groß (talk) 23:13, 27 July 2015 (UTC)
Imported into Mix'n'match. The tool is good for finding matches, but not so well suited to transporting metadata. --Magnus Manske (talk) 14:59, 28 July 2015 (UTC)