User talk:John Vandenberg

From Wikidata

Welcome to Wikidata, John Vandenberg!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

If you have any questions, please ask me on my talk page. If you want to try out editing, you can use the sandbox. Once again, welcome, and I hope you quickly feel comfortable here and become an active editor for Wikidata.

Best regards!

--Liuxinyu970226 (talk) 13:58, 7 September 2013 (UTC)

Merge

Hello John Vandenberg,
When you merge items, please use the Merge.js gadget. It helps with merging and nominating, gives the option to always keep the lower number (which is older, so preferable), and makes it a lot easier for the admins to process the requests.
If you don't have an account, you may have to create one. With regards,--by Revi at 09:04, 30 November 2013 (UTC)

OK, will do. Thanks for the tip. John Vandenberg (talk) 09:07, 30 November 2013 (UTC)

NLM Unique ID (P1055)

Hi,

The above property is now available and can be used on items. I noticed you participated in its discussion. --Ricordisamoa 20:21, 16 December 2013 (UTC)

chromosome (P1057)

The property chromosome (P1057) that you supported is available now. --Tobias1984 (talk) 23:18, 24 December 2013 (UTC)

Files

Hi, just to let you know that files are not notable as items (see also WD:Notability), so please do not create new items for that purpose. Cheers, — TintoMeches, 22:38, 30 December 2013 (UTC)

Thank you for informing me. I was going to ask about that when I woke up, so you have saved me the trouble. John Vandenberg (talk) 00:12, 31 December 2013 (UTC)

Subpages

Please join the discussion on whether we can include some subtemplates at Wikidata:Requests for comment/Interwiki links for subpages.--GZWDer (talk) 05:30, 31 December 2013 (UTC)

MedalBot

Any update on Wikidata:Requests for permissions/Bot/MedalBot?--GZWDer (talk) 14:34, 25 February 2014 (UTC)

Wikimedia list article (Q13406463)

Hi! "WMF list article" is a shortcut at Wikidata. לערי ריינהארט (talk) 07:01, 4 March 2014 (UTC)

WMF is a legal Foundation. I don't think the legal Foundation has any special relationship with these list articles. Also, I have yet to find any evidence that this entity is suitable for items other than Wikipedia list pages. Have you seen list items for Wikibooks or Wikivoyage? If so, are the Wikibooks/Wikivoyage items really of the same class as the Wikipedia list pages? John Vandenberg (talk) 10:22, 4 March 2014 (UTC)

Literary critics

Hi John, looks like your bot has identified literary critics as periodical literature (Q1092563). --Kolja21 (talk) 15:19, 5 March 2014 (UTC)

Hi Kolja21, yes I saw this and a few other problems, and stopped the bot until I can undo my bot's edits and prevent this from happening again. The cause is that en:Category:Newspapers published in the United Kingdom contains en:Category:British journalists, which contains en:Category:British critics, which contains en:Category:British literary critics. That category tree is insane, but I shouldn't have run my bot on that category without manually approving each edit. I will automatically revert the bot on the problematic data items. John Vandenberg (talk) 21:40, 6 March 2014 (UTC)
Good to know. Categories like that are really nasty. I've fixed "British journalists". --Kolja21 (talk) 21:45, 6 March 2014 (UTC)

What do the suffixes /R, /TS, /TN etc. mean

See en:Chinese Library Classification.--GZWDer (talk) 12:42, 2 April 2014 (UTC)

Historisk Tidsskrift

I think there might be an issue with Historisk Tidsskrift (Q15793533). It's a disambiguation page according to its linked English Wikipedia page en:Historisk Tidsskrift, yet it has an ERA identifier. — Finn Årup Nielsen (fnielsen) (talk) 15:20, 14 April 2014 (UTC)

Great, nice find. There will be a few of these, as I aggressively linked to a Wikipedia page when creating new items if at all possible (if it matched on ISSN, or on title if the title contained a variation of the word 'journal' in either English or local language), rather than allowing the bot to create duplicates. John Vandenberg (talk) 15:38, 14 April 2014 (UTC)

statements at Wikimedia disambiguation pages

Hi! I found [1]. Can you please verify if the bot has added similar things? Regards gangLeri לערי ריינהארט (talk) 07:59, 27 April 2014 (UTC)

Hi, yes the bot has made a few mistakes like this. Thank you for finding and alerting me. The bot was in a one-off, aggressively heuristic mode, trying to identify all items which are periodicals or creative works in periodicals, and letting constraint exception reporting highlight the problems. See that item, where nl:Combat (tijdschrift) (a journal) was inappropriately linked with a disambiguation page by another bot.[2] ;-( I have done detailed verification of all items with an ERA Journal ID (P1058) and most items with ISSN (P236), but have not yet carefully reviewed all items with no label (P357). As we now have a good journal database, all ongoing periodical work presumes that any two conflicting pieces of data must be reviewed by the bot operator rather than proceeding with a constraint violation. John Vandenberg (talk) 08:21, 27 April 2014 (UTC)

Wikidata:Requests for permissions/Bot/MedalBot

was approved.--GZWDer (talk) 06:48, 4 May 2014 (UTC)

Thanks. I saw the email. I will begin work again when I have returned to Indonesia. John Vandenberg (talk) 02:40, 5 May 2014 (UTC)

New tool to find duplicate items

You may be interested in this :-) --Magnus Manske (talk) 13:27, 13 May 2014 (UTC)

@Magnus Manske: "Item Christmas Island (Q686310) has potential duplicates: Christmas Island (Q16351925)" It should have suggested that Christmas Island (Q31063) is the item it is likely a duplicate of.
These are dups created because of your tools. Based on User talk:GZWDer#Creating duplicates unnecessarily, User talk:Daniel Mietchen#So many Widar enabled errors? (which could be as high as a 70% error rate), and others, I wouldn't be surprised if we are talking about hundreds of thousands of duplicates, which are all going to grow into bot-populated items until someone puts in a man-month or more to clean up the mess. John Vandenberg (talk) 13:50, 13 May 2014 (UTC)
btw, my tool is far better at it, but it requires being operated by someone who is focusing on a specific topical area. John Vandenberg (talk) 13:59, 13 May 2014 (UTC)
"These are dups created because of your tools." That is, factually, a lie. Undoubtedly, some of them were created using my tools. How many, I do not know, and neither do you. FWIW, I looked at some of the dupes I merged, and didn't find any created through my OAuth tools; however, I spotted several created by User:GZWDer (flood). Maybe he'll be more receptive to your blame attempts. --Magnus Manske (talk) 15:23, 13 May 2014 (UTC)
@Magnus Manske: For your message, 90%+ of them are duplicates.--GZWDer (talk) 05:10, 20 May 2014 (UTC)

Re: Odd description

Hello, John Vandenberg. You have new messages at Ricordisamoa's talk page.
You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

--Ricordisamoa 10:56, 24 May 2014 (UTC)

Re: Cassida vibex vs Cassida viridis bot problem

Hello, John Vandenberg. You have new messages at Ricordisamoa's talk page.
You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

--Ricordisamoa 08:38, 26 May 2014 (UTC)

Dharma Drum Buddhist College

Excuse me for butting in on your work, but can you explain why you added P1188 (Dharma Drum Buddhist College place ID) to India (Q668)? That seems like a strange property for a country to have. SpinningSpark 10:19, 10 June 2014 (UTC)

I think I have correctly mapped it to the record in their database: http://authority.ddbc.edu.tw/place/search.php?code=PL000000048207 . My apologies if it is wrong. John Vandenberg (talk) 03:49, 13 June 2014 (UTC)

About GZWDer (flood) bot

Hi! You may be interested in this talk. Sincerely. Miniwark (talk) 12:30, 30 March 2015 (UTC)

Spotted a mistake

Hi, I hope that this message will help you improve your bot. The newspaper Moskovskij Komsomolets (Q1062623) was mistakenly labelled as an instance of scientific journal (Q5633421) in February last year. There also seem to be some other items labelled as scientific journals which don't actually publish scientific research. They should instead be instances of academic journal (Q737498). --BurritoBazooka (talk) 20:34, 21 September 2015 (UTC)

Not sure about that last bit, actually. I just happened to find one really easily and assumed there are more. It might have just been by chance. --BurritoBazooka (talk) 20:40, 21 September 2015 (UTC)
Okay, I looked through the descriptions and labels of around 390 items labelled "scientific journal" (the list was returned by AutoList2) and out of those found 7 (including that newspaper) which do not publish research, or research exclusively, or are not related to science, or science exclusively. I only looked at the labels and tried to guess before looking in more detail, so there might be more of them. All of them were labelled as scientific journals by your bot.
They were:
--BurritoBazooka (talk) 20:57, 21 September 2015 (UTC)

Here is a fairly strange error from back in 2014. — Finn Årup Nielsen (fnielsen) (talk) 19:39, 14 November 2016 (UTC)

Scholarly journal coverage in Wikidata

John, I was very impressed to see the level of coverage of scholarly journals/periodicals we have in Wikidata and I was curious to hear more from you – as someone who's been driving this effort – about:

  • the data sources you use;
  • any known data quality issues;
  • data modeling needs (like missing or problematic identifiers/properties);

and more generally how we can help support this effort as part of m:WikiCite and WD:WikiProject Source. I was at a major scholarly publisher conference in London this week and the figures on journal coverage got quite some attention. --DarTar (talk) 15:58, 5 November 2016 (UTC)

See also Wikidata:WikiProject Source MetaData/ToDo --DarTar (talk) 00:12, 6 November 2016 (UTC)
@DarTar:, it was mostly loaded via my bot User:JVbot by merging the existing Wikipedia articles with all of the journals in the ERA 2010 journal list (Q15735759) and ERA 2012 journal list (Q15794938); see bot requests there for more information, and also more info at Wikidata_talk:WikiProject_Periodicals, and there is a Wikipedia page w:Excellence in Research for Australia about the dataset. There are around 10 minor errors in each dataset, which I recorded on Property_talk:P1058 and of course fixed, and some other problems where ERA 'choices' were reasonable but hard to marry up with Wikipedia's choices for the same topics. I deferred to Wikipedia choices in those cases to minimise disruption.
I could talk and code forever about this topic and dataset, and have been doing it for 10 years, but I have other duties at the moment so I don't have time to participate in your alternative WikiProject. There are lots of data quality issues, but the largest is the lack of data governance on Wikidata, coupled with quick and nasty bots and humans whose objective is having millions of edits rather than high-quality edits. Until that is resolved, I generally do not find it useful to participate in Wikidata broadly, because the size of the total problem grows quicker than it can be fixed. I find it is only useful to load data as part of my own projects, do correlation and data cleansing, on Wikidata and in data extracts, and then leave the data to deteriorate over time rather than get into fights with people who are "building Wikidata" at all costs. These are the same problems that plagued Wikipedia for many years, and will probably sort themselves out in due course, but my attempts have failed and I have all but given up hope for the project (and I recommend my clients use their own Wikibase with the data extract loaded). Please let me know if you have specific questions I can help with. John Vandenberg (talk) 07:02, 6 November 2016 (UTC)
Thanks for your input. I am wondering whether you can point us to specific deteriorations and problems with data governance, so we can learn from them? — Finn Årup Nielsen (fnielsen) (talk) 16:43, 6 November 2016 (UTC)
John, the URLs given for the 2010 and 2012 ERA lists are dead links... I can't find the current links. --Randykitty (talk) 07:56, 31 December 2016 (UTC)

Unused properties

This is a kind reminder that the following properties were created more than six months ago: Norway Database for Statistics on Higher education publisher ID (P1271), Norway Import Service and Registration Authority publisher code (P1275), ISMN (P1208), Jufo ID (P1277), ecoregion (WWF) (P1425). As of today, these properties are used on fewer than five items. As the proposer of these properties, you probably want to change this unfortunate situation by adding a few statements to items. --Pasleim (talk) 19:17, 17 January 2017 (UTC)