Wikidata:Project chat/Archive/2017/02


This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.


Biology: Categories for species

You are aware of the problem with species on Wikimedia Commons and Wikidata: Commons has both species categories and species galleries, but the Wikipedias have no species categories. So currently there is only one Wikidata item per species. Conclusion: most Commons species categories have no Wikidata item!
Someone told me that we need to create a category item for every species.
Is that true? It seems like a lot of work.
I was waiting/hoping for another technical solution.
Best regards Liné1 (talk) 22:00, 31 January 2017 (UTC)

Hello Liné1, last August a Request for Comments was started, which was not conclusive. Higher on this page, "Links from items to Commons categories and galleries" shows that adding Commons categories to Wikidata items about articles is a growing phenomenon. Lymantria (talk) 06:54, 1 February 2017 (UTC)

Be inspired and let us collaborate

The WMF is seeking people who are inspired to reach out to outside knowledge networks. I have asked and it would allow us to reach out to other "hobby networks" and seek how our data and their data can mutually improve.

There is a lot of potential here; there are many projects all with their own niche. When we collaborate, we can share identifiers and compare data. We can both include data as we see fit but more importantly we can curate the data when the data does not match. For those who say "their data is not good enough" I put it to you that neither is ours. The notion is that we both may benefit.

Let us be inspired to do good and improve our data both in quality and in quantity. Thanks, GerardM (talk) 06:55, 1 February 2017 (UTC)

"Rogue" Twitter accounts of national services

Recently, a number of US national agencies have spawned off so-called "rogue" or "alternative" Twitter accounts. Could Wikidata be a good place to collect these? Obviously with a qualifier. Thoughts? --Denny (talk) 01:41, 30 January 2017 (UTC)

what accounts? MechQuester (talk) 01:50, 30 January 2017 (UTC)
Twitter accounts. --Tagishsimon (talk) 02:00, 30 January 2017 (UTC)
Sorry, clarified. --Denny (talk) 05:22, 30 January 2017 (UTC)
With all due respect to US politics, there are 2 reasons not to add fake/rogue/alternative accounts to any official agency's item. 1. We don't know who created the fake. It may be a single employee, it might be the whole director's office, but it could also be an outsider. We will probably never know. 2. In a few months, when the dust settles and the new administration finds a status quo with the current staff, the accounts will slowly disappear, so there is no point adding them here. DGtal (talk) 06:40, 30 January 2017 (UTC)
+1 Don't mix knowledge and information. Snipre (talk) 07:57, 30 January 2017 (UTC)
A good idea is to go to their official website and then look around to find their "official" sponsored account. MechQuester (talk) 14:26, 30 January 2017 (UTC)

Unfortunate outcome, but understandable. --Denny (talk) 17:22, 1 February 2017 (UTC)

Wikimania 2017

Hello all,

The international Wikimedia conference will take place in Montreal (Q340) on August 11-13. As the call for submissions is about to start, let's talk about what you would like to see happening around Wikidata.

Do you plan to submit a talk, workshop, or meetup? Which topics would you like to discuss with the development team? Let's talk about our ideas on Wikidata:Wikimania 2017!

Thanks, Lea Lacroix (WMDE) (talk) 09:41, 1 February 2017 (UTC)

Trouble understanding how to merge articles?

I would like to request a merge. When I go to help:merge, it gives me 2 choices: special:mergeitems or the gadget.

When I go to special:mergeitems, I get a “Permission denied” error. When I click the link to my preferences, I do not see a “gadgets” section.

Since I apparently do not have the privilege to merge items, where can I request an administrator to perform the merge? Bwrs (talk) 17:30, 1 February 2017 (UTC)

Could you please provide some steps to reproduce your problem? I guess you are on the mobile site.
There's no particular place to request a merge since everybody can do it. Matěj Suchánek (talk) 17:56, 1 February 2017 (UTC)

About data donations: CC0 (Public Domain)

Hi everyone,

I am working on a future project in which we are going to gather data and use it, with the aim of later migrating its content to Wikimedia projects. In the case of Wikipedia, I understand that any content migrated in its entirety must be licensed as CC BY-SA 3.0 or something more permissive. But I want to be safe when I write the contribution guidelines and set the license for the project. I was thinking of applying CC BY-SA to the whole project, but if we want to migrate the data that we are going to gather, do we need to set the CC0 (Public Domain) license for the data?

Maybe this is a silly question, but I want to be sure about the process before making any migrations in the future.

Thanks in advance!

Regards, Ivanhercaz (Talk) 11:56, 26 January 2017 (UTC)

Comment: Checking Wikidata's website footer, I imagine that I could set up something similar, I mean: "data licensed under CC0 (Public Domain), the rest of the content licensed under CC BY-SA". Correct? Regards, Ivanhercaz (Talk) 12:04, 26 January 2017 (UTC)
That sounds right. --Jarekt (talk) 14:04, 26 January 2017 (UTC)
Thank you Jarekt! Regards, Ivanhercaz (Talk) 15:41, 26 January 2017 (UTC)
Data falls under different copyright rules than intellectual works. There is no copyright on data itself, but there can be copyright on a collection. Take sports statistics, for example the WTA site for female tennis players. We can use the data, but we cannot copy the whole set. In everyday practice this means we can manually type the info into Wikidata, but we cannot use a script to scrape their website without their permission. In case of doubt you can always contact legal at for details. Edoderoo (talk) 19:57, 26 January 2017 (UTC)
@Edoderoo: You say that as though laws are the same everywhere and they are not. WMF servers are in the United States and Americans don't recognize any rights to databases. We license the data here in the most liberal way only because of other legal systems that have more restrictive laws. —Justin (koavf)TCM 03:47, 30 January 2017 (UTC)
Weird, as two years ago there was contact with WMF legal about data on tennis players on the WTA/ITF websites, and their message then was not to scrape every database we could find, as far as I remember. Did legal tell you that we can use any database we see, or is it your own conclusion about how legal stuff works? Edoderoo (talk) 07:46, 30 January 2017 (UTC)
@Edoderoo: My own understanding. I can imagine several reasons why legal would suggest against trying to scrape every database you can find but the results of tennis matches are not copyrightable. —Justin (koavf)TCM 09:48, 2 February 2017 (UTC)

Links from items to Commons categories and galleries

Here's an update of results of queries into linking patterns from Wikidata to Commons categories and galleries.

A previous version was posted here at VP and also at Commons VP in December 2015.

There are also some further historical versions, going back to September 2014, for older comparisons.

Commons categories
Commons galleries
total linked
Wikidata articles
(~ 22,165,947)
~ 1,268,063 100,042 ~ 1,299,996
~ 1,209,119
~ 1,235,579
Wikidata categories
396,087 558 396,094
total linked 1,426,002 100,086 ~ 1,696,090 items / 1,523,993 pages
props: ~ 1,590,788 items /
1,419,074 pages

Compared to 2015, perhaps the most notable feature is that new sitelinks to Commons continue to be dominated by sitelinks between Commons categories and article-like items here: up 183,682 compared to an increase of 47,384 in sitelinks between Commons categories and category-like items. (The total number of Commons categories has increased by 912,375 over the same period).

This is against how some Wikidatans feel sitelinks to Commons ought to work. However, it does seem to be the clear preference of most users when adding sitelinks, so perhaps the time has come to accept it as mostly harmless. Jheald (talk) 23:29, 26 January 2017 (UTC)

Thanks Jheald. Curated galleries at Commons are a PITA compared with categorisation; they will often never need to exist for many minor players, whereas a category is easy and can work in multiple ways. I would prefer to see Commons look at whether curated galleries should be subsidiary (or discouraged) in the system, which would make it even easier to win an argument among WDatans. From the perspective of a WSian, linking to a category at Commons is more beneficial than linking to a gallery; well, it works that way for authors, as many do not write many books. The problem is that here we primarily map categories to categories as a preference, and there is no easy way to have a CommonsCat link to articles, yet many of the sister wikis may not have categories to match. We need some better way to work through the interweaving of interwikis based upon how the sister prioritises their pages, not how we think that they should.  -- billinghurst sDrewth
@Jheald: are you sure users did this and not just one user with a bot? Multichill (talk) 08:48, 27 January 2017 (UTC)
@Multichill: It's possible. I have no idea who has been adding the sitelinks -- I'm not even sure I'd know how to query the history to investigate at scale. If somebody were to have been adding sitelinks with a bot, I'm not clear where they'd be getting their information from. If someone were converting P373s into article --> commonscat sitelinks, why stop at only 180,000 of them? But maybe there are bots out there that add commonscat sitelinks. Jheald (talk) 09:19, 27 January 2017 (UTC)
I often see users creating items by linking a Commons category to a Wikipedia article. The number doesn't surprise me: The most obvious way to add interwiki links is to click the "Add links" link in the sidebar. - Nikki (talk) 12:58, 27 January 2017 (UTC)
Hello guys, maybe I can try to explain the Commons contributor mindset to you:
  • Commons has many more categories than galleries, so some contributors work only on categories.
  • Until very recently, you could access only ONE Wikidata item from a Commons page. So if your Commons category was linked to a Wikidata category item, you could not retrieve any properties of the topic.
  • Now we can access any Wikidata item from Commons, but it is still quite complex to access Wikidata, test whether the item is a category item, and use category's main topic (P301) to access the other item. Not many templates do it.
  • For biology, Commons has species categories (categories are mandatory) and species galleries (galleries exist in 20% of cases). But the Wikipedias have no species categories. So currently there is only one Wikidata item per species. Conclusion: most Commons species categories have no Wikidata category item!
    • Case 1: When a Commons species category has no Commons gallery => we link 'Commons species category' to 'Wikidata taxon item'
    • Case 2: When a Commons species category has a Commons gallery => 'Commons species category' has no Wikidata item
Best regards Liné1 (talk) 07:21, 2 February 2017 (UTC)
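
The lookup described in the third bullet (test whether an item is a category item, then follow category's main topic (P301)) can be sketched in a few lines. This is only an illustration in Python operating on plain dictionaries shaped like Wikibase entity JSON; the function names and the item IDs in the usage example are hypothetical, and a real Commons template would do this in a Lua module instead.

```python
WIKIMEDIA_CATEGORY = "Q4167836"  # the Wikidata item "Wikimedia category"


def _item_ids(entity, prop):
    """Return the item IDs used as main values of `prop` in an entity dict."""
    ids = []
    for claim in entity.get("claims", {}).get(prop, []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def topic_item_id(entity):
    """Return the ID of the item that carries the topic's properties.

    For a category item (instance of Wikimedia category) this is the first
    value of category's main topic (P301), or None if P301 is missing.
    For any other item it is the item's own ID.
    """
    if WIKIMEDIA_CATEGORY in _item_ids(entity, "P31"):
        topics = _item_ids(entity, "P301")
        return topics[0] if topics else None
    return entity["id"]
```

With a hypothetical category item `Q900001` whose P301 points at a taxon item `Q900002`, `topic_item_id` returns `"Q900002"`; called on the taxon item itself, it returns that item's own ID.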

Sanitize labels before using them in wikitext

Hi all. I saw a number of Wikidata modules at Wikipedias, and many use labels from Wikidata items directly in generated wikitext. That is not good because labels may contain any characters, so a Wikidata label may be constructed to e.g. generate arbitrary wikilinks or external links. I suggest that all writers of Wikidata modules sanitize fetched labels with mw.text.nowiki before using them in wikitext. Best regards, Dipsacus fullonum (talk) 11:45, 2 February 2017 (UTC)

NB. I forgot to mention, but you should of course also sanitize values of type string and monolingual text. Best regards, Dipsacus fullonum (talk) 12:00, 2 February 2017 (UTC)
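
To illustrate the idea behind the advice above, here is a minimal Python sketch of escaping a fetched label before it is placed into wikitext. It is not the Scribunto function mw.text.nowiki itself: the function name and the set of escaped characters are assumptions chosen for illustration, and a real module should call mw.text.nowiki in Lua.

```python
# Characters that can start or delimit wiki markup. This set is an
# illustrative assumption, not the exact behaviour of mw.text.nowiki.
_WIKI_SPECIAL = '"&\'<=>[]{|}'


def escape_for_wikitext(label: str) -> str:
    """Replace markup-significant characters with numeric HTML entities."""
    return "".join(
        f"&#{ord(ch)};" if ch in _WIKI_SPECIAL else ch
        for ch in label
    )
```

A label such as `[[Evil|x]]` then comes out as `&#91;&#91;Evil&#124;x&#93;&#93;`, which renders as the literal text instead of a wikilink, while an ordinary label such as `Douglas Adams` passes through unchanged.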

Opened BY?

Hello! I am working on a church at the moment. The church was opened at a certain date (Property:P1619 by a "certain person"). How can I make this person, Q5803235, to be the person that officially opened this location?

Maybe officially opened by (P542). - Kareyac (talk) 05:46, 3 February 2017 (UTC)
I think this is perfect, thank you! --Fringilla (talk) 05:59, 3 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)


Please make a union of Q16205862 (was a redirect in enwiki) and Q427070. I searched the help pages for a procedure to do or request this, but I couldn't find one. Please link it more clearly somewhere, thank you. --Sailko (talk) 10:08, 3 February 2017 (UTC)

@Sailko: Help:Merge is what you’re looking for. I’m not going to merge these items, feel free to practice it by yourself. You can’t break anything which cannot be reverted. Regards, MisterSynergy (talk) 10:10, 3 February 2017 (UTC)
Thanks! I made a redirect from Help:Union to Help:Merge, in case somebody else looks for this like I did. --Sailko (talk) 10:17, 3 February 2017 (UTC)
The merge looks good. I tidied a bit as you can see in the merged item’s history, but that’s basically everything you need to know. Regards, MisterSynergy (talk) 10:22, 3 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)

Edit summary statistics?

Hi all, I'm interested in getting some stats based on the edit summaries added by tools. For example, the brilliant QuickStatements always adds the #quickstatements hashtag. I'm wondering if it's possible to find out how many edits were made by me using this tool and to get the top 10 tool users. Any ideas? Yarl (talk) 22:55, 3 February 2017 (UTC)

@Yarl: Quarry? See for example Wikidata Most common deletion summaries and edits of a user with a specific edit summary and Who did the most reverts in fawiki. --Atlasowa (talk) 21:40, 4 February 2017 (UTC)
@Atlasowa: Thanks! Yarl (talk) 16:15, 5 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)
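
As a rough illustration of the counting asked about in this section, the following Python sketch tallies edits per user whose summary contains a given hashtag. It assumes the (user, summary) pairs have already been exported, for example from a Quarry query; the function name and the sample data in the usage note are hypothetical.

```python
from collections import Counter


def top_tool_users(edits, hashtag, n=10):
    """Return the n users with the most edits whose summary contains hashtag.

    `edits` is an iterable of (username, edit_summary) pairs; the match is
    a case-insensitive substring test.
    """
    needle = hashtag.lower()
    counts = Counter(
        user for user, summary in edits if needle in summary.lower()
    )
    return counts.most_common(n)
```

For example, given three exported edits of which two by one user and one by another mention `#quickstatements`, the function returns those two users with their counts, most active first.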


fr:Concerto pour piano de Poulenc and es:Concierto para piano (Poulenc) mean the same piece. --Gerda Arendt (talk) 08:16, 6 February 2017 (UTC)

Thanks for the report, merged. - Kareyac (talk) 08:43, 6 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)

Hebrew Wikipedia issue

Can a Hebrew-speaking editor please check Ze'ev Jabotinsky (Q319896) and Hanna Jabotinsky (Q6822632), which seem to be about the same person, or two related people? Each links to a different he.Wikipedia article. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:11, 7 February 2017 (UTC)

Hanna Jabotinsky (Q6822632) was the wife of Ze'ev Jabotinsky (Q319896). DGtal (talk) 11:20, 7 February 2017 (UTC)
I agree. I've linked them with spouse (P26) and edited the English label of Hanna Jabotinsky (Q6822632) as suggested. Deryck Chan (talk) 11:48, 7 February 2017 (UTC)


Thank you, both. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:13, 7 February 2017 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)


Can someone give me a talk welcome? I'd like to have the helpful links included in the welcome, but I would feel pretty goofy welcoming myself. ;-) Daphne Lantier (talk) 22:46, 7 February 2017 (UTC)

My pleasure. You're very welcome! Jheald (talk) 00:00, 8 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:02, 8 February 2017 (UTC)

Adding a label in any arbitrary language

According to , any language should be admissible for use on Wikidata. However, when I tried to add a Manchu label (ISO code mnc) to an arbitrary item via the "list of headers" button, it said it could not recognize the language parameter. How can I overcome this error? C933103 (talk) 01:42, 3 February 2017 (UTC)

@C933103: As per this answer, you have to complete the most important messages in order to add that support. --Liuxinyu970226 (talk) 07:59, 4 February 2017 (UTC)

regex formatter for external-ID

  • General question: There are regex format expressions for external IDs both on a property page (via a format as a regular expression (P1793) claim) and on the property talk page (via {{Constraint:Format}}). Which one is relevant for the result on constraint violation pages?
  • Unrelated specific problem: What’s wrong with the regex format expressions of ID (P3532) and ID (P3539)? Last night's update of the constraint violation pages suggests that all values violate the format constraint, but I am unable to identify the problem.

Thanks, MisterSynergy (talk) 11:22, 3 February 2017 (UTC)

Okay thanks, let’s see what happens tomorrow. I thought the wrapping <nowiki>-tags would be enough to delimit the expression. —MisterSynergy (talk) 12:37, 3 February 2017 (UTC)
<nowiki> does not delimit format expressions. Its only function is to escape wiki markup. --Pasleim (talk) 10:49, 4 February 2017 (UTC)
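
A format constraint of this kind boils down to a whole-string regular expression match. The Python sketch below shows that check; it is not the code that generates the constraint violation pages, and the function name and example pattern are hypothetical. It also shows one way every value can end up reported: if stray wrapper text is taken as part of the pattern, nothing matches any more.

```python
import re


def format_violations(values, pattern):
    """Return the values that do not match `pattern` in their entirety."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value) is None]
```

With the hypothetical pattern `[1-9]\d*`, the values `123`, `12a` and `045` yield the violations `12a` and `045`. If the pattern is instead read literally as `<nowiki>[1-9]\d*</nowiki>`, even `123` is reported as a violation.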

Tools to add statements?

Hi! I'm using PetScan (Q23665536) to search for specific items on Wikipedia that have no statements. When I have a good selection of items, how can I add the instance of statement to those in one batch? //Mippzon (talk) 18:13, 3 February 2017 (UTC)

"Other sources" → "Use wiki" → "Wikidata". Log in to WiDaR and you will see AutoList-like menu. Matěj Suchánek (talk) 19:11, 3 February 2017 (UTC)

no label (Q18542787)

I am 99.9999% sure that there is no reason to have an item such as no label (Q18542787), but sometimes seemingly strange things are done on purpose. Since ~10 different editors touched that item, including sysops, I’d ask for a reason to keep it, otherwise I’d request its deletion (tomorrow). --MisterSynergy (talk) 23:22, 3 February 2017 (UTC)

manager of VfL Wolfsburg (Q20089909) is another similar problem. —MisterSynergy (talk) 23:31, 3 February 2017 (UTC)
No. The latter is perfectly OK. Thierry Caro (talk) 04:12, 4 February 2017 (UTC)
I do not see any other item of that type and think that there are better ways to model coaching positions. Not sure which one to prefer:
Both do not require items such as manager of VfL Wolfsburg (Q20089909). —MisterSynergy (talk) 06:43, 4 February 2017 (UTC)
My understanding of position held (P39) is that it is specifically for public office, i.e. political-type positions, so perhaps:
--Oravrattas (talk) 09:16, 4 February 2017 (UTC)
I’m okay with that approach as well. —MisterSynergy (talk) 09:28, 4 February 2017 (UTC)
The notability of no label (Q18542787) was already discussed in Wikidata:Project chat/Archive/2014/12#Intersection properties. There was pretty clear consensus against this item and the only user defending the item was a sock puppet. --Pasleim (talk) 10:52, 4 February 2017 (UTC)
Thanks, then it is a clear situation. I’ll fix the affected items and propose no label (Q18542787) for deletion. The other problem can be solved independently. —MisterSynergy (talk) 11:00, 4 February 2017 (UTC)


Hello. Is there an item to show that an item is a sports website or a news portal website? Xaris333 (talk) 12:43, 5 February 2017 (UTC)

Easy way to clean up wrong descriptions

Look at this item. It is not a Wikimedia category but has such descriptions in many languages. There are many items like this. Mostly because of bots or autoEdit. See more examples here. Is there any tool like autoEdit that can delete such descriptions with one click as autoEdit adds :)?--ԱշոտՏՆՂ (talk) 13:13, 5 February 2017 (UTC)

Easy with MediaWiki:Gadget-dataDrainer.js, I'm just not sure whether it still only works for admins or whether that has been fixed. Stryn (talk) 13:28, 5 February 2017 (UTC)
No, still admins-only. And that 'autopatrolled' group. --Edgars2007 (talk) 13:30, 5 February 2017 (UTC)
Normally in this case you must check the history and you can find a wrong merge. I have restored the previous situation using "Undo" --ValterVB (talk) 13:42, 5 February 2017 (UTC)
Why a misleading edit summary and unnecessary ping? Stryn (talk) 13:56, 5 February 2017 (UTC)
It's an automatic edit summary, so I don't know why the system created that summary. I reverted edits from 15:37, 19 Jan 2017. As for the ping, I don't understand. --ValterVB (talk) 14:16, 5 February 2017 (UTC)
See Incorrect revert summaries/pings --Succu (talk) 14:30, 5 February 2017 (UTC)

Wikidata links to English Wikipedia draftspace

There is a discussion relevant to the above topic, here. All comments welcome. --Euryalus (talk) 11:31, 28 January 2017 (UTC)

you know, that link they are referring to in the post, you can thank @MSGJ: for that. MechQuester (talk) 14:31, 28 January 2017 (UTC)

The message there is:

The EnWiki community requests that Wikidata establish a policy against linking to our draft space. Drafts are intended as an internal workspace. Drafts may contain inappropriate or problematical content. External consumption of drafts is undesired, and is strongly discouraged.

So far the straw poll is a unanimous 4 out of 4 (update: 8 out of 8, and I closed the poll, 21:18, 29 January 2017 (UTC)), endorsing it pretty strongly. If the Wikidata community is agreeable, I'd like to follow up by investigating the possibility of some sort of technical barrier to entering draft space links. Alsee (talk) 19:22, 28 January 2017 (UTC)

According to WD:N Draft namespace already isn't accepted as a valid sitelink. Mbch331 (talk) 19:32, 28 January 2017 (UTC)
Mbch331, perhaps I am reading it differently because there was in fact a Draft space link, and an open Phabricator task requesting an upgrade to the functionality of draft links, but I can easily see reading it as not targeting draft links at all. That's the Notability policy for Wikidata items. In order to qualify as Notable, the item needs to satisfy one of the listed criteria. The data item did satisfy the criteria: it had links to Italian and French articles. So Notability was satisfied. The item was valid. Then someone thought it helpful to add a draft link exactly matching the topic. That cannot diminish the already established Notability. Maybe my reading is biased by the circumstances, but it couldn't hurt to target the issue more directly. Alsee (talk) 20:35, 28 January 2017 (UTC)
Can we scan for any other Draft links that might exist? Alsee (talk) 20:48, 28 January 2017 (UTC)
The issue isn't about whether the item is notable. Enwiki doesn't want external links to point to its draft namespace.
Policy-wise I see no reason why we should go against the wishes of Enwiki on this point. Such links should be removed. It might also make sense to prevent the addition of those links technically. ChristianKl (talk) 21:30, 28 January 2017 (UTC)
Yes, we should do it at the mediawiki level. Draft on en.wp is not very much different from the user subspace, and we do not link to those.--Ymblanter (talk) 22:36, 28 January 2017 (UTC)
  • Comment: Definitely agree that a sister wiki should be able to define which namespaces are outward facing for notability. Draft: namespace pages at English Wikipedia are not articles, and should not be linked here.  -- billinghurst sDrewth 10:25, 29 January 2017 (UTC)
This has nothing to do with notability. Even for clearly notable Wikidata items (for example ones that have sitelinks to the French and German Wikipedia) enwiki doesn't want us to link to their draft namespace. ChristianKl (talk) 19:21, 1 February 2017 (UTC)
  • Comment: why technical barrier? why not flag? why policy? why don't english wikipedia editors edit here, rather than straw polls there? you realize people will link to draftspace elsewhere to manage the list, i.e. english does not control inbound linking. - just because english wants to get spun up over a year old ticket with no action, does not mean wikidata needs to take any action. Slowking4 (talk) 13:07, 29 January 2017 (UTC)
    Slowking4, "english does not control inbound linking" - Correct. It's a polite request from one of our communities to another of our communities. "does not mean wikidata needs to take any action" - Correct. It's a request. "why technical barrier" Because EnWiki considers these links to be undesirable, and I believe/hope the Wikidata community considers these links to be undesirable, and on EnWiki we sometimes use edit filters to prevent some categories of undesirable links or other content. If anything, it seems even more in line with Wikidata philosophy and design to place constraints on entered content. (Good luck trying to enter date information for a Solargraph, Wikidata just locks up the save button with no visible way to proceed.) Alsee (talk) 21:00, 29 January 2017 (UTC)
why are you linking to a commons image with screwed up metadata? there is no wikidata there. i would suggest you need to produce some evidence if you want your concerns taken seriously. Slowking4 (talk) 13:50, 31 January 2017 (UTC)
I was linking to a random solargraph to illustrate what I meant. The date of that particular image is January 1 - December 31, 2014. Normal wikis handle that just fine. There has been discussion of switching Commons metadata to use Wikidata. Wikidata doesn't allow date information to be entered for solargraphs or any other date-range content. It just locks up the data field, with no apparent way to proceed. Alsee (talk) 17:26, 31 January 2017 (UTC)
you could proceed without a date. you could enter a year. Slowking4 (talk) 03:54, 1 February 2017 (UTC)
  • This is an issue on how "article" is defined. Articles should link only to articles. English Wikipedia's Draft: namespace isn't articles, so articles shouldn't be linked to there. Od Mishehu (talk) 06:42, 30 January 2017 (UTC)
I think this is a good opportunity to combine the wishes: enwiki doesn't want Draft: pages to be linked and Wikidata doesn't want to host the links, or at least doesn't consider them notable. I support ChristianKl's idea of technical prevention. Still, if a page is moved from mainspace to Draft: on enwiki, what will happen? Lymantria (talk) 06:47, 30 January 2017 (UTC)
We don't have a policy of only linking to articles. We also have items for templates. Do you think there's a reason why enwiki templates shouldn't be linked on Wikidata and thus have links to the versions of the template in other languages? ChristianKl (talk) 19:26, 1 February 2017 (UTC)
Templates on enwiki are the equivalent of their counterparts on other wikis; drafts on enwiki are not the equivalent of articles. Od Mishehu (talk) 21:52, 5 February 2017 (UTC)
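
The scan for existing Draft: links that was asked about earlier in this thread could, in principle, be run over item data. The following Python sketch checks one entity dictionary shaped like Wikibase entity JSON for a sitelink whose title starts with a namespace prefix. The function name, the default arguments and the example item are assumptions for illustration, not an existing tool.

```python
def draft_sitelink(entity, site="enwiki", prefix="Draft:"):
    """Return the sitelink title if it lies in the given namespace, else None."""
    link = entity.get("sitelinks", {}).get(site)
    if link and link.get("title", "").startswith(prefix):
        return link["title"]
    return None


def items_with_draft_links(entities, site="enwiki", prefix="Draft:"):
    """Return (item ID, title) pairs for all entities linking into the namespace."""
    found = []
    for entity in entities:
        title = draft_sitelink(entity, site, prefix)
        if title is not None:
            found.append((entity["id"], title))
    return found
```

Applied to a hypothetical item whose enwiki sitelink is `Draft:Example`, the scan reports that item; items linking to ordinary articles, or with no enwiki sitelink at all, are skipped.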

Semiprotecting properties by default

Hi everyone,

I wonder if we should semiprotect the entire Property namespace, as properties aren't so easy to improve and too easy to vandalise (with an enormous and unpredictable impact). Becoming an autoconfirmed user (or receiving the confirmed flag) is really easy, and unregistered users will always be able to edit the Property_talk namespace or send requests to other users to add, for example, new labels or aliases, so I think that this protection shouldn't prevent anyone from contributing to Wikidata in any way.

What do you think? --abián 18:27, 28 January 2017 (UTC)

  • Support ChristianKl (talk) 21:30, 28 January 2017 (UTC)
  • Support, indeed, I do not currently see any drawback in protecting all the properties.--Ymblanter (talk) 22:32, 28 January 2017 (UTC)
  • Support Good idea. --Jklamo (talk) 23:05, 28 January 2017 (UTC)
  • Oppose Semi-protection may be a good idea for widely used properties like instance of (P31), but I don't believe it's a problem in the first place. I have about 20 to 30 property pages on my watchlist and I have never seen one of these pages vandalised. --Fralambert (talk) 23:28, 28 January 2017 (UTC)
  • Oppose per Fralambert. This would also discourage, if not prevent, new users/ IPs from adding much-needed labels in smaller languages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:39, 28 January 2017 (UTC)
    • Comment, edit conflict: A look shows us that (almost?) all the recent edits made by unregistered users are tests or vandalism. Unfortunately, I don't think that this proportion of valid to total edits is going to improve at any point in the future. However, the editing frequency, in general, will continue to increase (and, with it, the amount of vandalism on properties) as Wikidata becomes more and more reachable from the Wikipedias. Instead of only showing a padlock, we could include a message informing the user that, if they want to contribute to the property without having a registered account (something that I see as extremely unusual), we encourage them to leave a request (link to an appropriate new pre-filled message on the talk page, or on the project chat, or on a specific page for these cases). --abián 00:28, 29 January 2017 (UTC)
  • Comment While I see the potential for vandalism here, the property namespace is hardly edited by non-patrollers anyway (about 5 to 10 edits per day), and as far as I see, non-constructive edits are reverted faster there than elsewhere. --YMS (talk) 00:08, 29 January 2017 (UTC)
  • Oppose we need labels in all languages.. Once we have them for all 280+ languages maybe.. Thanks, GerardM (talk) 00:17, 29 January 2017 (UTC)
We do need labels for all languages; however, we need good labels for all languages. Labels for small languages are much harder to patrol. Having people who actually understand how Wikidata works write those labels makes some sense. ChristianKl (talk) 06:09, 29 January 2017 (UTC)
  • Support We don't have the resources to watch all properties in all languages. As it is hard to know whether a label in a given language is vandalism without knowing the language, and we don't have people monitoring the changes in all our languages, it is a good idea to semiprotect the labels. --Micru (talk) 08:52, 29 January 2017 (UTC)
  • Oppose Until someone can offer an actual demonstrated benefit to this change. This is a proposal, but with absolutely no evidence as to why it would be a good change. Jo-Jo Eumerus (talk, contributions) 09:59, 29 January 2017 (UTC)
  • Neutral with tendency to oppose. I don’t see what could be meant by “vandalise with an enormous and unpredictable impact”; if someone can provide examples, I’d reconsider my decision. --MisterSynergy (talk) 10:12, 29 January 2017 (UTC)
    With that comment I mean that Wikidata isn't only what we see in Wikidata; it is (and we want it to be) a knowledge base used by the other Wikimedia projects, third-party projects and lots of external applications of all kinds (currently, even Google uses it for its searches). While vandalising a label for a property can mislead some users in Wikidata without greater impact (as labels aren't interpreted by machines, only by humans), other fatal changes, in the worst-case scenario, could harm, until the vandalism is reverted, the entire Wikidata ontology and the we-don't-know-which projects and applications that could load the ontology in that state. --abián 11:33, 29 January 2017 (UTC)
    That’s still pretty abstract. I can imagine that changing URL patterns could be malicious for instance, or maybe even changes in equivalent property (P1628). However most information in property items is not critical, and I would prefer to keep properties unprotected for now. As an alternative: can we technically use abuse filters for critical parts (e.g. prevent URL pattern changes by anons and new users)? —MisterSynergy (talk) 12:28, 29 January 2017 (UTC)
    Yes, we can. I hadn't thought of that possibility, and I like the idea if there's no consensus on this. I would also like to have a filter that lets unregistered users add new labels but doesn't let them modify or remove the existing ones, but this filter wouldn't be possible because users couldn't revert their own mistakes. --abián 13:05, 29 January 2017 (UTC)
  • Comment I am not averse to the proposal if we can demonstrate a low percentage of useful edits, though I would like to see if we can utilise other tools to weed out bad edits first. We should be able to utilise abuse filter rules to more easily monitor changes in that namespace, and test and challenge IP edits, or brand new accounts on their edit with a constructive message, and let confirmed accounts pass through.  — billinghurst sDrewth 10:17, 29 January 2017 (UTC)
  • Support I see 2 reasons to limit changes in properties: the first is vandalism, and the second is to avoid contributors changing the scope of a property by modifying the label/description/statements once they have been set up. But this should be done only if a team of translators can provide the majority of the labels/descriptions after the property creation. Snipre (talk) 10:30, 29 January 2017 (UTC)
  • Support Wylve (talk) 10:35, 29 January 2017 (UTC)
  • Comment have you tried flagging / reverting edits with ORES? protection and filters should be a last resort, not a first resort. Slowking4 (talk) 12:59, 29 January 2017 (UTC)
  • My experience is that changes in items linked by country (P17) today affect large parts of svwiki. Vandalism there is de facto a larger problem than in the property namespace. The effect of vandalism in the property namespace is potentially more critical, but such vandalism is in reality very limited. -- Innocent bystander (talk) 13:23, 29 January 2017 (UTC)
  • Interesting. It could be useful to protect them from vandalism. MechQuester (talk) 13:48, 29 January 2017 (UTC)
  • It would be cool having the possibility of (semi)protecting statements in properties, but leaving labels and descriptions aside. Strakhov (talk) 14:44, 29 January 2017 (UTC)
  • Info @Abián If your intention is to get a decision, please start a Wikidata:Requests for comment. --Succu (talk) 21:16, 29 January 2017 (UTC)
  • Oppose. Properties require labels and descriptions in many languages. However, these are not public facing, so vandalism to them isn't that problematic. --Yair rand (talk) 21:19, 29 January 2017 (UTC)

Alternative proposal

I propose that:

  1. Before phab:T47224 is fixed:
    1. Non-autoconfirmed users can only add (not change or remove) labels, descriptions, aliases and claims of a property;
    2. And probably in addition, IP users may not edit properties (unclear whether this is a good idea, maybe we should only enforce the former).
    Both may be enforced by abusefilters.
  2. After phab:T47224 is fixed: any non-autoconfirmed user must provide a summary while editing properties. Later the restriction(s) above may be rescinded if the level of vandalism becomes lower.--GZWDer (talk) 19:49, 5 February 2017 (UTC)
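
The add-only rule in point 1.1 can be pictured as a comparison of a property's terms before and after an edit. The following is a minimal Python sketch of that logic only; it is not AbuseFilter syntax, and the function names and the plain-dict representation of labels are assumptions made for illustration.

```python
def is_add_only(old_terms, new_terms):
    """Return True if new_terms only adds entries to old_terms.

    Both arguments map a language code to a label (or description).
    The edit passes when every existing entry is still present and unchanged.
    """
    return all(new_terms.get(lang) == text for lang, text in old_terms.items())


def allow_edit(is_autoconfirmed, old_terms, new_terms):
    # Hypothetical policy: autoconfirmed users are unrestricted,
    # everyone else may add terms but not change or remove them.
    return is_autoconfirmed or is_add_only(old_terms, new_terms)


before = {"en": "country"}
print(allow_edit(False, before, {"en": "country", "de": "Staat"}))  # adding a label
print(allow_edit(False, before, {"en": "something else"}))          # changing a label
```

As noted earlier in this thread, a rule of this shape also prevents a new user from reverting their own mistaken addition, which is one reason it may not be desirable as it stands.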

Quick statements

Is it possible to add qualifiers to existing statements, using e.g. Quick Statements?

I tried, but it seemed that Quick Statements wouldn't re-write the old statement to add qualifiers unless I had first deleted it -- but there doesn't seem to be the possibility to include the deletion in the batch process. Jheald (talk) 00:29, 6 February 2017 (UTC)

On the other hand, it seems that Magnus's new experimental Quick Statements version 2 can do this, so all looks good. Jheald (talk) 00:44, 6 February 2017 (UTC)
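
For anyone preparing such a batch, here is a small Python sketch that builds tab-separated command lines in the classic QuickStatements layout (item, property, value, followed by qualifier property/value pairs). The helper name `qs_line` and the example identifiers are only illustrative assumptions; whether the qualifier is merged into an existing statement depends on the tool version, as described above.

```python
def qs_line(item, prop, value, qualifiers=()):
    """Build one tab-separated QuickStatements command.

    qualifiers is a sequence of (property, value) pairs that are
    appended after the main statement.
    """
    fields = [item, prop, value]
    for q_prop, q_value in qualifiers:
        fields += [q_prop, q_value]
    return "\t".join(fields)


# Illustrative only: the sandbox item, with a start time (P580) qualifier.
line = qs_line("Q4115189", "P17", "Q142",
               [("P580", "+2017-01-01T00:00:00Z/11")])
print(line)
```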

flaws in property ethnic group (P172)

There are a couple of problems with the property ethnic group (P172):

  • There is a requirement for sourced claims on Property talk:P172, but apparently more than 97% of the claims are unsourced (~30600 of 31400 as far as I see). There is obviously no possibility to repair this problem by adding sources to all of these claims. Shall we either remove the requirement for sources, or enforce the requirement otherwise (e.g. by removal of unsourced claims)? We had a similar situation with sexual orientation (P91) last summer/fall with ~5000 affected claims (which were removed). As in the sexual orientation case, information about the ethnic group can be quite delicate for the persons whom our items describe.
  • This property was created by @אבגד in March 2013, without a formal discussion as the talk page claims. I have no idea whether this was normal back then, but is this okay?
  • The constraint violations (covi) page lists plenty of values which are not properly defined. This could have to do with the difficult definition of an abstract concept such as the one of this property. Could we please clearly define this property, or shall it be one of the messy “everything links to everything” properties in future? I am unfortunately not very qualified to work on the definition in this field…
  • Recently a couple of cases appeared in which (anon) users added many values on this property within an item about a single person (example Q28445472#P172 see below). Is this property really suitable for such an approach?

I’d like to raise attention for this property in the community and invite all Wikidata users to participate in this discussion. Depending on the outcome we might want to take some action or not… Thanks, MisterSynergy (talk) 09:24, 30 January 2017 (UTC)

Responding to some of your bullet points:
  • It appears that only 196 statements have a proper source, so removing all the unsourced ones would delete almost all the data for this property. We're supposed to notify projects which are using the property before making big changes, so if we do want to delete all the unsourced ones, I would suggest that we tell the projects using our data and give them a period of time to either add sources to the existing data or add the data locally before doing any mass deletions.
  • I don't think the way it was created really matters at this point. If we think it's useful, we should keep it, if we don't, we should delete it.
  • There is one anonymous user with a frequently changing IP address who keeps adding lots of ethnic group statements (who is responsible for both of your examples and probably most of the other cases of multiple statements). Without a better definition of what this property should contain, it's hard to say whether or not it's right to do that. However, that user has been repeatedly blocked for bad edits (not only the ethnic group statements); if you spot them editing again, it's probably worth letting the admins know.
- Nikki (talk) 11:46, 3 February 2017 (UTC)
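
The two counts in this thread differ depending on what is accepted as a source. One possible reading, which is an assumption here and not something stated above, is that references consisting only of imported from Wikimedia project (P143) are not counted as proper. A minimal Python sketch of that check against statements in the Wikibase JSON layout (function names are hypothetical):

```python
IMPORTED_FROM = "P143"  # "imported from Wikimedia project"


def has_proper_source(statement):
    """True if at least one reference uses a property other than P143."""
    for reference in statement.get("references", []):
        if any(prop != IMPORTED_FROM for prop in reference.get("snaks", {})):
            return True
    return False


def split_by_sourcing(statements):
    """Split statements into (properly sourced, everything else)."""
    sourced = [s for s in statements if has_proper_source(s)]
    unsourced = [s for s in statements if not has_proper_source(s)]
    return sourced, unsourced
```

A list produced this way could be shared with the projects that use the data before any mass deletion, so that they can add sources within the notice period.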
Thanks for your response.
  • We had a similar situation with the mentioned sexual orientation (P91) property last year, and I finally ended up removing ~5.000 unsourced claims with that property after discussion here at WD:PC (links can be found on Property talk:P91). Until now only ~25 unsourced claims have re-appeared, but most of the data was lost during this process (it is likely still available via categorization etc. in Wikipedias). In this case of ethnic group (P172) I do not necessarily want to suggest the same action, but the obvious discrepancy between the theoretical source requirement and the practical lack of sources worries me. I therefore asked for consensus on either a removal of data or a removal of this requirement, with no real personal preference for either of these options.
  • Regarding the definition: I have no useful knowledge about the concept of ethnic groups, thus I can’t help much in a definition process. This topic only came to my attention due to RC patrolling, where lots of questionable edits showed up recently. If this situation of lacking definition is not going to be solved soon, we end up with another messy “everything-links-to-everything” property which is no longer going to be useful for any application. We should seriously avoid a situation where tons of crappy legacy claims pile up that we practically cannot get rid of just because it is heavily used in Wikipedias. The earlier these problems are fixed, the less trouble we have. Unfortunately we don’t have useful procedures in place.
Again, thanks for your comment. I hope that more editors add their opinion on this problem here. Regards, —MisterSynergy (talk) 12:11, 3 February 2017 (UTC)
Like you, I also have no real preference and don't really know enough about it to help define it. :/ I don't have any objections to removing them, I would just like to see it done in a way which is sensitive to the other projects using our data - we don't want to put people off using our data because we keep suddenly mass deleting things without any warning. - Nikki (talk) 16:27, 3 February 2017 (UTC)
I would be fine with removal of unsourced data via a bot. ChristianKl (talk) 16:07, 3 February 2017 (UTC)

A large number of these additions were made by me. I took the information about ethnicity from the sources attached to the items with described by source (P1343). I didn't attach those sources directly to the claims, because all or nearly all of the properties have the same source. The claims are almost exclusively ethnic group (P172): Armenians (Q79797), with Armenian Soviet Encyclopedia (Q2657718) and Karabakh War 1988–1994 (Q16392167) as sources. Sources (books) that do not have a WD item yet are planned to get one and be attached. - Kareyac (talk) 05:51, 5 February 2017 (UTC)

@Kareyac: I already noticed that, since Armenians (Q79797) is heavily used with this property. If almost all of your claims have the same source, it might be possible to add sources to your claims automatically. The amount of extra items would be small, the major task would be to compile a list of “item, value, source claims”, e.g. in an Excel sheet. —MisterSynergy (talk) 21:28, 5 February 2017 (UTC)
@MisterSynergy: Thanks for the good advice; I'll follow it when I find a Kareyac-friendly tutorial for QuickStatements. I got stuck at importing a list of QXXX-style WD items from PetScan. Would it be easier and faster to ask a bot operator to copy the references from described by source (P1343) to ethnic group (P172) of the same item? That could be more effective, because many described by source (P1343) statements include details such as page(s) (P304), which differ from item to item. - Kareyac (talk) 07:09, 6 February 2017 (UTC)
It would be important to compile complete references according to Help:Sources. What exactly they look like depends on the reference’s type of work (website, database, book, etc…). For printed books (as in your case, as far as I can see), one particularly needs the chapter/page in a given edition of this work for each claim separately. If that information was available in a spreadsheet, it is indeed possible to add complete references to existing claims by bot (without that much bot knowledge), but as far as I know not with QuickStatements.
The sheer number of affected claims might be a problem. Following this query, there are ~8.200 claims “P172:Armenian”. We need page/chapter information for each one separately… —MisterSynergy (talk) 12:49, 6 February 2017 (UTC)
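
Before handing such a spreadsheet to a bot operator, it may help to check which rows still lack page information. The sketch below assumes a CSV export with the hypothetical column names `item`, `value`, `source` and `pages`; neither the column names nor the function name come from any existing tool, and the item ids in the sample are placeholders.

```python
import csv
import io


def rows_missing_pages(csv_text):
    """Return (item, value) pairs whose row has no page information."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["item"], row["value"])
        for row in reader
        if not (row.get("pages") or "").strip()
    ]


sample = (
    "item,value,source,pages\n"
    "Q4115189,Q79797,Q2657718,123\n"
    "Q13406268,Q79797,Q2657718,\n"
)
print(rows_missing_pages(sample))
```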


How can the civil service rank Active State Councillor (Q2623484) be added to a person’s item? noble title (P97) looks like a possible fit. - Kareyac (talk) 07:38, 6 February 2017 (UTC)

Quality Criteria for Building a Tool to Evaluate Item Quality

Hi everyone!

As we all know, data quality is very important for Wikidata. We always strive to improve our data quality, so that we can make it usable for more use cases.

At the moment, I am exploring the usage of machine learning to evaluate item quality. I am aiming to develop a kind of ORES for evaluating item quality. This will allow us to grade item quality. Furthermore, it enables us to easily find low quality items and fix them. As a result, we can improve the data quality of the whole Wikidata.

As the first step in developing the mentioned tool, I am going to launch a campaign for grading the quality of Wikidata items in the near future. I have drafted criteria for each quality grade, based on Wikidata:Showcase_items.

I would like to ask whether you think the criteria for each quality grade are appropriate. Any suggestions pertaining to this would be welcome!

Proposed criteria

You can find the criteria below:

Grade A

  • Large number of statements with:
  1. References for non-trivial statements (references other than Wikimedia projects)
  2. Appropriate ranks
  3. Qualifiers
  • Large number of completed translations: labels, descriptions, and labels of used properties
  • All appropriate sitelinks to corresponding Wikimedia projects
  • All appropriate aliases exist in most languages
  • If applicable, there is an image associated with the item

Grade B

  • Good number of statements with:
  1. References for non-trivial statements (references other than Wikimedia projects)
  2. Appropriate ranks
  3. Qualifiers where applicable
  • Good number of completed translations: labels, descriptions, and labels of used properties
  • A few missing sitelinks to corresponding Wikimedia projects. You might need to check whether the item actually has an article in other Wikimedia projects (e.g. Wikipedia) but lacks the sitelink to that article.
  • Almost all appropriate aliases exist in most languages
  • If applicable, there is an image associated with the item

Grade C

  • Moderate number of statements with:
  1. References for non-trivial statements (references other than Wikimedia projects)
  2. Appropriate ranks
  3. Qualifiers where applicable
  • Moderate number of completed translations: labels, descriptions, and labels of used properties
  • Moderate number of missing sitelinks to corresponding Wikimedia projects. You might need to check whether the item actually has an article in other Wikimedia projects (e.g. Wikipedia) but lacks the sitelink to that article.
  • Appropriate aliases exist only in some languages
  • Although it is applicable, there is no image associated with the item

Grade D

  • A few statements with:
  1. References for non-trivial statements (references other than Wikimedia projects)
  2. Appropriate ranks
  3. Qualifiers where applicable
  • A few completed translations: labels, descriptions, and labels of used properties
  • Many missing sitelinks to corresponding Wikimedia projects. You might need to check whether the item actually has an article in other Wikimedia projects (e.g. Wikipedia) but lacks the sitelink to that article.
  • Appropriate aliases exist only in a few languages
  • Although it is applicable, there is no image associated with the item

Grade E
All items that do not match grade “D” criteria.

You may find the criteria vague. For instance, for “large number of completed translations”, how do we define/quantify ‘large number’? However, this is intentional, as I want the campaign participants to use their common sense (i.e. their own definition of ‘large number’) in grading the item quality.

Thanks a bunch! :)

--Glorian Yapinus (WMDE) (talk) 15:09, 30 January 2017 (UTC)

  • Happy to see you're having a shot at this. I already noticed some tasks floating around in Phabricator. I'll break down the reasoning a bit more. Things we could measure that indicate quality:
  • Labels (the more the better)
  • Aliases (already a bit more complicated, depends on the domain if aliases are even available)
  • Description (the more the better)
  • Statements
    • The number of statements
    • The number of unique P statements (this would filter out items which have a lot of statements of one property type)
    • The statement coverage for a domain, for example for instance of (P31) -> human (Q5) we expect sex or gender (P21) and some others. This will enable wikiprojects to define quality criteria for their subject area just like
    • Something with qualifiers
    • Something with ranks (doesn't seem to be used a lot)
    • References: real references, and a lot of good ones
    • Quality of the linked items. For an item to reach the highest class, the linked items need to have a minimum quality too otherwise I'll see a lot of statements not in my language
Sitelinks are a bit tricky. Basically everything on Wikipedia is linked with something here because people ran bots. They could be used the other way around: if an item has a sitelink in language qqq, the label and description need to be set in language qqq to reach a certain level.
I would not invent a new scale but just base it on en:Wikipedia:Version_1.0_Editorial_Team/Assessment#Quality_scale. Multichill (talk) 15:34, 30 January 2017 (UTC)
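
To make the list of measurable signals above concrete, here is a minimal Python sketch that counts them for one item in the standard Wikibase JSON layout (`labels`, `descriptions`, `aliases`, `claims`, `sitelinks`). The function name and the choice of counters are assumptions for illustration; this is not how ORES or any existing tool computes its features.

```python
def item_features(entity):
    """Count simple completeness signals for one item (Wikibase JSON dict)."""
    claims = entity.get("claims", {})
    statements = [s for group in claims.values() for s in group]
    return {
        "labels": len(entity.get("labels", {})),
        "descriptions": len(entity.get("descriptions", {})),
        "aliases": sum(len(a) for a in entity.get("aliases", {}).values()),
        "statements": len(statements),
        # Distinct properties: filters out items with many statements
        # that all use the same property.
        "unique_properties": len(claims),
        "with_qualifiers": sum(1 for s in statements if s.get("qualifiers")),
        "with_references": sum(1 for s in statements if s.get("references")),
        "non_normal_rank": sum(
            1 for s in statements if s.get("rank", "normal") != "normal"
        ),
        "sitelinks": len(entity.get("sitelinks", {})),
    }
```

Counts like these only describe how much an item contains; they say nothing about whether the statements are correct, and the quality of linked items would need a second pass over the items that the statements point to.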
I've boldly created Wikidata:Item quality and filled it with this proposed scale so that we can iterate there. (No harm keeping the conversation here for now though.) Multichill, I'm a fan of adopting the enwiki language -- e.g. "Stub" through "Featured Item". But I'm not sure we can directly make use of the criteria for each level of the enwiki scale in Wikidata. We'll need to iterate on what we *mean* for each level of the scale. It's critical that we capture the core idea in the description of this scale and that we have general agreement among those who will help us do the labeling so that ORES can learn from a consistent set of assessments. --EpochFail (talk) 16:00, 30 January 2017 (UTC)
Multichill, thanks for the suggestion! I think I have added most of the criteria that you mentioned. Regarding the item description in particular, I think we cannot rely on it to measure item quality, because there are a lot of high quality items (e.g. showcase items) which have short descriptions. --Glorian Yapinus (WMDE) (talk) 14:13, 31 January 2017 (UTC)
Glorian: Item description would be a boolean per language (either you have it or not). The more languages that have description, the better. Same as for labels. Multichill (talk) 16:17, 31 January 2017 (UTC)
Gotcha. Thanks Multichill! --Glorian WD (talk) 16:13, 1 February 2017 (UTC)
  • Instead of one quality indicator I would propose 3 indicators:
  • one about labels and descriptions (number of labels, number of labels with a description)
  • one about the number of statements with references (number of statements with a reference, number of statements with several references)
  • one about the number of critical statements according to the class of the item. In the case of people, having both birth and death date with corresponding locations is more valuable than having an item without any birth and death information but plenty of identifiers.
Snipre (talk) 16:16, 30 January 2017 (UTC)
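
The third indicator could be sketched as the share of expected properties that are present for the item's class. In this Python sketch the expected list for humans (date and place of birth, date and place of death) follows the example above, but the table, the function name and the scoring are assumptions; a real list would have to be defined per class, and living people would need different handling.

```python
# Hypothetical table: class item -> properties considered critical.
EXPECTED = {
    "Q5": ["P569", "P19", "P570", "P20"],  # human: birth/death date and place
}


def critical_coverage(claims, item_class):
    """Return the fraction of expected properties present, or None if unknown class."""
    expected = EXPECTED.get(item_class)
    if not expected:
        return None
    present = sum(1 for prop in expected if prop in claims)
    return present / len(expected)


# An item with birth data only, plus an identifier (VIAF ID, P214).
print(critical_coverage({"P569": [], "P19": [], "P214": []}, "Q5"))
```

Keeping this value separate from the label and reference indicators lets each group of contributors sort by the dimension they actually work on.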
Snipre, are you referring to 3 quality indicators for each quality grade? For instance, quality grade A has 3 indicators: one about labels and descriptions (number of labels, number of labels with a description), one about the number of statements with references (number of statements with a reference, number of statements with several references), and one about the number of critical statements according to the class of the item. --Glorian Yapinus (WMDE) (talk) 13:42, 31 January 2017 (UTC)
@Glorian Yapinus (WMDE): Yes, because contributors can have different objectives: some to work more on translations so they will try to improve items poorly translated in terms of label/descriptions, some want to improve the statement quality by adding good sources,... Snipre (talk) 18:32, 31 January 2017 (UTC)
Breaking that quality indicator into many indicators could enable us to add later more special indicators. For example an additional indicator about the quality of film items (have they many or few film data, ...) or biography items (born when and where, occupation, ...). --Molarus 23:40, 31 January 2017 (UTC)
I would say this is a very good idea. But I think that at this stage we should start with one quality indicator and see how it goes. If it turns out to be a useful tool, we can try to implement 3 quality indicators. Nevertheless, I could consider adding "number of critical (expected) statements according to the item class" to the proposed quality criteria. Thanks for your feedback Snipre, Molarus! --Glorian WD (talk) 16:37, 1 February 2017 (UTC)
  • In terms of statements, one should also look at the precision/fine-grained-ness of the statements. For example for location-type statements, is the given location just a country, or is it a city.
It would also be good to think more explicitly about completeness -- if an item includes administrative entities, are all the relevant entities included?
These kinds of metrics need to be developed for more properties. Jheald (talk) 17:44, 30 January 2017 (UTC)
Could you give me a specific example of what you mean by "for location-type statements, is the given location just a country, or is it a city"? I agree that we should take relevant properties in items into account. I will consider that. Thank you Jheald. --Glorian WD (talk) 16:43, 1 February 2017 (UTC)
The classic one is located in the administrative territorial entity (P131), pointing to something other than the most detailed applicable level. For example, at the moment for Eiffel Tower we have P131 -> 7th Arrondissement, which is good. But other monuments may have P131 -> Paris or even P131 -> France. So one question we should be trying to formulate, and track, is: are the values as detailed as we would hope for? Jheald (talk) 17:45, 1 February 2017 (UTC)
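
One way to track this is to measure how many administrative levels a given P131 value sits above the most detailed applicable one. The Python sketch below uses a deliberately simplified toy hierarchy with plain names instead of item ids; the real chain has more levels, and both the table and the function names are assumptions for illustration.

```python
# Toy parent table: entity -> the entity it is located in (None at the top).
PARENT = {
    "7th arrondissement": "Paris",
    "Paris": "France",
    "France": None,
}


def depth(entity):
    """Number of steps from entity up to the top of the hierarchy."""
    steps = 0
    while PARENT[entity] is not None:
        entity = PARENT[entity]
        steps += 1
    return steps


def levels_missing(given, most_detailed):
    """How many levels the given value is above the most detailed one."""
    return depth(most_detailed) - depth(given)


print(levels_missing("France", "7th arrondissement"))
```

A value of 0 means the statement is as detailed as hoped for; the hard part in practice is knowing the most detailed applicable level for each item, which this sketch simply takes as an input.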
Got it. Thanks! --Glorian WD (talk) 20:10, 1 February 2017 (UTC)
  • Well, wonderful. But what does it bring us and how does it add value? It is in my opinion something that adds little. I do not care about "categories", I care about quality. This is a stand-alone construct that has not much to do with quality. Thanks, but no thanks. GerardM (talk) 22:33, 30 January 2017 (UTC)
Hi GerardM, I think if we can grade the item quality, it paves the way to develop a tool for quickly discovering low quality items. This allows those items to be immediately fixed (i.e. improve the item quality). Ultimately, we can increase the quality of items in Wikidata. --Glorian Yapinus (WMDE) (talk) 15:32, 31 January 2017 (UTC)
Check out our statistics. We have 2,609,121 items with no statements. We have 5,667,036 with only one statement. Please focus attention on the things that matter. THIS is where quality is lacking. Thanks, GerardM (talk) 18:44, 31 January 2017 (UTC)
  • I think it would be more useful to have a tool to recognize which items are lacking attention... I see little point in giving a score to an item, but perhaps it would be more useful to recognize which items have potential for growth. It could be a metric based on item page views, sitelinks, number of statements and labels.--Micru (talk) 15:05, 31 January 2017 (UTC)
Hi Micru, if we are already able to grade item quality, then I think it is possible to develop a tool for quickly finding low quality items (i.e. items that have the potential to be improved further). --Glorian Yapinus (WMDE) (talk) 15:23, 31 January 2017 (UTC)
Finding an item that can be improved further is quite easy given that most items are in that class. Could you describe a user story of how you imagine this is going to be used? ChristianKl (talk) 17:21, 31 January 2017 (UTC)
  • +1 to Micru; the proposed criteria would produce ratings which strongly correlate with the absolute importance of the entity described, since the number of statements and/or sitelinks naturally favors important concepts. It would be valuable from an editor's point of view if fairly good grades (but perhaps not excellent grades) can be achieved for less important concepts and items as well. We’d otherwise just calculate relevance, not quality. The number of sitelinks is perhaps the best “external” measure of absolute relevance, so maybe we should use it for weighting of a rating, not as an additive input. —MisterSynergy (talk) 15:30, 31 January 2017 (UTC)
I think that we're using a wrong and partial definition of "quality". With the exposed points we're essentially measuring, in a broad sense, completeness, a quality dimension. In my opinion, we can't say that we're fully measuring quality if we're not measuring consistency.
Even to measure completeness in this broad sense, we should take into account that some statements using certain properties never, or hardly ever, need a reference, while it's terrible that some statements using certain other properties (such as charge (P1595) or date of death (P570)) have no references. As well, not having a label/description in English is much worse than not having it in Kölsch (with all my respect to Kölsch).
Thanks for your help! --abián 16:57, 31 January 2017 (UTC)
PS: I like the idea of classifying items in degrees of completeness. ;-) --abián 17:06, 31 January 2017 (UTC)
  • Could you explain why you think building a tool to automatically grade items is a valuable usage of Wikimedia development resources? It feels to me like there are many more important priorities like those on the community wish list. ChristianKl (talk) 17:21, 31 January 2017 (UTC)
    Indeed, I absolutely agree, there are other priorities which are more important; for example, detecting and preventing constraint violations and vandalism, which are expected to grow over time without a large community that can watch and avoid them. However, Glorian Yapinus is a student who may want to work on completeness, and it's better to have Glorian helping Wikidata by measuring its completeness than not to have Glorian helping Wikidata. I appreciate his decision and thank him for his help. --abián 17:58, 31 January 2017 (UTC)
Okay, I didn't know that Glorian is a student. Given that he's listed as WMDE I thought of him as a normal employee. ChristianKl (talk) 22:30, 31 January 2017 (UTC)
  • (EC) Wow! Lots of skepticism here. Well, first of all, the article quality prediction models for Wikipedia have been a breakaway success. "Completeness" is a good way to think about what these models capture. It's incredibly useful to know how complete a Wikipedia article is. People use them to route edit recommendations work (e.g.), measure the completeness of subsets of the wiki (e.g. slides), and to help students know when their article drafts are ready to be published in the main namespace (see the WikiEdu dashboards). I can't compare the urgency of building this model as opposed to working on other wishlist items, but I think this is likely to be highly valuable and we're actually positioned to execute on it. See m:ORES. We've already built and deployed models to help catch vandalism in Wikidata (enable the mw:ORES review tool in your beta features). Now, we're working to bring Wikidata support in ORES to the same level as some of the big Wikipedias. This quality model is the next step. --EpochFail (talk) 18:03, 31 January 2017 (UTC)
    yes, break away success until it makes conclusions that paradoxically challenge sacred cows, like "be nice to vandals" then it's, "what POV cruft is that." unfortunately with ritualistic behavior, facts don't matter. as we see here, they did not invent it so who is this Epoch cat, and who is WMDE to tell us what to do? lol. Slowking4 (talk) 02:00, 2 February 2017 (UTC)
  • Having an algorithm to decide whether an article draft is ready to be published to the Wikipedia main namespace is useful, but there's no equivalent of draft items in Wikidata. In Wikipedia an article mainly stands on its own. If you add information about the spouse of Bob in Wikipedia, you can simply write in the article that Alice is the spouse. In Wikidata you create an extra item for Alice. If that extra item has few statements, the proposal here suggests it gets flagged as a low-quality item that needs improvement.
If I have a project in Wikipedia to improve the coverage of human muscles, then I might be interested in having a list of low quality articles about muscles. On Wikidata it works differently. I might instead look at a property like innervates (P3190) and work through adding information about it to our muscle items.
When it comes to adding translations the process is similar. I don't decide that a certain item doesn't have enough translations and seek to translate the specific item into 10 new languages. I rather look at a group of items that are missing labels in a language I know and add the data.
When it comes to the data that Wikipedia imports, Wikipedia also doesn't directly care about item quality. It cares about statement quality. It cares about the individual statements providing true information that's referenced. An idea to judge the quality of Wikidata items by simple quantity metrics wouldn't help at all with those concerns.
In general quality data is data that's correct. Measuring quantity of statements in an item doesn't measure the quality. Measuring quantity is easier than measuring the actual quality but there are possible negative effects when we pretend that quantity and quality are the same thing. The proposed quality metric would call it quality to import a lot of low quality data from low quality databases provided the low quality database is referenced. ChristianKl (talk) 22:24, 31 January 2017 (UTC)
Thanks for your opinion ChristianKl. So if I understand you correctly, you think that quality is more about the accuracy of the data (i.e. whether the data is correct) than the quantity of data in items, right? --Glorian WD (talk) 16:00, 1 February 2017 (UTC)
One of the goals of Wikidata is to have data that the individual wikis want to import. I think that some of the data I added about "innervation" is higher quality than the data at some wikis because it's more precise. It contains information about which branch of a nerve innervates a muscle and not only the name of the general muscle. I consider being more precise a sign of quality.
If you look at the enwiki discussion we have had lately, their core data quality complaints are lack of references and false data. While there are other use cases of Wikidata than Wikipedia imports, it's an important use case for many contributors, and your quality metrics don't help with optimizing for Wikipedia imports.
If I look at the field of protein databases I would say that TREMBL has higher quantity and Swissprot has higher quality. Swissprot quality comes through it being better curated.
Instead of looking at item quality it might be worth to try to estimate property quality for a domain. If Wikipedia discusses whether to integrate a certain property into a template that's a decision where quality has to be assessed. Is the data in question of high enough quality to provide an added benefit for Wikipedia? ChristianKl (talk) 17:13, 1 February 2017 (UTC)
Swissprot has curated data, true. But that does not make its data of higher quality. Swissprot has a backlog that is huge. It is not up to date. My point is not to slag Swissprot but to break your argument. Quality comes not only from curation but also from completeness. If Wikidata is to make a difference, it is particularly important to import and curate data. It is best done by comparison with other sources. Quality is in having a practical relevance to all the sources that connect to us, including Wikipedias, including Swissprot. Thanks, GerardM (talk) 07:00, 2 February 2017 (UTC)
precision is different from accuracy or quality. [2]. it is unclear what quality of wikidata is necessary to merit inclusion among some communities. they seem to say they will only include with a reference, which is verification, but not quality. Slowking4 (talk) 02:07, 2 February 2017 (UTC)
What I have observed is that inclusion is arbitrary. It has to do with the agenda of people and the faulty perception that Wikidata is superior. It is not; we do our own thing. Thanks, GerardM (talk) 07:28, 7 February 2017 (UTC)
  • E should be a much more definite level: no information. This will help stretch and give definition to the other categories. --Izno (talk) 14:37, 7 February 2017 (UTC)

Quality but differently

When an item is to have quality, there are a few things to consider. First of all many items are associated with articles. Articles link to other articles and consequently these articles are related. When these article links are associated with statements in an item, the item is a reflection of the article and, that is quality.

Often items are associated with lists or categories. It would be good to consider what such a list or category stands for and, it is great when all the articles, the items are reflected when you research such an item. This is particularly true because fairly often the links in a project do not connect properly. When a list is completely available in Wikidata, there may be both red links and wikilinks that are associated with the list. Quite often different language editions have a different set of articles that are associated. Wikidata brings this all together.

Quality is not so much associated with items in Wikidata but with the relations to other items. When we seek to find items that are of high quality, we should concentrate on these relations because this is how we can even help Wikipedias to do better. Thanks, GerardM (talk) 12:12, 1 February 2017 (UTC)

Wikimedia Foundation is hiring community members as strategy coordinators

Hello all! At the moment, the Wikimedia Foundation is hiring 20 contractors - 17 strategy coordinators for specialized languages and 3 Metawiki coordinators. I am posting this on your noticeboard to reach out to any community members who would be both interested in being a part-time contractor with us for three months and a good fit for any of the movement strategy facilitation roles. Even if you are not personally interested in the position, we would appreciate your assistance in encouraging community members to apply, either individually or with local wiki announcements. You can find the Job Description for the position at this page. There is a less-formal description of the tasks they would be working on here on Meta. Kbrown (WMF) (talk) 19:08, 6 February 2017 (UTC)

Alexa rank

Should values for Alexa rank (P1661) with no date qualifier be removed? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:42, 7 February 2017 (UTC)

There are only 10 claims with this problem (assuming this quickly built query is okay). This could practically be fixed manually, at least if a source is provided. If not, I would recommend removing the value. —MisterSynergy (talk) 16:02, 7 February 2017 (UTC)
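The check discussed above (finding Alexa rank claims that carry no date qualifier) can be sketched in a few lines. This is only an illustration, not the query MisterSynergy used: it assumes the date qualifier in question is point in time (P585), that claims are available as dictionaries in the Wikibase JSON layout with a `qualifiers` mapping keyed by property ID, and the sample claims and their IDs are invented.

```python
def claims_missing_qualifier(claims, qualifier_pid="P585"):
    """Return the claims that carry no qualifier with the given property ID.

    Assumption: each claim is a dict shaped like Wikibase JSON, where
    "qualifiers" maps property IDs to lists of qualifier snaks.
    """
    return [c for c in claims if not c.get("qualifiers", {}).get(qualifier_pid)]


# Invented sample data: one dated claim, two undated ones.
sample_claims = [
    {"id": "claim-a", "qualifiers": {"P585": [{"time": "+2017-01-01T00:00:00Z"}]}},
    {"id": "claim-b", "qualifiers": {}},
    {"id": "claim-c"},
]

undated = claims_missing_qualifier(sample_claims)
print([c["id"] for c in undated])
```

With only a handful of hits, a list like this is short enough to review by hand, which matches the suggestion to fix the claims manually where a source is available.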

United States

I propose changing the English label for United States of America (Q30) to United States, for consistency with the English/Simple article names as well as common usage. Nikkimaria (talk) 00:16, 7 January 2017 (UTC)

I am not sure this is a good idea. It may well be that "United States" is unambiguous for native English speakers, but there are plenty of non-native speakers who use Wikidata. Consistency with the article names does not seem any kind of argument: plenty of cases where the label is (and must be) entirely different from an article name. - Brya (talk) 08:07, 7 January 2017 (UTC)
It is fine to add an alias but it is the official name. Following the names of any Wikipedia is an extremely bad idea. In many cases there is no proper fit between the article and the item. Thanks, GerardM (talk) 08:11, 7 January 2017 (UTC)
@Brya: Help:Label indicates that we should use "the smallest unit of information that names an item" even if that unit is ambiguous, and that we should use the most common name - both of those support the change. Further, @GerardM: it indicates that we should consult the corresponding Wikipedia page for guidance on what the most common name is. It doesn't say to use official name at all. Nikkimaria (talk) 13:18, 7 January 2017 (UTC)
Well, the Help:Label page is beginning to look a little out of date, but even Help:Label is less strict than you make it appear, to quote:
"Wikimedia page title may give orientation
To figure out the most common name, it is good practice to consult the corresponding Wikimedia project page (for example, the title of a Wikipedia article). In many cases, the best label for an item will either be the title of the corresponding page on a Wikimedia project or a variation of that title. [...]"
Brya (talk) 13:44, 7 January 2017 (UTC)
Yes. Any good reason not to do that in this case? The guidance of that page in sum supports "United States" much more strongly than it does "United States of America". Nikkimaria (talk) 17:05, 7 January 2017 (UTC)
@Brya: Nikkimaria (talk) 13:56, 8 January 2017 (UTC)

There are so many instances where the Wikipedia article is just a choice to allow for disambiguation that the notion that it is the best fit is plain wrong in practice. Thanks, GerardM (talk) 14:30, 7 January 2017 (UTC)

I suggest you raise that general point at Help talk:Label, but in this particular case that is not a concern. Nikkimaria (talk) 17:05, 7 January 2017 (UTC)
We still have United States of Brazil, so that the United States is ambiguous.--Ymblanter (talk) 20:52, 7 January 2017 (UTC)
@Ymblanter: And per Help:Label we resolve potential ambiguity by using the description, not by extending the label. Nikkimaria (talk) 13:56, 8 January 2017 (UTC)
You're right that we don't have to include disambiguating information in the label, but this isn't just extending the label, "United States of America" is a name in actual use, e.g. on coins, notes, passports, so they're both valid names. Looking at other countries, we're pretty inconsistent in whether we use official names or the common shortened version... - Nikki (talk) 15:14, 8 January 2017 (UTC)
The most common name is what we're meant to be using, which here is "United States". Nikkimaria (talk) 17:04, 8 January 2017 (UTC)
@Nikki: Nikkimaria (talk) 01:15, 10 January 2017 (UTC)

@Nikkimaria: What advantages do you expect to be gained by this change? [Also, please note the section, above, #Nikkimaria, where you were pinged.] Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:05, 7 January 2017 (UTC)

In addition to being more in tune with the guidance for labels, this change would provide advantages for the use of Wikidata on other projects. (Hm, for some reason I did not get that ping...) Nikkimaria (talk) 21:09, 7 January 2017 (UTC)
"The advantages are that it would provide advantages"? Please be more specific. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:39, 9 January 2017 (UTC)
It would ensure that data passed through from Wikidata is consistent with the terminology in use on our major English-language projects, to begin with. Nikkimaria (talk) 01:15, 10 January 2017 (UTC)
Is there any reason not to follow the guidance of Help:Label in this case? Nikkimaria (talk) 02:14, 11 January 2017 (UTC)
It is guidance and not every page and every answer fits within general guidance. Present a decent case about why the change is valuable, not ducking and weaving with indirect reference to a page. In this case, I have specifically said to you that people may wish to use the label through the modules. You have not presented a reasoned case about why your change is better, and I dispute your unsubstantiated and personal opinion that it is advantageous. We are talking about the country, not its possessive use, so please don't try the fake and vague "consistency argument." We use United Kingdom for the country, and British for the people, and so on. The country is the United States of America and there are the other regular uses United States, USA, American, ... depending on the context of the usage required; use away in the aliases.  — billinghurst sDrewth 13:14, 11 January 2017 (UTC)
(a) Yes, it is guidance, but unless we have good reason to do otherwise we should follow it. (b) And as I've told you, use of the label through the modules is a good reason to use the more common term as the label, as this will avoid having to override the value in multiple locations. "United States" is appropriate in more cases than is "United States of America", on projects that use the former, in templates or tables where space is at a premium, etc. (c) To summarize: Such a change would be in line with the guidance of Help:Label as well as both common usage and the terms in use in our major English-speaking Wikipedias. (d) What are you talking about with "possessive use"? We use "United Kingdom" for the country, not "United Kingdom of Great Britain and Northern Ireland", because although the latter is the official name the former is the common name.
So now, what is your "reasoned case" for using the less common label, despite the guidance not to and the practical implications of this choice? Nikkimaria (talk) 02:38, 12 January 2017 (UTC)
@Billinghurst: Nikkimaria (talk) 13:58, 13 January 2017 (UTC)
Many people do disagree with you. You insist on something that is imho and in the opinions of others arguably wrong and you get angry when people disagree with you. Why? Thanks, GerardM (talk) 05:38, 12 January 2017 (UTC)
I'm not angry, I'm simply asking that you and others provide a reason why following Help:Label and common usage is "arguably wrong". Nikkimaria (talk) 13:03, 12 January 2017 (UTC)
@Billinghurst: Do you have such a reason? Nikkimaria (talk) 23:52, 19 January 2017 (UTC)
@Brya, Billinghurst, Multichill, Nikki: Does anyone? Nikkimaria (talk) 00:29, 22 January 2017 (UTC)
Nobody agrees with you. Let it go. - Brya (talk) 05:44, 22 January 2017 (UTC)
@Brya: You're welcome to disagree with me, it'd just be nice if you had a good reason for doing so. Do you think Help:Label should be changed? If so, to what? If not, why not apply it here? Nikkimaria (talk) 00:36, 23 January 2017 (UTC)
Does anyone else have answers to these questions, or should we go ahead and make the change? Nikkimaria (talk) 00:51, 29 January 2017 (UTC)
No change. The good reason is "there is no consensus for the change". Can we move on yet?  — billinghurst sDrewth 10:08, 29 January 2017 (UTC)
@Billinghurts: Consensus is based on policies/guidelines/rationales, not voting. The "general guidance" in this case does not support your position, as explained above. Do you have a response to the questions to you above? Nikkimaria (talk) 13:46, 29 January 2017 (UTC)
Fix ping: @billinghurst: Nikkimaria (talk) 13:48, 29 January 2017 (UTC)
I believe that I have a pretty good knowledge, and demonstrated application, of the practice of consensus at Wikimedia. I believe that there is no consensus in this community for the change(s) that you propose. Personally, I generally try to express my point of view once, clarify later if needed, so as not to bore people with repetition, nor wish to be ignored through unnecessary dogmatism. While I am not always successful, I continue to try that as a practice, and it is my experience that it is respected and preferred.  — billinghurst sDrewth 21:43, 29 January 2017 (UTC)
@billinghurst: Well, in this particular case, I would prefer if you responded to the questions I posed above. I assure you I will not be bored by an actual rationale to not follow Help:Label or to change it. If there is no consensus in this community for that guidance page, it should be updated, lest others be misled into thinking there is. Nikkimaria (talk) 01:23, 30 January 2017 (UTC)
Hoi, Dear Nikkimaria, you have been told by multiple persons that they do not agree with your point of view. Your attitude is one where you want to force the issue. You are being aggressive and it is not appreciated. Thanks, GerardM (talk) 14:30, 29 January 2017 (UTC)
Dear @GerardM: I don't intend to force anyone to agree with my point of view, but the documentation of Help:Label should match what the community wants to implement, and it appears from the disagreement here that it does not. Nikkimaria (talk) 01:23, 30 January 2017 (UTC)
And you want to make a "should" a "must". You are repeating yourself and it is not welcome. I am particularly glad that we have no policy wonks like the ones that ossify the English Wikipedia. Thanks, GerardM (talk) 06:37, 30 January 2017 (UTC)
If you feel that way, propose changes that will have the talk conform with our practice. Do not push change where it is not wanted. Thanks, GerardM (talk) 06:41, 30 January 2017 (UTC)
@GerardM: What exactly is "our practice", in your opinion? As mentioned above, we are inconsistent in whether we use common or official name, whether we follow article names or not, etc. Nikkimaria (talk) 02:29, 31 January 2017 (UTC)
It is inconsistent yes. We do not have good technology to use appropriate labels. When we are to standardise on logos, we need the technology to manage what is opportune. We do not have it and as far as I am concerned the status quo is to be preferred for arbitrary rules. Thanks, GerardM (talk) 06:45, 31 January 2017 (UTC)
@GerardM: So, in your opinion, "our practice" is that whomever adds a label decides from their own head what they think it should be, and then it never changes? They need make no reference to any guidance page or source? Nikkimaria (talk) 02:58, 1 February 2017 (UTC)
Did you read my answer? We cannot do a proper job on labels. It is deliberately made "easy" and consequently there is no way that a policy or whatever makes sense in all cases. In recognition of this substandard situation your notions do not function as you imagine. No, I do not need a policy to tell me this. Thanks GerardM (talk) 05:58, 1 February 2017 (UTC)
@GerardM: I read your comment, and it seems to be consistent with my interpretation of it above - you think there should be no policy, and thus that whomever happens to add the label can do whatever they please. In that case you would argue that Help:Label should be deprecated? Nikkimaria (talk) 02:59, 2 February 2017 (UTC)
I personally prefer “United States of America” to differentiate ourselves from “United Mexican States,” “United States of Brazil,” etc. Bwrs (talk) 17:37, 1 February 2017 (UTC)

@Nikkimaria, Brya, GerardM, Ymblanter, Nikki, Pigsonthewing: this discussion is about the English label but FYI, in multiple other languages, the short/common/expected name is already used (at least French, German, Spanish, etc.). Cdlt, VIGNERON (talk) 18:58, 1 February 2017 (UTC)

Yes - another reason to change the English one. Nikkimaria (talk) 02:59, 2 February 2017 (UTC)
Given our policy I support changing it to the shorter name. I don't think it's likely that non-native speakers get confused by the shorter label. ChristianKl (talk) 12:11, 8 February 2017 (UTC)

You can now download a query result in SVG

Hello all,

As you may know, when you run a query on the Query Service, you can download the result in different open formats (JSON, TSV, CSV). We just added a new button that allows you to download your graphs as an SVG image! Then, with software such as Inkscape (Q8041), you can easily modify the colors of the graph or make a nice picture to tweet with your query.

Note that this feature works with all the views except table, image grid, timeline, graph builder, map, and graph. If you encounter a problem, you can add a comment on the Phabricator task.

Lea Lacroix (WMDE) (talk) 15:32, 31 January 2017 (UTC)

Thanks to the developers for this: I really like the idea of sharing, say, a bubble chart as a file in its own right, rather than doing a screen capture. Would it be a small step to do this with graph visualisations as well? Those, I find, benefit the most from clean-up in Inkscape. MartinPoulter (talk) 15:38, 7 February 2017 (UTC)
@Lea Lacroix (WMDE): Can you please file a ticket for that? --Lydia Pintscher (WMDE) (talk) 18:08, 7 February 2017 (UTC)
Done :) Lea Lacroix (WMDE) (talk) 13:16, 8 February 2017 (UTC)

Wikidata weekly summary #246


Please be careful with the last monthly task (merge via unique value constraint violations of GeoNames ID (P1566)). The bot-generated articles in cebwiki and svwiki contain a lot of mistakes, particularly in connection with GeoNames, so mergers cannot be done straightforwardly without an individual in-depth check of both items. —MisterSynergy (talk) 16:26, 7 February 2017 (UTC)

  • "Fixing a number of keyboard navigation issues based on your feedback" could you be more detailed about what changed as far as keyboard navigation goes? ChristianKl (talk) 11:37, 8 February 2017 (UTC)

APS number

This is the number ascribed to a periodical in the American Periodical Series microfilms. For example, The Knickerbocker is 949. Do we have this property? All the best: Rich Farmbrough 21:34, 7 February 2017 (UTC).

@Rich Farmbrough: A search for "P:APS" (without quotes) finds nothing. A search for "P:American Periodical Series", likewise. (That's the best pattern for finding properties.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:55, 7 February 2017 (UTC)

Language of image legends for taxa

I've added an image, with a legend, to Menida (Q4043977). For now, I've said the language of the latter is Latin, but this is not strictly correct. How can I specify "no language"?

Also, might we one day be able to italicise a legend? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:17, 8 February 2017 (UTC)

I think "mul" is the best language code to use atm. Sjoerd de Bruin (talk) 12:23, 8 February 2017 (UTC)
Help:Monolingual text languages gives a few other alternatives. How about zxx "no linguistic content, not applicable"? -- Innocent bystander (talk) 18:00, 8 February 2017 (UTC)
Thank you; that's preferable, if not perfect - I didn't know it was available to us. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:28, 8 February 2017 (UTC)

Problem with HarvestTemplates imports

Yesterday, I used my Pigsonthewing-bot account to import values for World Checklist of Selected Plant Families ID (P3591), from |wcsp= in en:Template:Taxonbar. From over 17K templates, it found only ~23 values (example1 with this target page; value fetched from en:Eithea).

Today, User:Succu reports that "I had to remove almost all imported values because they are wrong.". (See example 2 - same item as example 1.)

Is there a bug in HarvestTemplates? Or should we stop importing values from Wikipedia due to improper use of templates there? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:53, 8 February 2017 (UTC)

Better stop importing data from Wikipedia. We have now enough criticism about the low quality of WD statements (see the latest discussion on WP:EN). Snipre (talk) 17:15, 8 February 2017 (UTC)
This does not look like a technical bug in HarvestTemplates. As far as I can see there are two problems:
  • It is well-known that “Succu’s items” about taxa (I hope this is the correct term here) have high quality, which also applies to their identifiers. A simple imported from (P143) “reference” is apparently substandard in this field, but this is the best HarvestTemplates can do (and good enough for many other fields). In your given example Succu improved that with a “standard reference”. It would be advisable in my opinion to let experienced editors in a given field do the import, and if no editor touches a property for a while you can start imports as a non-expert somewhat later. Succu took notice of the property proposal procedure.
  • Besides that there were indeed many wrong values, such as in this case. These wrong values are indeed contained in enwiki, for whatever reason. In a considerable fraction of the affected cases the imported identifier is identical to the Wikidata-Qid without the Q. I have no clue what happened here. It would help to check values for correctness before an import can start, at least some of them.
I guess you can be glad that only 17 values were found. Your import could’ve been much worse. —MisterSynergy (talk) 18:31, 8 February 2017 (UTC)
(ec) I agree. Some of the values in the harvested template are complete nonsense, e.g. en:Paeonia clusii has 7124094. --Succu (talk) 18:35, 8 February 2017 (UTC)
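One of the patterns reported above, an imported identifier that is identical to the item's own Q-id without the "Q", is easy to screen for before an import starts. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is assumed to be a list of (item Q-id, harvested value) pairs, and the sample pairs are invented rather than taken from the actual import. It is not part of HarvestTemplates.

```python
def looks_like_own_qid(item_qid, value):
    """True if the harvested value equals the item's Q-id with the leading Q removed."""
    return item_qid[:1].upper() == "Q" and value.strip() == item_qid[1:]


def suspicious_imports(pairs):
    """Filter (item_qid, value) pairs down to those matching the suspicious pattern."""
    return [(qid, value) for qid, value in pairs if looks_like_own_qid(qid, value)]


# Invented sample data, not real harvested values.
candidates = [
    ("Q100", "100"),      # value repeats the Q-id: suspicious
    ("Q200", "283745"),   # unrelated number: passes this particular check
    ("Q300", " 300 "),    # same pattern with stray whitespace
]

print(suspicious_imports(candidates))
```

A check like this only catches one class of bad value; identifiers that are wrong for other reasons would still need to be compared against the external database.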

Label of Taiwan (Q865)

In light of the discussion about United States of America (Q30)'s label above, I think this is a good chance to talk about the label of Taiwan (Q865). As per numerous discussions held on English Wikipedia regarding the entry's name, it is clear that Taiwan was picked as the current article name on English Wikipedia because it is the most well-known name of the entry, although it is not precise enough. As per the discussion above, I think Wikidata has different priorities from Wikipedia and thus would favor a more precise, if more of a mouthful, name: Republic of China (Taiwan)? C933103 (talk) 10:59, 3 February 2017 (UTC)

You might want to see an unarchived User talk:Neo-Jay discussion (visit that page and then Ctrl+F type "Q865"). --Liuxinyu970226 (talk) 13:09, 3 February 2017 (UTC)
I could almost see this unsolvable mess happening when the English Wikipedia renamed the articles. Oh well. ¯\_(ツ)_/¯ Deryck Chan (talk) 11:55, 7 February 2017 (UTC)
Why is Taiwan not supposed to be precise? What other entities are also named Taiwan? ChristianKl (talk) 08:08, 9 February 2017 (UTC)
@ChristianKl: fwiw, Taiwan Island (Q22502), Taiwan Province (Q57251), Taiwan (Q229312), Republic of Taiwan (Q716489), Software performance testing (Q1982529), Pulse-frequency modulation (Q2066418), Taiwan (Q18112781), Category:Taiwan (Q7087450) --Liuxinyu970226 (talk) 09:09, 9 February 2017 (UTC)
I don't see much room for confusion in that list. ChristianKl (talk) 09:32, 9 February 2017 (UTC)

Handling mistakes in imported data

User:Pasleim/Implausible/age lists items that have mistakes in dates of birth/death. They include vandalism, typos, bad merges but also bulk data imports. For instance, Jan Nicquet (Q21522558) died before he was born and does have a (reliable) source for that.

Do we have a process for handling wrong data even if it's obvious that it's wrong? We have ranks, but how do we know that some data is not correct if we don't know what is correct? (In this case, which date should be deprecated? Or both?) Matěj Suchánek (talk) 15:59, 6 February 2017 (UTC)

Deprecate both, this particular database entry is obviously not at all reliable. For each and every claim we have to judge whether it is “correct”, and this process is not always as easy as in this case. If the source data is not as faulty as in this case and if there are no contradictory claims in other sources, we just believe what the sources are saying. For that reason we don’t collect “correct” data (we don’t know that for sure), but we collect referenceable data. —MisterSynergy (talk) 16:34, 6 February 2017 (UTC)
No, inform the source so they can fix it. That's what I've done here. In this case these claims are supported by two books ([3] & [4]). In the next couple of weeks someone will walk downstairs into the library, have a look at the source and correct it.
RKDartists ID (P650) and RKDimages (P350) are curated databases of high quality, but curated by humans so mistakes happen. The Netherlands Institute for Art History (Q758610) has been very responsive and their curators have been investigating quite a few odd things we found. Multichill (talk) 20:36, 6 February 2017 (UTC)
It's really nice to see that some data maintainers take care of their database like this. I wonder how many other data providers are that reachable and where we could keep this information. Think of a property like bug tracking system (P1401), for data donors. But there is a question whether we are able to express the relation between a piece of information, the reference and the donor. Matěj Suchánek (talk) 14:57, 7 February 2017 (UTC)
we are building relationships with GLAMs. they are unaccustomed to the quality control that a wiki can provide. they are used to paper methods. but it will be food for GLAM case study. Slowking4 (talk) 05:00, 9 February 2017 (UTC)
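The kind of plausibility check behind a list like User:Pasleim/Implausible/age can be sketched as follows. This is a hedged illustration, not Pasleim's actual method: the function name, the 130-year threshold and the sample records are all assumptions made for the example, and dates are simplified to `datetime.date`, which cannot represent the BCE or imprecise dates that Wikidata allows.

```python
from datetime import date


def implausible_lifespans(people, max_age_years=130):
    """Flag records whose death precedes birth or whose lifespan is implausibly long.

    people: iterable of (label, date_of_birth, date_of_death) tuples.
    max_age_years: assumed cut-off for a believable human lifespan.
    """
    flagged = []
    for label, born, died in people:
        if died < born:
            flagged.append((label, "died before birth"))
        elif died.year - born.year > max_age_years:
            flagged.append((label, "implausibly old"))
    return flagged


# Invented sample records.
records = [
    ("Person A", date(1600, 1, 1), date(1670, 5, 2)),
    ("Person B", date(1650, 3, 1), date(1620, 3, 1)),
    ("Person C", date(1500, 1, 1), date(1700, 1, 1)),
]

print(implausible_lifespans(records))
```

A flag only says that the pair of dates cannot both be right. As the replies above point out, deciding which date to deprecate, or whether to report the problem to the source, still takes human judgement.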

Quantity on ART UK links

Multichill (talk • contribs • logs) has asked me to seek some wider thoughts on the use of quantity (P1114) as a qualifier on Art UK artist ID (P1367), to indicate the number of paintings currently showing in the Art UK database associated with that particular Art UK identifier: see e.g. diff

I'm keen to have this because it seems to be valuable information to have in its own right; also potentially valuable for sorting and filtering; and, fundamentally, because I want to include it in a template for use on en-WP, where very often at the moment it is given as part of the text in the external links section: eg "133 Paintings by en:Titian at the Art UK site" -- giving the reader an indication of how much they're likely to find if they click the links.

Briefly, it seems to me there are two questions: where and how to add this information. (Okay maybe also a third: if we should add it, but I hope I've already spoken to that).

  • Where: To me it makes sense to add the information as a qualifier on P1367, rather than (say) on the main item, because the datum relates specifically to the number of paintings presented for the particular identifier. (Currently there are about 50 artists with two different identifiers, linking to two different pages at Art UK, with two different sets of paintings on them. It's useful to track these separately, eg to link to them separately). So to me it seems to make sense to store the information as some qualifier on the value of P1367.
  • How: Which property, in that case, to use? quantity (P1114) seemed the most generic, for a "quantity, total number, number of instances, number, amount, total" as its list of (English-language) equivalent names specifies for it -- in this case to record a "number of instances" of painting records attached to this identifier.

Of course, one could also create some special-purpose subproperty of P1114 specifically for this kind of use.

But for the moment, I'd quite like to get on, so that the data can be in place and complete and usable; and also, because I'd like to get a template rolled out using it on en-wiki.

If desired, it would seem to be a standard enough bot job at any time in the future to migrate the information from P1114 to any new more specific property that might be created for the purpose.

So I'd be grateful for quite prompt feedback. Firstly, would it be okay to finish the Quick Statements run to complete populating the P1114 qualifiers? (Currently about 8500 have been done out of a total 20,000 or so). Secondly, looking further forward, is quantity (P1114) the appropriate property to use for this qualifier, or would it make sense to create some new more-specific subproperty?

Follow-ups, please, to Property talk:P1367.

Thanks very much, Jheald (talk) 22:13, 6 February 2017 (UTC)

  • I don't think that the existing property would be understood. I would prefer to create a new property. ChristianKl (talk) 12:04, 8 February 2017 (UTC)

Update: @ChristianKl: I have proposed a new property specifically for this role, to take over from quantity (P1114), at Wikidata:Property_proposal/Authority_control#match_count. (Feel free to suggest a better name). In the meantime, I have resumed the run with Quick Statements to fill out the information using quantity (P1114) for the time being. Jheald (talk) 15:54, 9 February 2017 (UTC)
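The later migration described above (moving the count from quantity (P1114) to a more specific qualifier property once one exists) could look roughly like the sketch below. It works on a simplified, assumed claim layout in which `qualifiers` maps property IDs to lists of values; real Wikibase JSON carries more fields. The target ID `P-NEW` is a placeholder, since the proposed property had no ID at the time of this discussion, and the function name is hypothetical.

```python
import copy


def migrate_qualifier(claim, old_pid="P1114", new_pid="P-NEW"):
    """Return a copy of the claim with qualifiers moved from old_pid to new_pid.

    "P-NEW" is a placeholder for a property that did not yet exist.
    The input claim is left unmodified.
    """
    updated = copy.deepcopy(claim)
    qualifiers = updated.get("qualifiers", {})
    if old_pid in qualifiers:
        qualifiers.setdefault(new_pid, []).extend(qualifiers.pop(old_pid))
    return updated


# Simplified sample claim, using the Titian count quoted in the post above.
art_uk_claim = {
    "mainsnak": {"property": "P1367"},
    "qualifiers": {"P1114": [{"amount": "+133"}]},
}

print(migrate_qualifier(art_uk_claim))
```

Returning a copy rather than editing in place keeps the original claim available for comparison, which is convenient when a bot run has to be reviewed or reverted.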

Concern about property creation by GZWDer

@GZWDer: created Wikidata:Property_proposal/BLF ID. As it stands, there was an open and unanswered question from me about the name of the property. I generally think that it isn't good when properties get created while there are still unresolved issues on the property proposal page.

Wikidata:Property_proposal/Danish_ancient_monument ID is similar. There's concern about the stability of the ID.

As it stands I question whether it's good that GZWDer has property creation rights. ChristianKl (talk) 16:59, 7 February 2017 (UTC)

  • Sorry for not noticing the question you asked.
  1. For BLF ID, there is consensus to create this property, and the originally proposed name is valid. You may change the label per BOLD.
  2. For Danish ancient monument ID, let me ping @Alicia Fagerving (WMSE): for comment.

--GZWDer (talk) 17:11, 7 February 2017 (UTC)

I think in general the policy for property creation should be to check whether all issues brought forward on the page are resolved and not only whether there are upvotes. ChristianKl (talk) 17:42, 7 February 2017 (UTC)
I don’t think that the exact “property name” (here: English label) is relevant for the decision of the property creator whether or not a property should be created. Just go ahead and fix the label. In the other case I am not sure why Finn Årup Nielsen is concerned about “identifier” stability, but at a glance it looks as if this is a similar “identifier” as many others. —MisterSynergy (talk) 17:12, 7 February 2017 (UTC)
The creation of BLF article ID (P3595) is fine. I might have waited for Danish ancient monument ID (P3596), but the question from Finn Årup Nielsen was preceded by a "support". An unqualified rule of "check whether all issues brought forward on the page are resolved" is not sensible, as it would allow a single vexatious editor to operate an embargo. Calling for the removal of the property-creator flag, especially without prior discussion, is overkill. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:57, 7 February 2017 (UTC)

Given the circumstances I agree that it was within GZWDer's remit to decide to create these properties and calling for creator right removal is overkill. Property names are editable. If it turns out in the future that these identifiers are unstable, we can head over to Wikidata:Properties for deletion, as has been done for a number of properties which referred to external databases that had later been shut down. Deryck Chan (talk) 12:35, 8 February 2017 (UTC)

I created the proposal and was perhaps the one not replying before the creation. I have now changed the label to "BLF article ID" using my everyman's rights. And I wish to thank GZWDer (talk • contribs • logs) for being helpful with the creation. – Susanna Ånäs (Susannaanas) (talk) 09:33, 9 February 2017 (UTC)

Native label

From here:

How can I add a native label?  – The preceding unsigned comment was added by Mr. Nijwmsa Boro (talk • contribs).

Using bots to programmatically feed (new) categories

Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl

Notified participants of WikiProject Companies


I am asking for information about Wikidata policy and technical feasibility.

  • Is it possible to use bots to index information from websites programmatically to Wikidata?

This means scraping entity names and entity properties from the web and feed a collective repository.

  • What will be the source for new entities?
  • The category I would like to add is companies and the startup ecosystem, adding factual information on stakeholders (investors, people). This is public information.
  • Please instruct on policy for running bots.

Could anyone suggest P2P web crawlers?

  • Can new entities be fed in programmatically as Special: new items, and then curate their organisation?

As an example, I don't know if they could already fit in Wikidata:WikiProject Companies, and properties (e.g. investors) may be a new category.

  • How to programmatically feed new properties that may not exist in a project, or that may vary in time?

As an example: assets, number of employees.

--Micru (talk) 21:46, 24 August 2014 (UTC) Tobias1984 (talk) TomT0m (talk) Genewiki123 (talk) Emw (talk) 03:09, 9 September 2014 (UTC) —Ruud 16:15, 9 December 2014 (UTC) Emitraka (talk) 14:32, 14 October 2015 (UTC) Bovlb (talk) 19:10, 21 October 2015 (UTC) Peter F. Patel-Schneider (talk) 22:21, 23 October 2015 (UTC) ArthurPSmith (talk) 15:51, 5 November 2015 (UTC) --Daniel Mietchen (talk) 20:53, 3 January 2016 (UTC) --Harmonia Amanda (talk) 22:00, 27 February 2016 (UTC) --Lechatpito (talk) --Andrawaag (talk) 14:42, 13 April 2016 (UTC) --ChristianKl (talk) 16:22, 6 July 2016 (UTC) --Cmungall Cmungall (talk) 13:49, 8 July 2016 (UTC) Cord Wiljes (talk) 16:53, 28 September 2016 (UTC) DavRosen (talk) 23:07, 15 February 2017 (UTC)

Notified participants of WikiProject Ontology  – The preceding unsigned comment was added by Gg4u (talk • contribs) at 09:34, 9 February 2017 (UTC).

Yes, it's possible to import data with bots into Wikidata. is an example of a bot that imported a lot of biochemical data into Wikidata and does this well.
In general it's very useful to have the IDs of the database from which the data is imported. We have CrunchBase organisation ID (P2088), CrunchBase person ID (P2087), Bloomberg person ID (P3052), Bloomberg private company ID (P3377) and Angel List ID (P3276). For other databases it would make sense to create a proposal for new properties.
Information that changes over time can be qualified with start time (P580), end time (P582) and point in time (P585).
Our policy for bots is: Wikidata:Bots ChristianKl (talk) 09:53, 9 February 2017 (UTC)
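As a rough illustration of the qualifier advice above, here is a minimal sketch of how a bot might assemble a time-qualified statement before sending it to Wikidata. It only builds the data structure in plain Python and makes no API call. The property ID "P0000" and the helper names are hypothetical placeholders, and the year-precision timestamp layout is an assumption about the Wikibase JSON format that should be checked against the data model documentation before use.

```python
# Item for the proleptic Gregorian calendar, used as the calendar model.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def year_snak(prop, year):
    """Build a snak holding a time value with year precision (9)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "time",
            "value": {
                # Month and day are assumed to be ignored at year precision.
                "time": f"+{year:04d}-01-01T00:00:00Z",
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 9,
                "calendarmodel": GREGORIAN,
            },
        },
    }


def dated_quantity_claim(prop, amount, year):
    """Build a quantity statement qualified with point in time (P585)."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "type": "quantity",
                "value": {"amount": f"{amount:+d}", "unit": "1"},
            },
        },
        "qualifiers": {"P585": [year_snak("P585", year)]},
    }


# "P0000" is a hypothetical stand-in for a property such as number of employees.
claim = dated_quantity_claim("P0000", 250, 2016)
```

A value that is valid over a period rather than at one moment could carry start time (P580) and end time (P582) qualifiers built with the same `year_snak` helper.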


Boating (Q2141830) is stated as a subclass of competitive boating (Q1295912). Both are labelled, in English, as "boating". There are separate articles in French. Can a French-English speaker please provide better English labels? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:57, 6 February 2017 (UTC)

Hmm. en:Boating seems very similar to fr:Nautisme (encompassing both sport and leisure) while fr:Plaisance (loisir) seems to be purely "pleasure boating". I think the enwiki link is on the wrong wikidata item, and the label there should be changed to "pleasure boating". However, I don't know about the other wiki links. ArthurPSmith (talk) 16:51, 6 February 2017 (UTC)
@Pigsonthewing, ArthurPSmith: I propose we swap the "subclass" relation around and keep the interwiki links. Essentially Boating (Q2141830) is "pleasure / recreation / sport boating as opposed to commercial boat travel". French is the only Wikipedia that further disambiguates between "pleasure boating" (plaisance) and "competitive sport using surface vessels" (nautisme). Deryck Chan (talk) 12:51, 8 February 2017 (UTC)
I've done as I proposed above as there's no objection. I've also merged fr:Navigation maritime, which Boating (Q2141830) was a superclass of, into seamanship (Q351363). They cover very similar topics and Q3337280 only had a French Wikipedia article. Deryck Chan (talk) 14:44, 14 February 2017 (UTC)
@Deryck Chan: Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:08, 14 February 2017 (UTC)
This section was archived on a request by: Deryck Chan (talk) 10:37, 15 February 2017 (UTC)

De-Recognition of Wikimedia Hong Kong[edit]

This is an update from the Wikimedia Affiliations Committee. Translations are available.

Recognition as a Wikimedia movement affiliate — a chapter, thematic organization, or user group — is a privilege that allows an independent group to officially use the Wikimedia trademarks to further the Wikimedia mission.

The principal Wikimedia movement affiliate in the Hong Kong region is Wikimedia Hong Kong, a Wikimedia chapter recognized in 2008. As a result of Wikimedia Hong Kong’s long-standing non-compliance with reporting requirements, the Wikimedia Foundation and the Affiliations Committee have determined that Wikimedia Hong Kong’s status as a Wikimedia chapter will not be renewed after February 1, 2017.

If you have questions about what this means for the community members in your region or language areas, we have put together a basic FAQ. We also invite you to visit the main Wikimedia movement affiliates page for more information on currently active movement affiliates and more information on the Wikimedia movement affiliates system.

Posted by MediaWiki message delivery on behalf of the Affiliations Committee, 16:25, 13 February 2017 (UTC)

✓ Done marked end time (P582) = 2017-02-01
--Liuxinyu970226 (talk) 10:02, 14 February 2017 (UTC)
This section was archived on a request by: Liuxinyu970226 (talk) 15:45, 15 February 2017 (UTC)

BnF ID assistance[edit]

Can someone please help me determine the correct BnF ID for Anacreontea (Q145973)? I can get this link to work, but can't get it to work using BnF ID (P268), and the documentation for that property indicates a long history of problems like this one, with no solution given. --EncycloPetey (talk) 20:02, 13 February 2017 (UTC)

✓ Done Thanks, DeltaBot. --EncycloPetey (talk) 20:46, 13 February 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:54, 15 February 2017 (UTC)

Constraint violation warning[edit]


I just had an idea: wouldn't it be more effective if a bot left a warning message on the editor's talk page after they add a statement violating a constraint? The reason is that the constraint violation pages are a great tool, but most editors have never heard of them, so a message on the talk page would probably be more effective.

Has anyone already thought of that ? Is it technically doable and socially desirable ?

Cdlt, VIGNERON (talk) 15:39, 1 February 2017 (UTC)

No, please, no more bot-messages on user_talks than necessary! It is terribly annoying already at Commons. Some other kind of message could maybe be an option. Like being "pinged" from somewhere. -- Innocent bystander (talk) 16:14, 1 February 2017 (UTC)
For example be pinged from the item's talk page. That's a good place for this message anyway. --Denny (talk) 17:27, 1 February 2017 (UTC)
@Innocent bystander, Denny: I must say I'm *not* a big fan of automated messages. Notification is a good idea to explore. Is it doable ? Cdlt, VIGNERON (talk) 17:54, 1 February 2017 (UTC)
Please don't use Talk pages for such a purpose: it makes it impossible to tell if there is actual content on it. - Brya (talk) 06:13, 2 February 2017 (UTC)
It can be done on Property talk page. --Infovarius (talk) 13:23, 2 February 2017 (UTC)
There is at least one Property Talk page which could not bear this. - Brya (talk) 18:54, 3 February 2017 (UTC)
@Brya: which one?
@Infovarius: how? can you do a test?
Cdlt, VIGNERON (talk) 15:30, 9 February 2017 (UTC)
Property talk:P225. - Brya (talk) 05:19, 10 February 2017 (UTC)

Death date *after* certain year?[edit]

When I handle biographies which state that the death date is 'not earlier than x' (like in ru:Балясников, Василий Фёдорович/Q4077141: 'not earlier than 1916'), it is clear that I should configure date of death (P570) as in this edit: date of death (P570) = unknown with qualifier earliest date (P1319) = 1916.

What value should I set for the same qualifier when date of death is said to be 'after x' (like in ru:Бибиков, Иван Степанович / Q4086234: 'after 1906')? My programmer gut says the value is exclusive ('after' meaning 'later than given year'), so I set date of death (P570) = unknown with qualifier earliest date (P1319) = 1907. Is this the correct approach, or should I use earliest date (P1319) = 1906? --Gikü (talk) 21:08, 9 February 2017 (UTC)

I just noticed that one of the aliases for earliest date (P1319) property is later than. That means I should go with 1906, right? --Gikü (talk) 21:11, 9 February 2017 (UTC)
More general question: do we have a Manual of Style (MOS) page summarising how to represent uncertain dates? It would be a useful thing, I think, to have how to represent all the different use-cases all gathered together and summarised in a single place.
In response to Gikü's direct question, I would think that in most cases 1906 is intended to be a possibility, but what has been written has been communicated lazily. In most cases I would understand "after 1906" to mean that there was a known event that occurred in 1906 at which it is known that the subject was alive; and after that no more is known of them. Jheald (talk) 21:16, 9 February 2017 (UTC)
I think the MOS you've asked is available at Help:Dates#Inexact dates. --Gikü (talk) 21:40, 9 February 2017 (UTC)
Thanks Gikü, that's a useful link. I notice that increasingly, if a date is say "between 1623 and 1628" people seem to be using unknown, earliest date 1623, latest date 1628; rather than "1620s" (as the help page seems to suggest) with the same qualifiers.
I can see that "unknown" is clearer (even though the date is not completely unknown), but doesn't putting in "unknown" mess up sorting in reports, whereas "1620s" would still place the person or event in pretty much the right place? Jheald (talk) 21:50, 9 February 2017 (UTC)
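For readers who want to see the two options from this thread side by side, here is a minimal sketch of the statement being discussed: date of death (P570) set to "unknown value" with an earliest date (P1319) qualifier. It is plain Python that only builds a dictionary. The helper names are hypothetical, and both the "somevalue" snak type and the timestamp layout are assumptions about the Wikibase JSON format. The `inclusive` flag encodes Jheald's reading that "after 1906" most likely still allows 1906.

```python
# Item for the proleptic Gregorian calendar, used as the calendar model.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def earliest_year(stated_year, inclusive=True):
    """Turn 'after <year>' into a qualifier year.

    inclusive=True keeps the stated year (the reading suggested in the thread);
    inclusive=False applies the strict 'later than' reading.
    """
    return stated_year if inclusive else stated_year + 1


def death_not_before(year):
    """Build P570 = unknown value, qualified with earliest date (P1319)."""
    return {
        "type": "statement",
        "rank": "normal",
        # "somevalue" is assumed to be the JSON name for "unknown value".
        "mainsnak": {"snaktype": "somevalue", "property": "P570"},
        "qualifiers": {
            "P1319": [
                {
                    "snaktype": "value",
                    "property": "P1319",
                    "datavalue": {
                        "type": "time",
                        "value": {
                            "time": f"+{year:04d}-01-01T00:00:00Z",
                            "timezone": 0,
                            "before": 0,
                            "after": 0,
                            "precision": 9,  # year precision
                            "calendarmodel": GREGORIAN,
                        },
                    },
                }
            ]
        },
    }


claim_inclusive = death_not_before(earliest_year(1906))
claim_strict = death_not_before(earliest_year(1906, inclusive=False))
```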

Wayback Machine URL added to references[edit]

Hi there, I'm relatively new to Wikidata, and I got my "official" introduction by watching Asaf Bartov's talk on YouTube. I was adding references to property values by copying and pasting URLs, and I was wondering if there was a reference property that gave space to an archiveurl from the Internet Archive's Wayback Machine. Linkrot is a big enough problem for me to use it a fair amount on Wikipedia, so I think Wikidata would benefit from such a property if it doesn't already have it. Can someone enlighten me on whether Wikidata has such a property for references, and if not, how to make it? Thanks, Icebob99 (talk) 02:16, 10 February 2017 (UTC)

Hi there, @Icebob99:! You're probably looking for archive URL (P1065), with qualifier archive date (P2960). Mahir256 (talk) 03:26, 10 February 2017 (UTC)
Thanks! That's exactly it. Icebob99 (talk) 03:55, 10 February 2017 (UTC)
Good to know, I didn't know that.--Alexmar983 (talk) 05:06, 10 February 2017 (UTC)
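To make the answer above concrete, here is a minimal sketch of a reference that pairs reference URL (P854) with archive URL (P1065) and archive date (P2960). It is plain Python that only builds a dictionary. The example page address and helper names are hypothetical, the Wayback Machine address pattern is the commonly seen `web.archive.org/web/<timestamp>/<url>` form, and the reference layout is an assumption about the Wikibase JSON format.

```python
from datetime import datetime, timezone

# Item for the proleptic Gregorian calendar, used as the calendar model.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def wayback_url(original_url, captured):
    """Build a Wayback Machine address from a URL and a capture time."""
    stamp = captured.strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{stamp}/{original_url}"


def url_snak(prop, url):
    """URL-typed properties are assumed to carry a plain string datavalue."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"type": "string", "value": url},
    }


def archived_reference(original_url, captured):
    """Build one reference with P854, P1065 and P2960 snaks."""
    archive_date = {
        "snaktype": "value",
        "property": "P2960",
        "datavalue": {
            "type": "time",
            "value": {
                "time": f"+{captured:%Y-%m-%d}T00:00:00Z",
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 11,  # day precision
                "calendarmodel": GREGORIAN,
            },
        },
    }
    return {
        "snaks": {
            "P854": [url_snak("P854", original_url)],
            "P1065": [url_snak("P1065", wayback_url(original_url, captured))],
            "P2960": [archive_date],
        },
        "snaks-order": ["P854", "P1065", "P2960"],
    }


# Hypothetical source page and capture time, for illustration only.
captured = datetime(2017, 2, 10, 3, 26, tzinfo=timezone.utc)
reference = archived_reference("http://example.org/page", captured)
```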

Vandalism out of control[edit]

Can anyone remind me why are IP edits allowed? There are not enough people watching recent changes and the vandalism backlog overflows staying there forever. I have been cleaning the last day of vandalism, and it is not a pleasurable task that I plan on repeating. If the community cannot keep up with the maintenance, why are we allowing anonymous edits?--Micru (talk) 11:10, 20 January 2017 (UTC)

Wikidata:Project_chat/Archive/2016/10#Does it make sense to allow anonymous editing on Wikidata? is relevant. Matěj Suchánek (talk) 11:30, 20 January 2017 (UTC)
Yes, I see that many users oppose disabling anonymous edits, but I wonder how many of those same users patrol recent changes on a regular basis. That is a discussion that Recent Changes Patrollers should be having, but apparently there are none here in WD.--Micru (talk) 12:30, 20 January 2017 (UTC)
There are. And I believe there actually are enough to handle the incoming edits, though of course we would greatly benefit from having more patrollers, or from more patrollers using the actual "patrol" feature, so obviously correct or already reverted edits wouldn't have to be checked by others again. --YMS (talk) 10:47, 21 January 2017 (UTC)
If as you say there would be anyone watching, vandalism wouldn't have crept in so much, to the point that I have been reverted from a legit edit because I used an item whose label had been vandalized in another language, which led the user to think that I was vandalizing myself (!). On many wikis users who spend time watching RC identify themselves with Template:User wikipedia/RC Patrol (Q5654490), this is not the case of Wikidata, so it makes me very doubtful about your claims that anyone is watching at all when there have been reports of vandalism staying for months in important items. At least the first step would be recognizing that we have a problem here, and just brushing it under the carpet with the belief that something is being done (when there is no proof that it is the case), it is not going to help in gaining the trust of the Wikipedias that are already suspicious of the lack of quality of the data that is being allowed into WD. --Micru (talk) 12:12, 21 January 2017 (UTC)
I am not brushing anything under the carpet. I have been calling for help in vandalism patrol myself here and elsewhere several times before. Yes, we need people doing this. But no, it's not that we don't have them, as you are saying. Wearing a badge saying "I'm an RC patroller" doesn't help us, just like no one wearing such badges does not mean in any way that there is nobody taking care of RC. I guess I managed to patrol most of the IP and newbie changes over the last year, and while I've found a lot of vandalism that nobody found for days, weeks and months, I saw many, many more vandalism edits already reverted by dozens of fellow users, many of them just as active as I am. --YMS (talk) 13:25, 21 January 2017 (UTC)
Then what is needed is a coordinated effort, as in a RC patrol. There is no use in single users checking vandalism every once in a while on their own, not knowing what has been checked, or what is there to be checked, because it makes the whole process inefficient and full of holes where vandalism just slips away as we are seeing, plus eventually there is the danger that said users get burned out by not having support. If you are already doing the task, and know others that do it as well, what do you think about transitioning into a collaborative effort? For me (and I hope for others too), it would be very useful to be able to refer to a place where to find tools, info about patrolling, etc or other users that can help on a temporary or frequent basis.--Micru (talk) 14:01, 21 January 2017 (UTC)
+1; it was just yesterday that I proposed (on another Wikidata page) to have a Wikidata:WikiProject CVN in which we can team up, share CVN tools and also make CVN work visible to external Wikidata users such as Wikipedia communities. —MisterSynergy (talk) 14:17, 21 January 2017 (UTC)
Yes, please, go ahead and start it! Perhaps with the full name (Wikidata:WikiProject Countervandalism?), because many people don't know what CVN stands for. I just found out that there is a #cvn-wikidata IRC channel listed on meta.--Micru (talk) 14:45, 21 January 2017 (UTC)
I like the idea of a wikiproject. I would find it useful to have somewhere where people can bring attention to things that they are not sure about or don't have the time/energy to fix. It could also be a good place to document some of the problematic anonymous editors whose IP addresses change a lot. - Nikki (talk) 13:25, 27 January 2017 (UTC)
(Edit conflict) I've compiled a list of tools and links under User:YMS/RC. Otherwise I don't know what to do in terms of collaborative effort. At least for me, I cannot imagine being assigned to shifts or certain areas of work. I check RC multiple times a day and wouldn't change this e.g. if I knew who exactly is active at which exact times. --YMS (talk) 14:23, 21 January 2017 (UTC)
Well, at least we need to advertise available tools much more than we have done until now. Your page is great, but I did not know about it until yesterday. Something comparable should be available in the Wikidata namespace, as a community project. I would agree, however, that assigned shifts or similar approaches are not necessary. —MisterSynergy (talk) 14:35, 21 January 2017 (UTC)
(edit conflict)Sharing your tools and links more broadly is already a useful contribution. I don't think anybody would ask you for a deeper commitment that the one you already take. Having a group can encourage other users to participate too.--Micru (talk) 14:45, 21 January 2017 (UTC)
I agree that IP vandalism is a major problem. But as I have already noted, (maybe surprisingly) most IP edits are not vandalism, although most vandalism comes from IP and mobile edits. In my opinion the solution is to use (semi-)protection much more frequently.--Jklamo (talk) 12:35, 21 January 2017 (UTC)
The amount of vandalism is honestly excessive. MechQuester (talk) 21:41, 23 January 2017 (UTC)
@MechQuester: Where do you see this excessive vandalism? I browse unpatrolled changes or claims or terms (de/en) right now, but I don’t see that much vandalism to be honest. Since I don’t want to rule out that I use wrong filters here, I’d ask here now… Thanks, —MisterSynergy (talk) 08:41, 24 January 2017 (UTC)
I just noticed that an undo doesn't automatically mark the undone edit as patrolled. Is there a reason why this doesn't happen? It seems to me that the system could assume that any edit that gets undone by a person who can mark edits as patrolled should be marked that way. ChristianKl (talk) 10:17, 24 January 2017 (UTC)
Support indeed, this makes patrolling very tedious, because you have to visit the same diff twice: once to cancel it, then again to mark it as read :( --Hsarrazin (talk) 19:50, 25 January 2017 (UTC)
I added a phabricator task ( ChristianKl (talk) 10:50, 27 January 2017 (UTC)
I should not have said really excessive. More like, some of the vandalism, doesn't get caught at all. MechQuester (talk) 14:33, 24 January 2017 (UTC)
  • I think it would be good to start a counter-vandalism wikiproject. We could coordinate all tools in one place, have a list of people involved that could be contacted by those looking to volunteer in the area, and maybe work on developing new tools or better using the ones we have for Wikidata-specific vandalism. All good ideas. -- Ajraddatz (talk) 22:02, 30 January 2017 (UTC)
At the Dutch WP we have a feature called RTRC (Real Time Recent Changes). Maybe it could come in handy to apply that also for Wikidata and other big projects like ENWP. Q.Zanden questions? 15:27, 2 February 2017 (UTC)
PS: This is a feature from User:Krinkle and can be added to your common.js by adding this code. Q.Zanden questions? 15:32, 2 February 2017 (UTC)

My tests with "local" users show that they can take care of wikidata items and learn fast. The main problem here is the lack of an "inclusive" policy before an "exclusive" policy, I guess. On itwikipedia users keep saying that "they are experts on wikidata and vandalism is not an issue there" when this problem emerges, perhaps curbing the enthusiasm of many potentially good users who could help. But some vandalism goes undetected for a surprisingly long time, and the problem comes up quite often when asking around.

Still, the core point for me is that this is not just a "nerd game", this is content, and you need more real content-focused users from the other platforms. You don't only need efficient tools that let one user monitor 10000 OS he sometimes knows nothing about; you need 100 users who monitor 100 OS they know a lot about. Funnily enough, they are not even difficult to find with good wikimetrics, but I think a lot of users don't trust this vision.

In any case, a countervandalism project is a good idea. Let's simply try not to focus "exclusively" on tools and shortcuts, because the strength of an integrated platform is its interconnection and plurality. Then of course, those things are always fun.

Talking about shortcuts, I re-propose my suggestions that some class of users can automatically mark a thanked edit as patrolled. Like an option in the preferences to select.--Alexmar983 (talk) 06:57, 9 February 2017 (UTC)

If you want 100 people to monitor 100 changes, then we need tools to show each of those 100 people the right changes that they are well equipped to monitor. That means that it's best when changes in Italian labels and descriptions get shown to people who actually speak Italian. Better integration into the Wikipedia watchlist would also help.
Of course in addition to new tools it's also important to win new users for Wikidata. I'm not sure that saying: "We want more people who monitor our recent changes" is a good way to win content focused users. Gaining new users is likely a combination of Wikidata providing more value, the value that Wikidata provides becoming more well known (in Wikipedia and in society in general) and new users having a good editing experience.
When creating a new Wikipedia page for a fascia in the human body, I have seen it get deleted because it contained too little information. Editing on Wikidata is often more fun because it's possible to add many small items without having to have a big discussion over notability every time. ChristianKl (talk) 11:02, 11 February 2017 (UTC)
ChristianKl actually, you mainly need people to take care of the items they know. When they focus on specific subgroups of items, they do it intensively, so they don't really need to select something. This comes more naturally if they work on a platform that is strongly integrated with wikidata, so such changes affect their "real articles". The strange thing is that itwikipedia is strongly integrated but sometimes discourages users from enjoying wikidata in a creative way, creating more bottlenecks than necessary IMHO. So I introduced them to wikidata with P18 and now I still receive half a dozen emails and talk messages per month about wikidata in general, which means there is some gap I was clearly filling, no doubt. BTW this is to say that I am "neutral", I don't think all platforms should use wikidata intensively; itwiki for example was "forced" in a quite limited time to become more wikidata-centric and users were sometimes confused. I totally support that they do stuff, but not necessarily the same on every wiki. Talking about less integrated platforms, I think that you have to welcome users sometimes with candy, sometimes just by being honest about what is really necessary. But in any case if you show what wikidata can do, then the integration is quite natural. For example many people like wikidata lists, they like image maintenance, they like bot-created articles from wikidata. And they like those things especially when they are new users who see wiki more as an integrated platform, with still a lot of curiosity. They like them "actively". So I guess the future can be bright, it just needs a little bit more human connection. They come here if you come to them. Listen to what they really need, accept that they can teach you something interesting too, and you are already more than halfway to a "robust" platform, IMHO.
One thing I have noticed is that we don't give the autopatrolled right "manually" anymore; it is not even shown in the SUL summary and it is semiautomatic, I guess. IMHO in the early phases (and we are still "young"), granting AP should be a "human contact". Being pinged on the candidature page is good; more importantly, you are proposed if you did something more than indirectly connecting items, so there is a growth, an interaction. People are "proud" when they receive it... it would be much better.
And finally, some wikimetrics. On itwiki Nemo bis simply did a test 2-3 years ago with all users that had the biggest number of patrolled edits: he made a list, and I helped him contact them on their talk pages... like "you have used the undo button many times in the last month, would you like to be a patroller?". It's not automatic, there are many steps: they have to say yes, propose themselves, then we discuss if it is ok, but in the end we got 2-3 new patrollers in a few weeks, and they were good.
I would try these things before banning IPs. And in any case, can't we just have some preview option like dewiki? On some platforms it can be discouraging, especially small ones (I remember a discussion on the meta forum), but we here are very technical so maybe it works fine. We also need some good objective data to think about the possible strategies.--Alexmar983 (talk) 13:49, 11 February 2017 (UTC)
@Alexmar983:Could you list the bottlenecks that you see in a more detailed way?
I agree that a process in which autopatrolled is given by a human would be a step forward. ChristianKl (talk) 17:32, 11 February 2017 (UTC)

Copy reference from one item to the other[edit]

Hello. Is there a way to copy reference from one item to one other? Xaris333 (talk) 11:52, 5 February 2017 (UTC)

@Xaris333: Yes, there is a gadget "DuplicateReferences"  — billinghurst sDrewth 12:25, 5 February 2017 (UTC)
@billinghurst: This gadget works only inside an item, not between items (different wikidata pages). Xaris333 (talk) 12:27, 5 February 2017 (UTC)
Yep, I misunderstood your need. I haven't seen a tool like the one you desire.  — billinghurst sDrewth 12:30, 5 February 2017 (UTC)

Anyone? Xaris333 (talk) 21:11, 10 February 2017 (UTC)

Identifying a source but differently[edit]

Hoi, we have a lot of identifiers for external sources. They serve a purpose, it is good to have them. Internal to the Wikimedia projects there are projects that identify subjects that are of their interest. These subjects do not fit any known criteria but the interest of their group. One of these groups wants to maintain a list of artists that are "of the African diaspora". Claiming any criteria but the interest of this group is problematic for obvious reasons.

The idea is to identify them all as being part of this Wikimedia project. This can be done in a variety of ways but realistically they just want to identify items as being of interest to them. In effect they will work on the Wikidata items and seek to complement the available data. Their purpose is to query Wikidata and use it as well for work in Wikipedia, Commons ... whatever. The benefit they gain is that Wikidata queries are real time. Update items and the result in queries is different. They only have to add data once and that is it.

My proposal is to allow for identifiers to these projects and treat them as a source. The projects are established projects and they seek to use Wikidata for their purposes and will work towards high quality Wikidata data. I intend to help them with this; there are over 900 items involved and it is an expanding list. My reason for doing this: it gives Wikidata a new purpose. Thanks, GerardM (talk) 10:35, 9 February 2017 (UTC)

Why not do something similar to en-wiki, and add a wiki-project banner template to the item talk page, with code that adds it to a category?
I suppose the difficulty is to combine such category information with eg SPARQL, to limit maintenance queries to their items of interest. But I think Magnus has tools that can do category + query combination, doesn't he? Jheald (talk) 11:00, 9 February 2017 (UTC)
Perhaps we could ask the WDQS developers to consider creating a preprocessor SERVICE in queries, that could take a category specification and would replace it with a VALUES list of items? I think that could have some quite general usefulness. Jheald (talk) 11:08, 9 February 2017 (UTC)
I created a Phabricator ticket for the suggestion. Jheald (talk) 12:46, 9 February 2017 (UTC)
We have imported from (P143)...? Gerard, I'm not sure what you mean by "Identifiers". Are they properties or items or something else? Deryck Chan (talk) 11:19, 9 February 2017 (UTC)
  • Not in favour of any item labelling for particular work. Each group has to use the common statements and perform some multi-criteria queries to extract items of interest. If we start to allow groups to tag items for particular purpose, we are dead. People have to learn how to use queries to extract the data they want instead of putting personal identifiers on items they want to improve, that's the way a database is used. Snipre (talk) 12:19, 9 February 2017 (UTC)
  • I tend to agree with Snipre. In most of the wikiprojects information.... data can be queried and lists obtained. This is exactly why a database like Wikidata is for. It does not make much sense creating properties such as P9999 ("related to wikiproject X") -> "Wikiproject:French literature" (if that is what GerardM meant, I did not understand the "identifiers" (?) and "sources" (?) part) when they should be able to identify these articles through standard queries ("writers writing in French" or "books written in French" or "french critics studying a "subclass of" or "facet of" French literature"...). Yes, storing data on "skin colour" is more troublesome, but... a property such as this would be a way to circumvent that 'troublesomeness' through "wikiproject common sense"/"personal tagging". This is just another way to say "John Doe is somehow pretty dark skinned or mulatto or ... and he does not live in Africa" not using a so called "skincolour property". Are there many examples of Wikiprojects in which querying lists is not easy? Or are they a minority? Strakhov (talk) 13:05, 9 February 2017 (UTC)
Obviously. It is important to include statements to the best effect. What you deny is that not all the content can be cleanly defined by statements and found by queries. When you have a random amount of data, an identifier that works like a source statement can be ignored for practical purposes but it allows people to do their project work. So YES, we want their statements and the point is that they WILL update the data because they use queries to determine how they are doing. So EXACTLY by allowing identifying a random grouping you will see that the quality for these items will be highly improved. Thanks, GerardM (talk) 14:29, 9 February 2017 (UTC)
If I understand what GerardM is proposing, I still don't see any purpose in "identifiers" for this - why not propose a property "of interest to" which would be set to an item for the wikiproject or other representation of the group of interest. ArthurPSmith (talk) 17:22, 9 February 2017 (UTC)
Because it is of little interest to most people to whom it is interesting. It is as interesting as knowing what other sources have information about the same subject. Not. But it makes it possible to do the nifty things that query enable. Thanks, GerardM (talk) 18:33, 9 February 2017 (UTC)
  • As far as I understand, the problem is that the group is interested in people of the African diaspora but we don't have a property that says which people belong to the African diaspora. Additionally, the group doesn't have a separate website that could be referenced with an external ID.
In theory, I'm interested in making it easy for groups to cooperate with Wikidata, in this case, I'm however doubtful. If we had a property "related to Wikiproject:African Diaspora" we would likely get a lot of unsourced claims that suggest that certain people are part of that Wikiproject and thus part of the African Diaspora. Given that information about race is sensitive personal data this seems problematic to me. ChristianKl (talk) 22:04, 10 February 2017 (UTC)
It is a list that matters to them. Exactly because race is sensitive I tend to ignore it. This way of working helps because it is for them to decide who is included and who is not. Our gain is that they maintain the associated items so they can query Wikidata. Thanks, GerardM (talk) 08:24, 11 February 2017 (UTC)
Do you want to maintain items to query them later in another context, or do you want to query items solely for maintenance purposes? In both cases you could manually set up a list of items in scope of this project and provide links for convenient access. It should even be possible to create a specialized tool at Tool Labs with something like a project-wide watchlist. This would be a little more complex to set up than a direct solution with a claim in the affected items, but it allows us to continue separation of internal and external identifiers. —MisterSynergy (talk) 08:45, 11 February 2017 (UTC)
It's problematic if some institution can decide who counts as African diaspora in a way that they can mark it inside Wikidata without any checks. When Wikidata hosts the list it becomes responsible for it. Especially when it hosts it as the primary venue and we do more than hosting an external ID to another database. ChristianKl (talk) 10:22, 11 February 2017 (UTC)
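As a side note on the VALUES idea raised earlier in this thread: until such a preprocessor exists, a project could keep its own list of item IDs outside Wikidata and paste it into a query as a VALUES clause. The sketch below only builds the query text in plain Python and sends nothing to the query service. The function names are hypothetical, the two item IDs are arbitrary examples taken from elsewhere on this page, and the missing-image check via P18 is just one possible maintenance question.

```python
import re

# Accept only well-formed item IDs so nothing else can leak into the query text.
QID = re.compile(r"Q[1-9]\d*")


def values_clause(qids, var="item"):
    """Return a SPARQL VALUES clause binding ?var to the given items."""
    bad = [q for q in qids if not QID.fullmatch(q)]
    if bad:
        raise ValueError(f"not valid item IDs: {bad}")
    return "VALUES ?%s { %s }" % (var, " ".join("wd:" + q for q in qids))


def missing_image_query(qids):
    """Build a query listing which of the project's items lack an image (P18)."""
    return "\n".join(
        [
            "SELECT ?item ?itemLabel WHERE {",
            "  " + values_clause(qids),
            "  FILTER NOT EXISTS { ?item wdt:P18 ?image }",
            '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
            "}",
        ]
    )


# A project-maintained list; in practice this could come from a wiki page or category.
project_items = ["Q145973", "Q2141830"]
query = missing_image_query(project_items)
```

The resulting text can be pasted into the query service by hand. Keeping the list outside the items themselves matches the separation of internal and external identifiers that MisterSynergy mentions.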

Module:Wikidata updated[edit]

Hello! I have just updated our local Module:Wikidata, so project pages embedding Wikidata content (eg. those with many {{Q}} templates) now load faster and use less memory. Previously, even for simple label lookups, the whole item had to be loaded to memory. If you are curious, the memory usage on Property talk:P131 almost halved and the loading time also went down a bit. I did my best to make the change only internal, the output shouldn't have changed. If you see some problems, please report them. I believe this change has broken previous limits and made showcasing content on Wikidata easier and more friendly. Matěj Suchánek (talk) 21:36, 10 February 2017 (UTC)

@Matěj Suchánek: I am presuming (though just checking) that it is recommended for the sister wikis to update to this updated version — well those that have not further customised theirs?  — billinghurst sDrewth 11:20, 11 February 2017 (UTC)
As a note, if we are updating these base modules, it would be great if we could consider ensuring that the update is propagated through the wikis. If nothing else it should be in the weekly newsletter, though ultimately I would love to see an opt-in process for the wikis to have updates pushed out by a bot.  — billinghurst sDrewth 11:23, 11 February 2017 (UTC)
I was going to add it to the newsletter but forgot. Updating by a bot sounds interesting, though some wikis may use a slightly modified version which the bot would overwrite. Matěj Suchánek (talk) 11:42, 11 February 2017 (UTC)
@Matěj Suchánek: which is why I suggest an opt-in approach, or guided opt-out: 1) it encourages minor improvements to be written into the master; 2) we have a true master; 3) the small wikis are better supported, especially as they struggle the most for numbers, knowledge and skill; 4) we could hang it off something similar to what @Yurik: is exploring for mw:Template:Graph series. As an aside, have we ever analysed how many wikis have the modules, and their level of currency?  — billinghurst sDrewth 12:40, 11 February 2017 (UTC)
I believe there shouldn't be any problems. I have already noticed a small regression: looking up labels from redirect items doesn't work, but it shouldn't be a problem on clients. Matěj Suchánek (talk) 11:42, 11 February 2017 (UTC)
On svwiki we have a module that has been developed locally. That module has migrated far from the one you use on other projects. That does not prevent us from having a second module that could be bot-updated regularly. @Larske, Ainali: and others. -- Innocent bystander (talk) 13:23, 11 February 2017 (UTC)

Question. Procedure

You see a user manually and systematically adding Wikipedia URLs as reference URL (P854) values, sourcing statements that way. What would you say to him/her?

  • 1) Nothing. Pass.
  • 2) Stop, at the very least you should add a stable link such as
  • 3) Stop, you should not use reference URL (P854) but imported from (P143)
  • 4) Don't waste your time adding references 'manually' to wikipedias because their value is next to zero.

I think 1) and 2) are not OK, since they make it more difficult to re-use the data in Wikipedias, but I'd like a second opinion. Strakhov (talk) 23:20, 10 February 2017 (UTC)

Just my two cents here, but I think 3) is the best option, provided that it is conveyed in a very courteous and good-willed manner. imported from (P143) is the way to go when transferring information from Wikipedia, in my understanding. The recent Wikidata intro talk streamed by Wikimedia on YouTube discussed this topic, suggesting a method of referencing akin to 3). The rationale was that Wikipedia is a great source, but just as Wikipedia can't cite itself, Wikidata can't cite Wikipedia. With all that being said, the user is moving in the right direction and almost certainly acting in good faith, since they are adding references, just with the wrong property. Icebob99 (talk) 04:31, 11 February 2017 (UTC)
Thanks for the feedback! :) Strakhov (talk) 13:24, 11 February 2017 (UTC)
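
For anyone who wants to find such references in bulk, the check discussed above could be sketched roughly as follows. This is only an illustration, assuming that references are available as a plain mapping from property ID to a single value; the function names and the sample data are made up for the example and are not part of any existing tool.

```python
from urllib.parse import urlparse

def is_wikipedia_url(url):
    """Return True if the URL's host is wikipedia.org or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")

def references_to_review(references):
    """Return the references whose reference URL (P854) points at a Wikipedia.

    `references` is a hypothetical, simplified list of dicts such as
    {"P854": "https://..."}; it is not the real Wikibase JSON layout.
    """
    return [ref for ref in references if is_wikipedia_url(ref.get("P854", ""))]

# Made-up sample data for illustration only.
sample = [
    {"P854": "https://en.wikipedia.org/wiki/Example"},
    {"P854": "https://example.org/source"},
    {"P143": "Q328"},  # already uses imported from (P143), nothing to review
]
flagged = references_to_review(sample)
```

The flagged references are the ones a reviewer might suggest converting to imported from (P143); the conversion itself is left to a human or a separate tool.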

WikiCite 2017 applications open through February 27

We just announced that applications for WikiCite 2017 (Vienna 23-25 May, 2017) are open until February 27, 2017. WikiCite 2017 is a 3-day conference, summit and hack day to be hosted in Vienna, Austria, on May 23-25, 2017. It expands efforts started last year with WikiCite 2016 to design a central bibliographic repository, as well as tools and strategies to improve information quality and verifiability in Wikimedia projects. Our goal is to bring together Wikimedia contributors, data modelers, information and library science experts, software engineers, designers and academic researchers who have experience working with citations and bibliographic data in Wikipedia, Wikidata and other Wikimedia projects. For this initiative to be successful, it is critical to get members from the core Wikidata community involved, and it would be fantastic to see you in Vienna. Thanks to generous funding from a number of organizations, we have (limited) travel funding available; if you're interested in participating, please consider submitting an application. This year's event will be held at the same venue as the Wikimedia Hackathon and we'll be able to accommodate up to 100 participants. Any questions? Get in touch with the organizers at: --DarTar (talk) 18:31, 11 February 2017 (UTC)

Lost paintings

How should lost paintings be listed? They appear here: Wikidata:WikiProject sum of all paintings/Missing collection; but when they are destroyed they will never have a collection. --Carl Ha (talk) 07:35, 10 February 2017 (UTC)

How about using "unknown value"? ChristianKl (talk) 09:10, 10 February 2017 (UTC)
"Unknown" would fit for paintings which disappeared (e.g. stolen); if they are destroyed, better use "no value". Ahoerstemeier (talk) 10:20, 10 February 2017 (UTC)
And in cases where we know when and where an artwork was destroyed? Last known collection with an end-date? --HHill (talk) 14:39, 10 February 2017 (UTC)
I would add end time (P582) on the item level with the date the artwork was destroyed. ChristianKl (talk) 15:19, 10 February 2017 (UTC)
And for now-lost manuscripts which once had a shelfmark in a collection, I would still add that information, as that is the identifier they are still cited by (cf. e.g. the manuscripts from the municipal library in Strasbourg destroyed in 1870). --HHill (talk) 15:40, 10 February 2017 (UTC)
lost artwork (Q4140840) or destroyed artwork (Q21745157)? Andreasm háblame / just talk to me 04:23, 12 February 2017 (UTC)
@Carl Ha: Wikidata:WikiProject sum of all paintings/Missing collection has all paintings that don't have collection (P195) set. That doesn't have to be now, but can also be in the history. You just have to qualify it with start time (P580) / end time (P582) / point in time (P585). We aim to have full provenance, so every collection since the painting was made. You can use significant event (P793) to model things like a theft or destruction. Please have a look at Wikidata:WikiProject Visual arts/Item structure for more information about how to model artworks. Multichill (talk) 17:17, 12 February 2017 (UTC)
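
To illustrate the modelling described above, here is a rough sketch of why a destroyed painting with a historical collection (P195) statement would no longer count as "missing collection". The data structure is a deliberately simplified, hypothetical stand-in for an item's statements (not the real Wikibase JSON), and the values such as `Q_EXAMPLE_MUSEUM` and the years are placeholders, not real items or dates.

```python
def make_statement(value, qualifiers=None):
    """Build a simplified statement: a value plus optional qualifiers."""
    return {"value": value, "qualifiers": qualifiers or {}}

def is_missing_collection(item):
    """True if the item has no collection (P195) statement, current or historical."""
    return len(item.get("P195", [])) == 0

def current_collections(item):
    """Collections whose statement carries no end time (P582) qualifier."""
    return [
        s["value"]
        for s in item.get("P195", [])
        if "P582" not in s["qualifiers"]
    ]

# A destroyed painting: last known collection with start/end time qualifiers,
# plus a significant event (P793) dated with point in time (P585).
destroyed_painting = {
    "P195": [make_statement("Q_EXAMPLE_MUSEUM", {"P580": "1850", "P582": "1945"})],
    "P793": [make_statement("Q_EXAMPLE_DESTRUCTION", {"P585": "1945"})],
}

# A painting with no provenance modelled at all.
unmodelled_painting = {}
```

Under these assumptions the destroyed painting has a collection history but no current collection, while only the unmodelled painting would still show up as missing a collection.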

Stop everything!

This summit has both a Peakbagger mountain ID (P3109) and a peakware mountain ID (P3513). You can help lesser-known peaks get their own.

Hello. Please consider stopping everything you are doing right now and joining the ongoing effort to enhance our coverage of mountain (Q8502) in our database! (Q28736250) has just donated 67,591 IDs that I have immediately uploaded into Mix'n'Match (Q28054658). The result is this page, where you can help deploy the IDs on our items dedicated to summits through Peakbagger mountain ID (P3109). There is also, by the way, a new catalog for the much smaller Peakware (Q28493740) database, that you can use here to import data through our new peakware mountain ID (P3513). All this will lead to actual improvements on Wikipedias, as templates such as Template:Geographical links (Q28528875), that automatically display the external links on articles, are currently being adopted quite seriously, at least on the French version. So yes, please, stop everything and come to the mountains, where you'll reach a higher perspective on Wikidata! Thierry Caro (talk) 08:28, 11 February 2017 (UTC)

I did a few, but it will take a lot of time ... How was the automatic matching done? Did Peakbagger donate the coordinates corresponding to the IDs? It would be possible to match a lot faster by mapping all this data. Koxinga (talk) 18:15, 11 February 2017 (UTC)
I'd agree - this would be much easier to work with if the summary data in mix-and-match had coordinates, or the country it's in, or something similar. You're going to get a lot of well-meaning mismatches otherwise - there are 19 Brown Mountains, 13 Smith Mountains, etc. Is it possible to extract this from the DB and update the entries in mix-and-match? Andrew Gray (talk) 17:37, 12 February 2017 (UTC)
It's still much, much faster than without any catalog to work with. But it may be worth asking for more data, as you said. Let me come back to you. Thierry Caro (talk) 17:56, 12 February 2017 (UTC)
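
If coordinates do become available, the disambiguation suggested above could work roughly like the sketch below: only accept a match when the name agrees and exactly one candidate item lies within a small distance. Everything here is an assumption for illustration: the field names, the 2 km threshold, and the sample entries and coordinates are invented, and this is not how Mix'n'Match itself does its matching.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_entries(catalog, items, max_km=2.0):
    """Map catalog IDs to item IDs when exactly one same-named item is nearby."""
    matches = {}
    for entry in catalog:
        candidates = [
            it for it in items
            if it["label"].lower() == entry["name"].lower()
            and haversine_km(entry["lat"], entry["lon"], it["lat"], it["lon"]) <= max_km
        ]
        if len(candidates) == 1:  # skip ambiguous or unmatched entries
            matches[entry["id"]] = candidates[0]["qid"]
    return matches

# Invented sample data: two same-named summits far apart, one catalog entry near the first.
items = [
    {"qid": "Q_A", "label": "Brown Mountain", "lat": 35.0, "lon": -82.0},
    {"qid": "Q_B", "label": "Brown Mountain", "lat": 44.0, "lon": -71.0},
]
catalog = [
    {"id": "pb-1", "name": "Brown Mountain", "lat": 35.005, "lon": -82.003},
    {"id": "pb-2", "name": "Smith Mountain", "lat": 40.0, "lon": -100.0},
]
result = match_entries(catalog, items)
```

Entries with zero or several nearby candidates are deliberately left unmatched so that a human can still review them by hand.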