Wikidata:Project chat/Archive/2018/05

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Please add neutral sex (Q52261234) to sex or gender (P21) constraints.

Please add neutral sex (Q52261234) to sex or gender (P21) constraints. – The preceding unsigned comment was added by 2001:2d8:e089:bbf5::2dc0:c0a5 (talk • contribs) at 29. 4. 2018, 11:36 (UTC).

Done Wikidata:Database reports/Constraint violations/P21 – The preceding unsigned comment was added by Urirnal (talk • contribs) at 29. 4. 2018, 15:03 (UTC).

This section was archived on a request by: Matěj Suchánek (talk) 08:06, 3 May 2018 (UTC)

Merge Heinz Heinrich Gerhard Pirang 1876-1936

Heinz Heinrich Gerhard Pirang (Q28935260) 1876-1936 = Heinz Pirang (Q16357570) 77.179.61.171 01:45, 5 May 2018 (UTC)

done --Tagishsimon (talk) 01:54, 5 May 2018 (UTC)

Tagishsimon - thank you. 77.180.81.191 16:58, 5 May 2018 (UTC)

This section was archived on a request by: Tagishsimon (talk) 21:53, 5 May 2018 (UTC)

Merge Johann (Iwan) von Brevern 1812/1813-1885

Johann von Brevern (Q48840853) 1813-1885 = Ivan von Brevern (Q16403280) 1812-1885 77.180.81.191 16:57, 5 May 2018 (UTC)

Has been done. --Tagishsimon (talk) 21:50, 5 May 2018 (UTC)

This section was archived on a request by: Tagishsimon (talk) 21:52, 5 May 2018 (UTC)

Merge Carl Wilhelm Cruse

Q16360865 = Q52705607 80.171.248.200 05:19, 6 May 2018 (UTC)

Done --Tagishsimon (talk) 05:27, 6 May 2018 (UTC)

This section was archived on a request by: Tagishsimon (talk) 05:27, 6 May 2018 (UTC)

Fusion of Q666112 and Q39314414

I tried to merge these two items but the merge tool didn't let me. I thought they were the same, but a user mentioned in the history of one of the items that it was a recent split, so I'm no longer sure about it. Can someone explain to me what the criteria are here and/or sort out this mess? - Sarilho1 (talk) 20:52, 17 April 2018 (UTC)

You are trying to merge a Latin script name with a Cyrillic script name, they are different. Sjoerd de Bruin (talk) 21:01, 17 April 2018 (UTC)

Oh, thanks. I thought that the script didn't matter. Good to know, then. Are the interwikis then handled as different articles? - Sarilho1 (talk) 21:40, 17 April 2018 (UTC)

Interwikis are handled as separate. Seems like a very poor solution. --Tagishsimon (talk) 21:55, 17 April 2018 (UTC)

Interwikis are handled "directly" as separate. But from a Wikipedia, it is quite possible to use a Lua template to link all said to be the same as (P460) items.

This solution was chosen, after many discussions, by the Names project, to avoid the pre-eminence of the English language over others, and also to avoid the Latin alphabet being the center of the system. This way, if a person is Russian, he will use Борис, and if the person is British, he will use Boris. It is the same for Hebrew, Chinese, Japanese and Korean names, etc. There are still a lot of past errors to clean up, but this is the idea.

For more info on that, you may want to read the discussions on Wikidata:WikiProject_Names, and even participate there :)

@Harmonia Amanda: who may explain it better than me...--Hsarrazin (talk) 09:15, 18 April 2018 (UTC)

@Tagishsimon, Hsarrazin: Maybe this is yet another reason why every Wikidata user should support the way to fix support for Incubator sites: drop the unfair "one sitelink per wiki" limitation. --Liuxinyu970226 (talk) 10:57, 18 April 2018 (UTC)

It's strange that we are keeping these as separate items. There are no interwiki overlaps and all the sitelinks I can read consider the two names in the same article. The current item split is making Wikipedia articles from different sites more difficult to link together. Just to make the current situation more absurd, the Armenian sitelink falls into the Latin item even though it's yet another script. Can we instead accept that this is one male given name with two native forms and merge them? Deryck Chan (talk) 16:23, 23 April 2018 (UTC)
- And merge them with Q20682286 and Q19413460 as well, as there are no conflicting sitelinks?
  --- Jura 16:34, 23 April 2018 (UTC)
  - I support merging Q20682286 also, because the Polish sitelink lists both "Boris" and "Borys" spellings. I don't mind Q19413460 since it has a different spelling and no sitelinks. Deryck Chan (talk) 11:26, 24 April 2018 (UTC)
    - @Deryck Chan: So in your opinion, we should also consider merging Wu (Q11010927) and Wu (Q845163) because both refer to the same surname "吳/吴"? --Liuxinyu970226 (talk) 09:24, 25 April 2018 (UTC)
      - @Liuxinyu970226: No, because Wu (Q845163) and Wu (Q2652694) are organised by Chinese surname origin and form minimal pairs in zh.wp and yue.wp, while Wu (Q11010927) merges all surnames that are romanised as Wu. Notice how nl:Wú belongs to Wu (Q845163) rather than Wu (Q11010927), for example. In contrast, you can see that there is a single item for Zhuang (Q1594283) covering multiple romanized spellings, because there is only one extant Chinese surname that is romanized Zhuang or Chng. Deryck Chan (talk) 09:39, 25 April 2018 (UTC)

So how to describe such differences?

Just say "different" via descriptions? Create a dummy item to say that two surnames in different scripts can be translated to the same name in one language? Wait for phab:T54971? --Liuxinyu970226 (talk) 14:36, 1 May 2018 (UTC)

List of articles that exist in many Wikipedia language versions but not in a specified language version

Hello, I noticed that many language versions of Wikipedia have something like w:Wikipedia:Articles in many other languages but not on English Wikipedia. In the German Wikipedia (my home WP) such a list doesn't exist: a list of WP articles that exist in let's say at least 50 language versions but not in the German. I was wondering how it's being made. Stilfehler (talk) 15:31, 30 April 2018 (UTC)

@Stilfehler: You might want to ask at wikidata:Request a query, maybe they can help you out with some sort of query. Q.Zanden questions? 10:58, 1 May 2018 (UTC)

Can pick this up. Will continue at dewiki --Edgars2007 (talk) 11:20, 1 May 2018 (UTC)

Thank you so much! @Edgars2007: please keep me posted. I suggested to my German fellow wikipedians to have a full equivalent to w:Wikipedia:Articles in many other languages but not on English Wikipedia [1], but the first step may be having one in my user pages. Stilfehler (talk) 14:07, 1 May 2018 (UTC)
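The thread does not say how the English Wikipedia list is actually generated, so the following is only a minimal sketch of the filtering step, assuming the sitelinks per item have already been fetched (for example via a query or a dump). The function name `missing_in_language`, the placeholder item keys and the sample wiki codes are all hypothetical.

```python
def missing_in_language(sitelinks_by_item, target_wiki, min_wikis=50):
    """Return (item, sitelink_count) pairs for items that have at least
    min_wikis sitelinks but none on target_wiki, most-linked first."""
    rows = []
    for item, wikis in sitelinks_by_item.items():
        wikis = set(wikis)  # ignore accidental duplicates
        if target_wiki not in wikis and len(wikis) >= min_wikis:
            rows.append((item, len(wikis)))
    # Sort by descending sitelink count, then by item key for stable output.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical sample data; real input would come from Wikidata.
sample = {
    "item-a": ["enwiki", "dewiki", "frwiki"],
    "item-b": ["enwiki", "frwiki", "plwiki"],
    "item-c": ["enwiki"],
    "item-d": ["frwiki", "plwiki"],
}
candidates = missing_in_language(sample, "dewiki", min_wikis=2)
for item, count in candidates:
    print(item, count)
```

For the list described above, the call would use `target_wiki="dewiki"` and keep the default `min_wikis=50`.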

Wrong place?

Can anyone have a look at Wikidata:Property proposal/Norwegian war sailor register ship ID? I think I have misplaced this proposal. Pmt (talk) 10:48, 1 May 2018 (UTC)

Mapping the open movement on Wikidata

Hi all

I’m working on a hackathon/sprint project which will try and map the open movement on Wikidata, this will include organisations, communities, people and projects in open source software, open knowledge, open access publishing, open data, open hardware, open science and open rights.

A couple of questions:

What’s missing?: What kind of things within the open movement am I missing from the list above? Is 'open movement' the best term to use?
Queries: What queries could be run to get some nice graphics to show people to encourage people to take part?
Metrics: How could I measure who is taking part and what they have done? A lot of the people will be new to Wikidata and I’m worried adding a lot of additional steps will scare people away. My best guess is to ask them to sign up to a Programs and Events dashboard project? Measurement is not super high on my priority list as I’m doing this for fun, I’d rather people take part and I never knew than they don’t take part.

Any additional thoughts or links to things to read would be very much appreciated.

Thanks very much

--John Cummings (talk) 11:42, 28 April 2018 (UTC)

Which hackathon is that? I'm about to submit a Mozilla Sprint project that aims to do something similar for open science, so perhaps there is room for interaction. --Daniel Mietchen (talk) 22:00, 28 April 2018 (UTC)

@Daniel Mietchen: ha ha, it's also a Mozilla sprint project :) --John Cummings (talk) 22:40, 28 April 2018 (UTC)

@John Cummings: My submission sits here. Let me know if that resonates with yours. --Daniel Mietchen (talk) 04:30, 29 April 2018 (UTC)

@Daniel Mietchen: here's mine, nice to know we aren't trying to do exactly the same thing :) Still struggling with markdown a bit and need to create some pages but it's on its way. --John Cummings (talk) 08:47, 29 April 2018 (UTC)

@Daniel Mietchen: I created Wikidata:WikiProject Open as part of it. It's written with new contributors in mind, with hopefully clear steps. --John Cummings (talk) 11:55, 1 May 2018 (UTC)

@John Cummings: OK, I put in a ticket to our repo that could serve as a basis for interaction. --Daniel Mietchen (talk) 12:47, 1 May 2018 (UTC)

@Daniel Mietchen: 👍👍. --John Cummings (talk) 22:22, 1 May 2018 (UTC)

Process questions

I'm not a prolific contributor to wikidata but thought I would begin adding some items related to a subject I care about – women's basketball teams (specifically Division I teams)

I started working through a list of such teams, opening the Wikidata item to see what was already there and, if missing:

Added country - by adding the name of the country
Added sport - by adding the name of the sport
Added official website - by adding the url of the website
Added topic's main category - by adding the name of the topic's main category
Added home venue - by adding the name of the home venue

Veteran wikidata editors will realize this approach was flawed. For many properties, the correct value to enter is the value, but it appears categories are an exception — you have to add the value prepended by "cat:".

I have two concerns.

The first is that, when designing a system, especially a system designed for users who are not experts, and there is a straightforward obvious way to implement something, as well as a less obvious way to implement something, one ought to choose the obvious way whenever possible. There are, of course, situations where the obvious way won't work and cannot be used. In such cases, a well-designed system will recognize the potential problem, and attempt to alert users trying to use the obvious method that it will not work and ideally inform them of the issue and the way to do it correctly.

I don't know enough about how the underlying system works to know why the decision was made to adopt the nonobvious approach, but I'll assume there was a good reason. However, I note that the system did not follow up with an attempt to detect the obvious but incorrect option and let the user know. Is there a reason this is impossible?

My second concern relates to how expert reviewers address such errors. In most Wikimedia projects, a user making a one-off error may or may not get a specific response on their talk page, but if the user makes dozens of errors, someone will contact the user to let them know that their edits are not following the non-conventional protocol. Has Wikidata adopted a different convention? Is it truly considered acceptable protocol to simply revert such errors without informing the editor? Does anyone see that this is an unbelievable waste of time of both the editor and the reviewer?--Sphilbrick (talk) 13:59, 30 April 2018 (UTC)

It is actually a good idea. Is there any way we can enforce constraints that properties whose value can only be a category do not accept values which are not categories?--Ymblanter (talk) 19:53, 30 April 2018 (UTC)

It is interesting to note the Commons categories, such as:

UCLA Bruins women's basketball (Q8852098)

as used in:

UCLA Bruins women's basketball (Q7864012)

Do not require the prepending of "category:"

Why the different handling?

Why can't Wikipedia categories be handled the same way as Commons categories?--Sphilbrick (talk) 19:48, 1 May 2018 (UTC)

In sitelinks they are handled exactly the same: you do need to prepend "Category:" for Commons categories. There's also the Commons category (P373) value which can be set for Commons categories only, and where the prefix isn't used because it's already implied by the property. Ghouston (talk) 02:11, 2 May 2018 (UTC)

Jean de Saint Cyr (Q52443246)

I just created the above, and have never created previously on Wikidata. There is an existing article on en Wikipedia for Jean de Saint Cyr. So far, the Wikipedia article does not show up here, and it won't let me enter the article name. Wikidata has not linked itself to the English article. Is this just an issue of waiting for the two sites to link? Maile66 (talk) 20:42, 1 May 2018 (UTC)

You should go to the Wikipedia box, edit there, and choose English Wikipedia and the article name on Wikipedia. Preferably also add statements. Pmt (talk) 21:24, 1 May 2018 (UTC)

@Maile66: I've done the necessary. As Pmt says, you need to go to the item record you created, look for the Wikipedia box, hit edit, enter en as the language and then type in the article name ... and confirm the article name when the user interface offers it up to you. I've also added a bunch of claims for the individual. --Tagishsimon (talk) 21:31, 1 May 2018 (UTC)

Yes, thank you. I had previously tried editing in the Box Wikipedia. It would let me enter the information, but would not let me publish what I had entered. I'm wondering if that might be that I have edited so seldom directly on Wikidata, that perhaps I'm not autoconfirmed here. Or something of that nature. I'm an admin over at en Wikipedia, but that privilege doesn't carry from one Wiki site to another. Maile66 (talk) 21:36, 1 May 2018 (UTC)

Learning by doing is a nice approach.... @Maile66: What Message did you receive when trying to publish? Pmt (talk) 21:44, 1 May 2018 (UTC)

@Maile66: (EC) That's not it ... possibly your browser is not allowing the Wikidata UI to work properly ... not sure if it is JavaScript, or what. When you typed in en, did it offer you a bold en? Ditto the name - was a bold version of the name offered? I had exactly this issue when I first started to edit, but it's now too long ago to remember what the cure was. --Tagishsimon (talk) 21:46, 1 May 2018 (UTC)

There was no message. "Publish" was grayed out and not an option, even after I entered information. I thought it might have been NoScript, so I disabled that to no avail. Tried it on both Firefox and Chrome. So, I'm thinking it's something I'm not doing correctly. Maile66 (talk) 21:51, 1 May 2018 (UTC)

OK. I can edit it now, since it's already been created. I just could not edit it before. So, it had to be my own error somehow. Maile66 (talk) 21:54, 1 May 2018 (UTC)

@Maile66: To check, do you fancy deleting the en wikilink just created (and publish). And then have another go at adding it ... and in particular, let us know if you see emboldened suggestions appear beneath the edit boxes as you enter data. --Tagishsimon (talk) 21:56, 1 May 2018 (UTC)

I am right at the point now where I removed the en wikilink. Publish is grayed out. Maile66 (talk) 22:00, 1 May 2018 (UTC)

@Maile66:Sorry, I drifted off into twitter. So it's not allowing you to publish after hitting the wastebin to delete? I see no update on the record. --Tagishsimon (talk) 22:05, 1 May 2018 (UTC)

What I can edit and Publish is the top box for Language, Label, Description, Also known as. Any of the boxes below will let me open the Edit feature, but the Publish is grayed out. Cannot Publish in anything below the top box. Maile66 (talk) 22:09, 1 May 2018 (UTC)

@Maile66:Yup. It's a UI issue, /probably/ at your end (though I note you've tried with two browsers). The UI goes off to fetch values and presents them back to you, based on whatever you type in. Your browser(s) are not allowing that process to happen. (Or nearest offer). I /think/ Privacy Badger was the cause of my problems, but as I say, too long. If I come up with anything after searching I'll come back to you. --Tagishsimon (talk) 22:13, 1 May 2018 (UTC)

Success on Chrome. Maile66 (talk) 22:19, 1 May 2018 (UTC)

And now success on Firefox. The whole issue on Firefox was the NoScript addon. If I disable that, it works. Sorry for all this bother. But everything is OK now. Maile66 (talk) 22:20, 1 May 2018 (UTC)

@Maile66: Good stuff (and final ping of the night). There's also this gadget available, for en.wiki (or any language wiki) ... gives you a bunch of handy left menu links enabling the creation and amendment of Wikidata items from the Wikipedia article. I've used it for a couple of years and wouldn't be without it. --Tagishsimon (talk) 22:30, 1 May 2018 (UTC)

Thanks for the tool link, and thanks for all the help. Maile66 (talk) 22:43, 1 May 2018 (UTC)

Cannot merge Portal:National anthems (Q24820071) into national anthem (Q23691)

I see no reason to keep these two separate, so I removed some content to try merging, but still without success.--Jusjih (talk) 03:00, 29 April 2018 (UTC)

@Jusjih: We generally do not merge portal items with regular items, although I understand why you would want to do that in this case. It is entirely conceivable, for any sufficiently broad subject, to start a Wikipedia portal on it independent of the original article on the subject (take, for example, Portal:Bible (Q8588115) and Bible (Q1845)). This arrangement for the two items you mentioned at least admits the possibility of an existing Wikipedia portal. Mahir256 (talk) 04:14, 29 April 2018 (UTC)

Please read w:Wikipedia:Village pump (proposals)/RfC: Ending the system of portals to see what will happen there. I cannot find specific Wikipedia portal for national anthems. Thanks.--Jusjih (talk) 04:19, 1 May 2018 (UTC)

@Jusjih: If a certain project wants to drop the Wikimedia portal (Q4663903) page system, let it, but unless and until users agree to drop it globally in a Meta-Wiki RFC (which, as we all know, takes priority over any individual wiki project), please do not make such fake merges anymore. --Liuxinyu970226 (talk) 14:30, 1 May 2018 (UTC)

What is a fake merge? My question here is whether it is worth keeping Portal:National anthems (Q24820071) and national anthem (Q23691) separate, not debating whether to keep any portals anywhere. If not merging, just revert my edits to Portal:National anthems (Q24820071) with care.--Jusjih (talk) 03:32, 2 May 2018 (UTC)

Jusjih, stop the rubbish. These are two different things. 77.179.60.238 22:26, 2 May 2018 (UTC)

Is there something broken at the moment with the API?

Hi, I'm trying to use the Wikidata Tools plugin for Google Sheets and for some reason it can't get any info from Wikidata. Is there something broken at the moment? The API? Something else? I tried asking on IRC already but had no luck.

Thanks

--John Cummings (talk) 16:03, 2 May 2018 (UTC)

Enwiki RFC raises concerns

There's a discussion going on enwiki related to wikidata at en:Wikipedia:Wikidata/2018 Infobox RfC.

Some editors have pointed out that editors on Wikidata can replace sourced info with incorrect info. How does Wikidata check for that?
Another concern raised is the spamming by humans and bots on Wikidata. How is that kept under check here?

Capankajsmilyo (talk) 04:56, 18 April 2018 (UTC)

Re monitoring changes to sourced statements: Wikidata_talk:Abuse_filter#Tag_changes_to_sourced_statements. --Yair rand (talk) 06:11, 18 April 2018 (UTC)

This concern has been & will continue to be raised every 4 months or so on language wikipedias, until watchlists / histories on language wikipedias are able to show changes to the article arising out of changes in wikidata. Is there a Phab ticket for that? fwiw, my experience of screwing up quickstatement runs is that I get an angry wikidata person knocking on my talk page within minutes, so in that respect we work like language wikipedias: people watch items and bark when idiots like me bork them. --Tagishsimon (talk) 12:28, 18 April 2018 (UTC)

Wikipedia watchlist integration has already improved a lot; it might be worth testing it again. As far as I know it is not yet complete and there are some issues to resolve (e.g. delayed integration sometimes), but the amount and quality of listed edits are meanwhile pretty good. —MisterSynergy (talk) 12:51, 18 April 2018 (UTC)

I don't know which watchlist you are talking about. The one I just tested was flooded with "linked to this/that language" and "removed from this/that language". That's definitely NOT what enwiki editors want. Capankajsmilyo (talk) 11:41, 19 April 2018 (UTC)

I am talking about the Wikidata integration in the standard watchlist (activated by the “Show Wikidata edits in your watchlist” checkbox at en:Special:Preferences#mw-prefsection-watchlist). On my 8000+ article enwiki watchlist, I see 10 Wikidata edits out of the past 250 edits which are listed. This is a typical load these days at enwiki, and numbers at my (smaller) dewiki watchlist are very similar. If you refer to interwikilinks: those were shown in the pre-Wikidata time as regular edits as well, and many editors do indeed care for them as they are listed next to the article. —MisterSynergy (talk) 11:59, 19 April 2018 (UTC)

I didn't actually know watchlist integration was as good as it is (having turned it on). Maybe language wikis need to think about turning wikidata changes on by default? But I still see nothing in page histories. --Tagishsimon (talk) 21:25, 19 April 2018 (UTC)

Perhaps one remark:

One main concern is that WD allows the value of a statement to be modified without any care for the linked source. If a vandal changes the value of a sourced statement, then the argument that WD data can be filtered by looking only at statements with a source fails. This criticism is from a contributor who is developing a particular system to track modifications compared to a referenced version of the article.

So if someone changes a value of a statement, the corresponding source has to be deleted.

Then the usual criticism is that WD will be more vandalized than WPs if all WPs are using data from the source. Snipre (talk) 13:33, 19 April 2018 (UTC)

@Capankajsmilyo: For your question about spamming, can you explain to us how wp:en deals with spamming in its articles? WP contributors always have higher expectations of WD than the ones they accept for their own wp. Snipre (talk) 13:37, 19 April 2018 (UTC)

Enwiki uses techniques like sanctions and blocks on users, or locks on the page itself, to deal with spammers. See en:Wikipedia:Spam for the guidelines. Capankajsmilyo (talk) 14:46, 19 April 2018 (UTC)

@Capankajsmilyo: Yup. Same. [2]. Wikidata is not the wild-west some language wiki users think it is. --Tagishsimon (talk) 21:25, 19 April 2018 (UTC)

@Lea Lacroix (WMDE): Can you provide us some feedback about the technical feasibility of restricting the editing of WD items to registered users only, while allowing IPs to contribute on talk pages and community pages? Snipre (talk) 13:40, 19 April 2018 (UTC)

Not a chance that would be supported. Wikidata is a wiki. --Yair rand (talk) 19:53, 19 April 2018 (UTC)

@Yair rand: Registration doesn't prevent anyone from editing. We really need to balance data-protection measures against freedom to contribute. If WD continues to be evaluated negatively among WP contributors, this can keep more WP contributors from contributing to WD and from using WD data in WP.

Saying that WD is a wiki is not an argument when defining requirements for the right to contribute: being a wiki doesn't mean that contributions have to be possible under an IP. Registration is done once, so it makes no sense to say that it requires a lot of time, and registration can be anonymous, with no need to provide a real identity. Why should IPs be able to contribute? Especially in WD, where most contributions are done by bots or at large scale, and where identification of contributors is necessary. Snipre (talk) 11:37, 20 April 2018 (UTC)

There are ~27.000 IP edits per day in English Wikipedia, and 1900 IP edits per day in Wikidata (both in 2018 according to Wikiscan). Why should we allow the former, but not the latter? —MisterSynergy (talk) 11:45, 20 April 2018 (UTC)

@MisterSynergy:

Because the number of IP edits is low in WD compared to those of bots and registered users, so the loss won't be critical, and we can expect that some IP editors will create an account to continue contributing.
Data modeling plays a critical role in WD, along with absolute respect for the data format (correct use of properties, addition of the right qualifiers, full description of sources, ...), and for that kind of work bots are more efficient than individual contributors. IP contributors represent a high risk of adding data in the wrong format, leading to useless contributions due to lack of knowledge.
One edit in WD can impact several dozen or even more articles in different WPs. It is necessary to have a real possibility of contacting the editor when conflicting data are generated, especially if the sources are not online.
Registration is a time-consuming effort for vandals compared to the effort of modifying one statement; this will discourage most vandals, as it will require more actions and effort to carry out one act of vandalism.
It will be easier to track vandals' edits by username in order to revert them, and once a contributor is considered a vandal, the blocking of the account is definitive, unlike IP blocking, which is undermined by dynamic IPs. Snipre (talk) 22:13, 20 April 2018 (UTC)

Technical feasibility is not the problem here. Instead of preventing a part of the editors to contribute, we should better continue improving the quality on Wikidata through tools to check data and sources, to patrol, to give a better overview of the data's quality. That's what the editors and the development team are already working on. Lea Lacroix (WMDE) (talk) 15:02, 21 April 2018 (UTC)

@Lea Lacroix (WMDE): The problem is that WP contributors don't accept "we will do something" or "we are doing something" as valid arguments. They want to see what is currently in place to prevent vandalism in WD from spreading into the WPs that use WD data. The main problem is that, to handle all the requirements from WPs in terms of data monitoring, the corresponding control effort in WD is huge.

Expecting that patrolling will provide a sufficient answer is an error: what is the ratio of modifications to active contributors in WD? Same for watchlists: what is the ratio of items to active contributors? Then, for a regular check between original databases and WD data, only online and open databases can be systematically checked. If someone uses a book as a reference, there is no way to perform an automatic check.

We were trying to show that WD offers a set of well-sourced data and that it is possible to filter/extract that data. The answer from WP users: how do you prevent someone from changing a value without modifying the source?

At some point we have to balance the need for stricter restrictions, which protect data and give enough guarantees about data quality, against the wish to let everyone do what they want, at the risk of WD being rejected by users. Snipre (talk) 15:07, 24 April 2018 (UTC)

@Snipre: I understand your concerns. But at the same time, from my personal experience, I feel the claim of "vandalism ... widespreading in WPs using WD data" is not necessarily true. I suppose the vandalism rate varies depending on the property (or genre). At least in the area I edit, the vandalism rate is extremely low in WD. I've been checking anatomy-related identifier properties in WD for some years (e.g. Terminologia Anatomica 98 ID (P1323), Foundational Model of Anatomy ID (P1402)). There are roughly 10,000 values in total, and the data is now used in roughly 50 different language editions of Wikipedia. Among those, over several years, I know of only one case of apparent vandalism. I can't recall the exact item, but it was a sexual organ item. One value was overwritten by text like "my d*ck is big lolol" or something like that. Yes, the format constraint flagged that edit, so it was smoothly reverted. So I think differences between properties (or genres) make a huge difference in the vandalism rate in WD, the same as in Wikipedia. --Was a bee (talk) 10:29, 29 April 2018 (UTC)
Although it is not a smart style but a brute-force way, repeatedly editing with QuickStatements from a local "validated Excel file" is a simple answer for keeping WD data in perfect quality, because QuickStatements skips an edit if exactly the same data already exists on the target page. --Was a bee (talk) 11:12, 29 April 2018 (UTC)

@Was a bee: Sure, nobody has proved so far that the vandalism rate is higher in WD than in WP, but when you speak with WP contributors, they will always show you examples of wrong data or vandalism which were not corrected even after a certain time. Yes, some tools exist in WD, especially the constraints system, but who is taking care of that system? Currently I have the feeling that people create constraints but don't curate the output.

There is a second criticism from WP contributors: we have a solid WD community, but if we calculate the ratio of total items to total active contributors, we have a very low monitoring capacity. That's a fact, so we have to show our automatic monitoring systems in action, and especially our capacity to deal with the output of control systems like the constraint violation reports.

My feeling is that WD is composed mainly of individual contributors, and we have very few wikiprojects with enough contributors to monitor some item classes (in my area, we are 2-4 contributors for more than 100'000 items); we are struggling to establish a data model and we don't curate data at a sufficient rate to keep the data quality at a good level. The critical element is to have bots or automatic scripts performing checks and reporting modeling errors, or comparing WD data with external data sets to detect wrong data. Snipre (talk) 07:48, 30 April 2018 (UTC)

@Snipre: Yes, a bot is one good tool for maintenance. If there is a task-specific auto-run bot, that's good. But at the same time, the basic reason why a custom bot is needed for data maintenance in WP is that WP doesn't support any basic functionality for maintaining large volumes of data. We cannot download infobox data from WP, so we can't assess, cross-check or update WP infobox data against external sources without a custom bot program. But in WD, the situation is different. For example:

All data is downloadable in spreadsheet format (w:Microsoft Excel format) as default functionality. (For example, run this query, wait 30 seconds or so, and a CSV file is downloadable from the "download" button at middle-right.)
Once one has downloaded WD data in Microsoft Excel format, it is not difficult to compare it with another Excel file. I personally use basically two Excel functions for that purpose, VLOOKUP and EXACT. If one doesn't like comparing locally, there is also a tool that enables comparison online (Mix'n'match), after uploading the Excel file to it.
Batch editing (for example, hundreds, thousands, or more edits) from spreadsheet data (w:Microsoft Excel data) is possible with QuickStatements. To use QuickStatements, there is no need to learn a programming language like Python or C++, and no need to install PHP or Java or anything else on one's own computer. If one has spreadsheet data, from that alone one can edit a large volume of data in WD.
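For readers who prefer a script to VLOOKUP and EXACT, the local comparison described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the workflow actually used in the thread: the column names `item` and `value`, the placeholder keys, and the sample identifiers are all made up.

```python
import csv
import io


def load_values(csv_text, key_col="item", value_col="value"):
    """Read CSV text into a {key: value} dict (column names are assumptions)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col]: row[value_col] for row in reader}


def compare(wikidata, local):
    """Compare a Wikidata export against a locally validated file."""
    report = {"mismatch": [], "missing_in_wikidata": [], "missing_locally": []}
    for key, expected in local.items():
        if key not in wikidata:
            report["missing_in_wikidata"].append(key)
        elif wikidata[key] != expected:  # exact, case-sensitive, like EXACT
            report["mismatch"].append((key, wikidata[key], expected))
    report["missing_locally"] = sorted(set(wikidata) - set(local))
    return report


# Hypothetical sample data standing in for the two spreadsheets.
wikidata_csv = "item,value\nitem-1,ID-001\nitem-2,ID-999\nitem-4,ID-004\n"
local_csv = "item,value\nitem-1,ID-001\nitem-2,ID-002\nitem-3,ID-003\n"

report = compare(load_values(wikidata_csv), load_values(local_csv))
for kind, entries in report.items():
    print(kind, entries)
```

In practice the two strings would be replaced by the contents of the downloaded CSV and of the validated local file.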

From these reasons, as my personal thought, I feel that field-specific automated-bot is not necessarily needed to keep quality in WD (surely it is good if there is, though). My thought is simple. Comparing WP and WD in criteria of quality, quantity and maintainability. Then if WD is better, then use WD. If WP is better, then use WP. In my current feeling, if data type is external identifier, in most cases WD is better than WP in that criteria. As additional info, external identifiers are current WDs' main product. Currently there are roughly 4,500 properties in WD (Wikidata:Database_reports/List_of_properties/all). Among them, about 2,600 are external-identifier. :)

And for identifiers, the most important constraint is the format constraint. If identifier data violates the format constraint, it is wrong data, 100%. So this is the most crucial constraint for identifiers. When I look at the various report pages for identifier properties, it is not rare that "format violation is zero" even when the data count is 10,000 or 100,000. I think this fact is one piece of evidence that shows why WD is good :D --Was a bee (talk) 09:11, 3 May 2018 (UTC)
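As a rough illustration of what a format constraint checks, the sketch below tests a list of identifier values against a regular expression and returns the ones that do not match. The ISSN-style pattern and the sample values are assumptions chosen for the example; each real property defines its own pattern.

```python
import re

# Assumed example pattern: four digits, a hyphen, three digits, then a digit or X.
ISSN_LIKE = r"\d{4}-\d{3}[\dX]"


def format_violations(values, pattern):
    """Return the values that do not match the pattern in full."""
    regex = re.compile(pattern)
    return [value for value in values if regex.fullmatch(value) is None]


# Hypothetical sample values; the last two break the assumed format.
sample = ["1234-5678", "1234-567X", "12345678", "1234-56789"]
violations = format_violations(sample, ISSN_LIKE)
```

A value that fails the pattern is certainly wrong, but a value that passes can still be wrong, so this check only catches one class of error.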

number of points/goals/set scored

Hello. A league has a number of matches played and a number of points/goals/set scored. There is no problem with number of matches played/races/starts (P1350). But there is a constraint on number of points/goals/set scored (P1351): it must be used as a qualifier. So, how can I add the information about the total goals scored in a league? Xaris333 (talk) 20:08, 30 April 2018 (UTC)

Notified participants of WikiProject Sport results. I think the project should be notified of this discussion. --Sannita - not just another it.wiki sysop 08:55, 3 May 2018 (UTC)

@Xaris333: I removed the constraint for now, which was added on 1st of April. If there is a proposal how to deal with the ~1000 direct uses, we might want to re-add the constraint. —MisterSynergy (talk) 05:46, 4 May 2018 (UTC)

How do I find special characters in a big table that are messing up a Mix n' Match import?

Hi all

I'm trying to create a Mix n' Match import but it fails because there is at least one special character or space hidden somewhere in the 4000 lines. Does anyone know of any special tricks to find them? I think it could be a Cyrillic letter that looks very like a western letter, or something similarly not obvious.

Thanks

--John Cummings (talk) 09:08, 3 May 2018 (UTC)

I would read in the file in some programming language (I like Matlab), convert characters to integers, and search for integers bigger than some threshold. There might be better ways. Can you post it somewhere in your sandbox? --Jarekt (talk) 11:46, 3 May 2018 (UTC)
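The approach described in the comment above (convert characters to integers and look for values above a threshold) could look roughly like this in Python instead of Matlab. It is a minimal sketch: the function name, the threshold of 127 (the end of ASCII) and the sample rows are assumptions made for the example.

```python
import unicodedata


def find_suspicious(text, threshold=127):
    """List (line, column, code point, Unicode name) for characters above the threshold."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            code = ord(ch)  # the character as an integer
            if code > threshold:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{code:04X}", name))
    return hits


# Hypothetical sample: row 2 starts the name with a Cyrillic capital A,
# row 3 hides a no-break space between the two words.
SAMPLE = "Q1\tAnna\nQ2\t\u0410nna\nQ3\tBob\u00a0Smith"

for hit in find_suspicious(SAMPLE):
    print(hit)
```

For a real file the text would be read with `open(path, encoding="utf-8").read()`. Legitimate accented letters are reported too, so the output is a list of places to inspect and not a list of errors.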

You could try out this one. --Edgars2007 (talk) 14:04, 3 May 2018 (UTC)

Very handy indeed! Many thanks for the suggestions. The online tool did find a couple of strange characters, so definitely a thread to investigate. I'll compare with some previous imports that definitely worked and hopefully be able to tell whether it's a likely cause of the issue. Cheers again :) NavinoEvans (talk) 21:08, 3 May 2018 (UTC)

Sex or gender in the context of athletic teams

While working on adding statements for Women's Division I basketball teams I noticed that someone had added the property "sex or gender" with the value "female".

I actually started adding it myself, but then wondered whether this was appropriate, since the team members may have that attribute but the team itself doesn't.

While working on Q21531595, I see that the property is there but it's followed by a potential issue:

Type constraint: Entities using the sex or gender property should be instances of one of the following classes (or of one of their subclasses), but Omaha Mavericks women's basketball currently isn't: person; animal; character that may or may not be fictional; abstract being; fictional animal character; mythical entity.

That seems to match my thinking which is that this statement shouldn't be included for athletic teams. Before I go back and remove such entries, I thought I'd double check here to make sure my thinking is correct.--Sphilbrick (talk) 17:38, 3 May 2018 (UTC)

Correct, sex or gender (P21) shouldn’t be used on team items. The alternative is:
⟨ Omaha Mavericks women's basketball (Q21531595)    ⟩ competition class (P2094) ⟨ women's basketball (Q2887217)    ⟩
. —MisterSynergy (talk) 18:13, 3 May 2018 (UTC)

Thanks--Sphilbrick (talk) 20:36, 3 May 2018 (UTC)

Sport ontology

Where can I find an ontology description for items related to sports? Wikidata:WikiProject Sports lists the common properties for people, teams and competitions. However, I can't find information about sports season (Q27020041) or sports season of a sports club (Q1539532). The examples I found, like 2013–14 Liverpool F.C. season (Q13108085) or 1993–94 Cleveland Cavaliers season (Q282856), are really poor in statements. Thanks, Amadalvarez (talk) 14:12, 28 April 2018 (UTC)

@Amadalvarez: there is no general sports ontology, unfortunately. Reasons include: (1) types of sports are quite different from each other, thus they may not be covered by the same ontology; (2) lots of items are here due to the sitelinks to Wikipedias; if we were free of them, some things would have been organized differently (i.e. better) than what we can do now; (3) understanding and handling concepts like sports competition (Q13406554), sporting event (Q16510064), sports league (Q623109), sports season (Q27020041) and a couple of others is something which many editors find difficult to do.

If you have specific questions, I can give some advice what to do and what not to do, based on the situation in other items that are instance of sports season of a sports club (Q1539532), and my experience. —MisterSynergy (talk) 06:14, 4 May 2018 (UTC)

Thanks @MisterSynergy:. I've been working on WD-powered infoboxes in cawiki, consolidating into a few "core infoboxes" the large number of them created over the life of WP, and I agree with your description. However, to avoid repeating the "free style of everyone" here in WD, I would rather ask before reinventing the wheel. After asking here, I left a message with specific questions at User_talk:Xaris333#Sport_ontology. Some have a response, others not yet. So I invite you to take a glance and give your experience/opinion there, sharing it with Xaris, who has created several sport-related items. Thanks again. Amadalvarez (talk) 11:42, 4 May 2018 (UTC)

I plan to have a look at that discussion later this day. —MisterSynergy (talk) 14:17, 4 May 2018 (UTC)

How should I reference data copied from Commons?

Over time I have copied a lot of data from Commons templates. I often used imported from Wikimedia project (P143): Commons Creator page (Q24731821) or similar as a reference. This time I will be transferring data from pages that use c:Template:Artwork, like c:Category:Large Figure in a Shelter - Henry Moore (LH 652c) or File:Diego Velázquez 050.jpg. I was thinking about referencing such transfers with imported from Wikimedia project (P143): Wikimedia Commons (Q565) and reference URL (P854): url statements, but I cannot get reference URL (P854) to work with QuickStatements, as URLs with spaces do not seem to be compatible with QS. Would a reference like imported from Wikimedia project (P143): Wikimedia Commons (Q565) and image (P18): "file name", or imported from Wikimedia project (P143): Wikimedia Commons (Q565) and Commons category (P373): "category name", be OK? Or is there a better way to reference such transfers? --Jarekt (talk) 20:30, 2 May 2018 (UTC)

Wikimedia import URL (P4656) is meant for that.
--- Jura 21:48, 2 May 2018 (UTC)
There is a big difference between 'source' or 'import location' and 'reference'. In these cases, you're sourcing/importing the info from Commons (it's where the info is currently held), but that's not the reference for that info (it's not where the authoritative information was provided). Can you follow the info back to the original reference and include that where available? Thanks. Mike Peel (talk)

As with the majority of statements supported by imported from Wikimedia project (P143) "references", it is usually not clear where the data comes from. That was a big issue before Wikidata. I know that actual external references are much more important, but imported from Wikimedia project (P143) / Wikimedia import URL (P4656) "references" at least tell you where the data was imported from and can help you dig out external references. User:Jura1, thanks for mentioning Wikimedia import URL (P4656). I did not know about it. But I might not be able to use it, as I was unable to get QuickStatements to work with URL-type statements passed through a URL (see Help:QuickStatements#Running_QuickStatements_through_URL), and that is the mechanism I use to speed up the import of individual statements from infoboxes on Commons. That is why I was wondering about using image (P18) / Commons category (P373) as reference qualifiers. --Jarekt (talk) 00:22, 3 May 2018 (UTC)

Q4115189 P31 Q5 S4656 "https://commons.wikimedia.org/wiki/Category:Large_Figure_in_a_Shelter_-_Henry_Moore_(LH_652c)"

The above gives version1 and version2. I know that some combination don't work in the first version of QuickStatements, but do in the new one. If you really need it, maybe a "Wikimedia Commons import file" could be created.
--- Jura 10:55, 4 May 2018 (UTC)
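For illustration, a command like the one above can be generated from a Commons page title without leaving spaces in the URL. The sketch below is an assumption-laden example and not part of any existing tool: the helper names are hypothetical, spaces are replaced by underscores as in MediaWiki page URLs, remaining non-ASCII characters are percent-encoded, and the fields are joined with tabs as in the QuickStatements version 1 format.

```python
from urllib.parse import quote

COMMONS_BASE = "https://commons.wikimedia.org/wiki/"


def commons_url(title):
    """Build a Commons page URL with underscores and percent-encoded non-ASCII."""
    page = title.replace(" ", "_")
    # Keep characters that are common in page titles readable.
    return COMMONS_BASE + quote(page, safe=":/()")


def qs_line(item, prop, value, source_title):
    """One statement with a Wikimedia import URL (P4656) source, written as S4656."""
    source = f'"{commons_url(source_title)}"'
    return "\t".join([item, prop, value, "S4656", source])


category = "Category:Large Figure in a Shelter - Henry Moore (LH 652c)"
print(qs_line("Q4115189", "P31", "Q5", category))
print(commons_url("File:Diego Velázquez 050.jpg"))
```

Whether a given QuickStatements version accepts such a line, especially when the command is itself passed through a URL, still has to be tested, as the discussion here shows.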

Jura, I think I finally know what is happening; see my bug report / feature request. I am trying to give c:Module:Artwork the ability to create QS codes that pass on artwork metadata present on Commons but missing on Wikidata, and at the moment I am testing it on the rare case where there is an "official" image of an artwork defined on Commons but image (P18) is missing here. At the moment there are a few pages in c:Category:Artworks with Wikidata item: quick statements which can be used to test setting image (P18) through QS, but the tool is not able to add a reference. I will not deploy the ability to upload other statements until I figure out some acceptable reference/source I could add to them. In the case of c:Module:Creator I used imported from Wikimedia project (P143): Commons Creator page (Q24731821) and the item had Commons Creator page (P1472); similarly with c:Module:Institution. In the case of c:Module:Artwork I would like to provide a link to the actual page the information was copied from, so I am stuck at the moment. --Jarekt (talk) 15:02, 4 May 2018 (UTC)

I think the source url is missing quotes (and using the wrong property). I'd also skip the date if the date is today.
--- Jura 15:12, 4 May 2018 (UTC)

Changes to languages spoken, written or signed (P1412) without prior discussion?

Could someone please point out where was it discussed/decided that languages spoken, written or signed (P1412) was to be changed from "languages spoken" (as it was created) to "languages spoken, written or signed" (as it is now)? Andreasm ^{háblame / just talk to me} 05:25, 4 May 2018 (UTC)

number of platform tracks (P1103) seems likely, there are Dutchism existing on property names. --Liuxinyu970226 (talk) 05:28, 4 May 2018 (UTC)

It has an interesting history: non-native language spoken [3], language spoken [4], languages spoken [5], languages spoken or published [6], languages spoken or written [7], languages spoken, written or signed [8]. The latest version dates from 2016-06-03, so it seems overdue for another change. Ghouston (talk) 10:30, 4 May 2018 (UTC)

Well, I think someone pointed out that "Ancient Greek" isn't really spoken, but written. Then someone found that sign language (also included) isn't really written nor spoken. So we ended up with the current label.
--- Jura 10:44, 4 May 2018 (UTC)

I'm confused about whether symbolic and computer languages should be included. If not, can it be changed to "Natural languages spoken, written or signed"? :) Ghouston (talk) 10:52, 4 May 2018 (UTC)

I changed it from "languages spoken or written" to "languages spoken, written or signed" in this edit, because deaf people do not (generally) "speak" sign languages. I recall discussing it, but I can't find that discussion. – The preceding unsigned comment was added by Pigsonthewing (talk • contribs) at 16:09, 4 May 2018 (UTC).

Probably here —MisterSynergy (talk) 16:21, 4 May 2018 (UTC)

Request for assistance

I'm not quite sure if this is the appropriate place to post this request or not, but I'm writing to request help from a Wikidata administrator or other appropriate authority. (If this isn't the correct place to post this request, my apologies; I'd appreciate it if someone could direct me to the appropriate place. My apologies in advance for this long message, but I need to give context about the problem and the actions I've taken to try to resolve it.)

Someone deleted my User Page (User:47thPennVols) from Wikidata on April 7, 2018, indicating that it was "Out of project scope", but did so without reaching out to me first via my Talk page to provide any guidance to me or to advise me that he/she would be deleting my User Page. I've tried reaching out to this Wikidata user, who is apparently a Wikidata administrator (via that user's talk page), to find out why my User Page was deleted, but have received no response. (I'm not posting the Wikidata administrator's user name here because I'm not upset with that administrator or trying to get him/her in trouble in any way. I just need help getting my User Page restored because I believe that it should not have been deleted.)

As a bit of background, I only discovered the deletion of my User Page while working on a bio article for the English version of Wikipedia (as part of the April 2018 drive by Women in Red to increase the number of women's biographies on Wikipedia). I had located the correct given name for my bio subject (who was only listed in Wikidata by her nickname), and thought I should add her correct given name to her Wikidata entry (Q24009728) to help other Wikipedians who might be researching her life. That's when I noticed that a Wikidata administrator had deleted my user page roughly three weeks earlier. I've been a member of Wikipedia since 2015 (and have been writing articles on and off since then as User:47thPennVols), and always try to do the right thing but, like many fellow Wikipedians, am still learning all of the ins and outs of Wikipedia procedures. I hadn't made any changes to any Wikidata entries prior to this year, but began doing so by making minor edits because I had found info that wasn't available on Wikidata, and thought it might be helpful to other Wikipedia researchers. (These minor edits are supported by primary sources which are included in the Wikipedia-English bios I've written.) I find myself wondering if I accidentally committed some sort of violation of a Wikidata rule, which might have prompted the Wikidata administrator to delete my User Page, but I have no way of knowing because that administrator didn't provide any warning or explanation.

I would appreciate it if a Wikidata administrator could communicate with me directly so that I know why my User Page was deleted and whether or not it's possible for that deletion to be undone. (If it can, then I'd also appreciate it if a Wikidata administrator could revert that deletion.) Thank you in advance for your help. 47thPennVols (talk) 17:00, 28 April 2018 (UTC)

@47thPennVols: I am not the admin who deleted it, but I restored your user page. You should link it to your user page on English Wikipedia, and add babels to it. --Okkn (talk) 18:38, 28 April 2018 (UTC)
- @Okkn: Thank you so much for your help. How do I link my Wikidata User Page to my English Wikipedia page? (Although I'm not new to Wikipedia, I'm relatively new to Wikidata, and am still trying to figure out the similarities and differences between the two.) 47thPennVols (talk) 19:15, 28 April 2018 (UTC)

Just a link in the form 47thPennVols, for instance. --Tagishsimon (talk) 21:05, 28 April 2018 (UTC)

@Tagishsimon: Thank you so much for your help! (My apologies for the delay in responding. I've had my "nose to the grindstone", researching and editing bios for the WikiProject Women in Red over on English Wikipedia, and only just saw your message this morning.) 47thPennVols (talk) 15:55, 4 May 2018 (UTC)

@Romaine: Why did you delete it? ChristianKl ❪✉❫ 13:59, 4 May 2018 (UTC)

@ChristianKl: Although I never received a response from Romaine re: why he/she deleted my User Page for Wikidata, @Okkn: was kind enough to restore it for me. (Thank you, again, Okkn.) Kind Regards. 47thPennVols (talk) 15:55, 4 May 2018 (UTC)

The only reason why I would delete a user page is on request or when it is spam. Looking back in the logs I see that another user marked your user page as out of project scope, and in a quick review of that deletion request I might have found your user page looking relatively similar to the many, many spam user pages that have been created, or to attempts by people to write a Wikipedia article on the wrong wiki. I realise now that I made a mistake in this review; my apologies for that. Romaine (talk) 00:52, 5 May 2018 (UTC)

Identify face in image

To improve the quality of images added, is it possible to get a list of Qids (humans) whose image doesn't have any face in it? Capankajsmilyo (talk) 03:13, 29 April 2018 (UTC)

I don't think there should be any. Maybe request a list at Wikidata:Bot requests. The 700,000 images would need to be analyzed by some algorithm and negative results listed.
For a simple start, maybe one could attempt to check images in P18 that should only be in image of grave (P1442) or plaque image (P1801) based on Commons categories. Note that in some rare cases, these show portraits or sculptures suitable for P18.
--- Jura 06:32, 30 April 2018 (UTC)

There are plenty of open-source AI libraries which can identify a face in an image with high precision. So we can use it and tag Q items with the wrong images. Can it be applied? Capankajsmilyo (talk) 19:44, 2 May 2018 (UTC)

You could try to use that to add a qualifier that tags them as pass or flag for review.
--- Jura 21:45, 2 May 2018 (UTC)

AI is in python and Wikipedia seems to use Lua. How to move ahead? Capankajsmilyo (talk) 04:04, 5 May 2018 (UTC)

Can someone press the button to create an approved new property?

Hi

I'm running an in person project in the next days which requires a new property Wikidata:Property_proposal/Directory_of_Open_Access_Journals_ID (I started it almost 2 months ago) which has been approved and is ready to be created. Could someone who has the magic powers please press the button for me so the property is created? I'm sorry to try and jump the queue but the project is stalled without it.

Thanks very much

--John Cummings (talk) 14:54, 2 May 2018 (UTC)

Done--Micru (talk) 21:32, 2 May 2018 (UTC)

John Cummings, Micru, that is controversial, there was some strong opposition. Is it a repetition of ISSN? 77.179.61.171 20:43, 4 May 2018 (UTC)

As I explained in the discussion, DOAJ uses ISSN numbers as identifiers but they are sometimes not an exact match, e.g where multiple different format versions of the publication exist. It also allows for DOAJ to begin having pages for other kinds of information that are not journals, e.g publishers. --John Cummings (talk) 21:26, 4 May 2018 (UTC)

The decision appeared to be measured and with precedent, 77. What do you hope to achieve by rehashing it here? --Tagishsimon (talk) 21:33, 4 May 2018 (UTC)

Wikitext highlighting out of beta

Wikitext syntax highlighting, also known as CodeMirror, has been moved out of Beta Features and is available in the 2017 Wikitext Editor on all wikis. Syntax highlighting helps you see problems in your wikitext before previewing or publishing text. Please try out the tool if you did not do so while it was being developed, and feedback is welcome. - Keegan (WMF) (talk) 18:55, 4 May 2018 (UTC)

Year and Calendar Year

Currently, 2015 (Q2002), 2018 (Q25291), 2020 (Q25337) and other year items are instances of year (Q577), but year (Q577) is an instance of unit of time (Q1790144), so I think it is inappropriate as the value of instance of (P31) in year items. Because 2018 (Q25291) is not a unit of time (Q1790144), but may be an instance of time interval (Q186081) or point in time (Q186408), I think each year item should be a subclass of calendar year (Q3186692), and calendar year (Q3186692) should not be a subclass of year (Q577), but a subclass of time interval (Q186081) or point in time (Q186408).

I drew a plan for the data model, as shown in the figure. Do you have any thoughts?--Okkn (talk) 11:50, 24 April 2018 (UTC)

If you have items for the years (AD/CE) 2016, 2017, 2018, etc., then I think they should be instances of the calendar they are part of and not of Q577.
The problem with Q577 might be that eventually, you will have an article in sitelinks that covers both (or has some appendix that also covers calendar year) and you will end up debating with some contributor from that wiki about the question if it should be a class or an instance of one or the other, or if there should be a qualifier "except in xzwiki" or "applies to part" "enwiki" (and other things that have nothing to do with the question you are asking).
--- Jura 12:16, 24 April 2018 (UTC)

I share Jura's concern. In the case of the English Wikipedia, there are two articles, w:Year and w:Calendar year, so it is obvious which article should be linked to which item even though the English Wikipedia article "Year" does briefly describe calendar years. I don't know how this will work out in other languages. Jc3s5h (talk) 13:12, 24 April 2018 (UTC)Ghouston (talk) 05:39, 25 April 2018 (UTC)

I Support this proposal from Okkn, it looks logically consistent and makes sense, though I'm sure there are elaborations that could be added to describe other calendars etc. ArthurPSmith (talk) 12:28, 24 April 2018 (UTC)
I also Support the proposal by Okkn. Regarding the discussion about linking to Wikipedia articles: I don't think the way we design our ontologies should be directed at providing a Wikidata item for every mix of entities present in some Wikipedia article. The thematic delimitation of Wikipedia articles is not mainly driven by ontological, but by pragmatic considerations. Take the example of articles about museums – they often cover the building and the organization in one article. There is no need to mirror this combination in the ontology; just link to the WD-item that is the primary object of the article and ignore the other objects. --Beat Estermann (talk) 13:34, 24 April 2018 (UTC)
unit of time (Q1790144) is currently a subclass of unit of measurement (Q47574) and time interval (Q186081), which seems OK. I don't think calendar year (Q3186692) should be a subclass of point in time (Q186408). Ghouston (talk) 05:39, 25 April 2018 (UTC)
- @Ghouston: I don't think
  ⟨ unit of time (Q1790144)    ⟩ subclass of (P279) ⟨ time interval (Q186081)    ⟩
  is correct, because unit is not a temporal entity (Q26907166). And, If you say he was born in 1962, for example, isn't this 1962 a point in time (Q186408), in a sense? --Okkn (talk) 08:34, 25 April 2018 (UTC)
  - The instances (second, minute, etc.) are intervals of time, aren't they? They are defined that way. Maybe time interval (Q186081) shouldn't be a subclass of temporal entity (Q26907166), since temporal entity (Q26907166) seems to be about "things that take place in time" rather than time itself. Saying that an event took place in 1962 isn't the same as saying the whole year is a "point in time". Ghouston (talk) 08:54, 25 April 2018 (UTC)
    - I would say that second, minute, etc. are duration (Q2199864) or used with duration (P2047) but are not time interval (Q186081). The description of time interval (Q186081) states "temporal extent having a beginning, an end and a duration". Second, minute, etc. lack a beginning or end, they only have durations. Jc3s5h (talk) 14:47, 25 April 2018 (UTC)
      - @Jc3s5h: But duration (Q2199864) is a subclass of time interval (Q186081).
        ⟨ duration (Q2199864)    ⟩ subclass of (P279) ⟨ time interval (Q186081)    ⟩
        is incorrect? --Okkn (talk) 16:12, 25 April 2018 (UTC)
        If we accept duration (Q2199864) as a subclass of time interval (Q186081) it seems to me that makes them the same thing. duration (Q2199864) is also a subclass of scalar magnitude (Q28733284) and I think that is the correct superclass, not time interval. Jc3s5h (talk) 17:58, 25 April 2018 (UTC)
        I guess it should be unit of time (Q1790144) subclass of (P279) duration (Q2199864) and time interval (Q186081) subclass of (P279) duration (Q2199864) subclass of (P279) scalar magnitude (Q28733284). Ghouston (talk) 05:39, 27 April 2018 (UTC)
        @Ghouston:, I don't agree that time interval (Q186081) subclass of (P279) duration (Q2199864). Time interval requires two numbers, the start time, and the end time. Duration is a scalar and only contains one number. Jc3s5h (talk) 08:37, 27 April 2018 (UTC)
        I'm thinking of duration (Q2199864) as being the set of all possible durations, which can start at any point in time, and time interval (Q186081) is then a subset of those durations which start a particular point in time. Ghouston (talk) 08:52, 27 April 2018 (UTC)
        OK, I guess when we abstract something in a subclass to put it in a superclass, we always throw away lots of information, so your reasoning seems sound. Jc3s5h (talk) 09:43, 27 April 2018 (UTC)
        @Ghouston, Jc3s5h: We can't accept
        ⟨ velocity (Q11465)    ⟩ subclass of (P279) ⟨ speed (Q3711325)    ⟩
        , can we? Is the relation between time interval (Q186081) and duration (Q2199864) different from this? time interval (Q186081) could be defined as a pair of "start time" and "duration". --Okkn (talk) 12:11, 27 April 2018 (UTC)
        I know there's a tendency for "subclass" items to slowly mutate step by step into completely different things, so it's a concern. Every real duration has a start time. A duration without a start time is either an abstract duration (like the time it would take my electric kettle to boil a litre of water) or a duration which actually happened but where the start time is unknown. Maybe it is very similar to velocity (Q11465) vs speed (Q3711325): a speed in reality is always associated with a direction, only an abstract value or unknown value is a speed alone. Ghouston (talk) 23:22, 28 April 2018 (UTC)
    - @Ghouston: I'm not sure whether
      ⟨ unit of time (Q1790144)    ⟩ subclass of (P279) ⟨ time interval (Q186081)    ⟩
      is correct or not. But I think that the superclass of year (Q577) and that of calendar year (Q3186692) can't be the same item. Do you believe
      ⟨ calendar year (Q3186692)    ⟩ subclass of (P279) ⟨ year (Q577)    ⟩
      is correct? --Okkn (talk) 16:12, 25 April 2018 (UTC)
      - Yes, I agree year (Q577) and calendar year (Q3186692) are different, and one can't be a subclass of the other. Ghouston (talk) 05:41, 27 April 2018 (UTC)
        @Ghouston, Jc3s5h: Ok. What can be the superclass of calendar year (Q3186692), then? --Okkn (talk) 07:10, 27 April 2018 (UTC)
        ⟨ calendar year (Q3186692)    ⟩ subclass of (P279) ⟨ time interval (Q186081)    ⟩
        , since a calendar year has a well-defined start point as well as a duration, while a year (Q577) is apparently just a duration that can start at any point in time and is still a subclass of unit of time (Q1790144). Ghouston (talk) 07:23, 27 April 2018 (UTC)
        
        I agree with Ghouston, calendar year is a subclass of time interval. More precisely, all the specific calendar years, such as 2001 (Q1988), in any calendar system, form the set of all calendar years, and the set of all calendar years is a subset of the set of time intervals. Jc3s5h (talk) 08:46, 27 April 2018 (UTC)
        Ok, I agree with you.
        ⟨ calendar year (Q3186692)    ⟩ subclass of (P279) ⟨ time interval (Q186081)    ⟩
        may be right. --Okkn (talk) 12:11, 27 April 2018 (UTC)
@Jura1, Ghouston, ArthurPSmith, Beat Estermann, Jc3s5h: By way of experiment, I have edited year (Q577), calendar year (Q3186692), common year (Q235729), leap year (Q19828), unit of time (Q1790144), duration (Q2199864), Julian year (Q217208), Gregorian year (Q39628023), tropical year (Q189607), and year BC (Q29964144) to construct a logically consistent model. Is there anything you'd like to add or remove? --Okkn (talk) 16:07, 29 April 2018 (UTC)

I have changed the duration of common year and leap year, expressed in seconds, by adding ±2 seconds. This allows for the possibility of positive or negative leap seconds in June or December. Of course, the chance of a negative leap second or two leap seconds in the same year are remote. The standard defining leap seconds gives preference to June and December but allows a leap second in any month, but I think the chances of 3 leap seconds in 1 year so small we can neglect it.

Also, I question the opposite of (P461) properties on common year and leap year. Yes, a Gregorian or Julian calendar year must be one or the other of leap year and common year, but they are not opposites in the sense that good is the opposite of evil. Jc3s5h (talk) 17:47, 29 April 2018 (UTC)

In my opinion, plus-minus signs should not be used subjectively. In this case, duration of year can be (365 days + 1 second), but can't be (365 days - 1 second). The use of "±" is valid? And the reason to choose the value of "2" is also a little subjective.

Common years are sometimes called "non-leap year", so I think they are the opposite of leap years. --Okkn (talk) 15:55, 30 April 2018 (UTC)

As far as I know, use of the plus-minus sign is the only way to add an uncertainty in the user interface. If you know a way to enter 31,536,000 -0 +2, I would like to learn about it. Also, negative leap seconds are allowed by the standard, although they are not expected to occur. (The second sequence would be 57, 58, 0, 1.) Evaluating uncertainty always involves a subjective judgement about which events are just too outlandish to include. We could write 31,536,000 - 31,536,000 +2, allowing for the end of the world just as the new year begins. Jc3s5h (talk) 17:09, 30 April 2018 (UTC)
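The figures in this exchange can be reproduced with a few lines of arithmetic. The sketch below assumes 86,400 SI seconds per day and treats the ±2 seconds margin discussed above as a symmetric bound; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day without a leap second


def year_seconds(days, leap_second_margin=2):
    """Return (lower bound, nominal value, upper bound) of a year's length in seconds."""
    nominal = days * SECONDS_PER_DAY
    return nominal - leap_second_margin, nominal, nominal + leap_second_margin


common_year = year_seconds(365)
leap_year = year_seconds(366)
print(common_year)
print(leap_year)
```

The nominal values are 31,536,000 seconds for a common year and 31,622,400 seconds for a leap year; the asymmetric range 31,536,000 -0 +2 mentioned in the comment would need separate lower and upper margins.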

I have added subclass of (P279) orbital period (Q37640) to year (Q577). We could argue that since there are several units of measure that fit the description of year, we should state that year is a subclass of unit of measure, rather than an instance of unit of measure. I have not made the latter change; I want to know what others think. Jc3s5h (talk) 18:26, 29 April 2018 (UTC)
Would you also create a new item that could be used as value for P31 on items for years such as 2018 AD/CE? That item would be a subclass of calendar year (Q3186692).
--- Jura 20:12, 29 April 2018 (UTC)

@Okkn: @Jura1: I was pleased to see that items like 1 BC (Q25299) have aliases indicating their name in astronomical year numbering (1 BC is a.k.a. 0, 2 BC is a.k.a. −1, etc.). This emphasizes what period of time the year covered, rather than how the name of the year is written. But the description for year BC (Q29964144) states "any year item that is suffixed by B.C or B.C.E", which puts the emphasis on how the name of the year was written. Perhaps the description should be something like "any year of the Roman, Julian or Gregorian calendar before AD 1."

For the sake of symmetry, I support Jura's suggestion. Jc3s5h (talk) 21:10, 29 April 2018 (UTC)

In the first place, I don't really think year BC (Q29964144) is needed. However, the relation between year BC (Q29964144) and year (Q577) is similar to that between historical country (Q3024240) and country (Q6256). historical country (Q3024240) is a subclass of country (Q6256), and we don't have an item for "currently existing country". --Okkn (talk) 15:55, 30 April 2018 (UTC)
- I'm not really convinced by it either, but items for years BC could be an instance of both.
  --- Jura 17:22, 30 April 2018 (UTC)

I would say a year BC is an instance of both years BC and one of leap year or common year (in the Julian calendar, or just calendar year before AD 8 because leap status is uncertain or not applicable before then). Also, I thought we didn't declare an item to be an instance of a superclass where a chain of transitive superclasses would lead to the superclass in question; am I right? Jc3s5h (talk) 17:32, 30 April 2018 (UTC)
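The redundancy question raised above can be checked mechanically. The sketch below walks a small subclass of (P279) table and reports instance of (P31) values that are already implied by another value. The table is a hypothetical excerpt: the edge from calendar year to time interval follows this discussion, while the edge from leap year to calendar year is assumed only for the example.

```python
# Hypothetical excerpt of subclass-of (P279) relations, using labels as keys.
SUBCLASS_OF = {
    "leap year": ["calendar year"],      # assumed for illustration
    "calendar year": ["time interval"],  # as agreed in this discussion
}


def superclasses(cls, subclass_of):
    """All classes reachable from cls through one or more subclass-of steps."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def redundant_instance_of(classes, subclass_of):
    """Classes in the set that are already implied by another class in the set."""
    classes = set(classes)
    return {
        cls
        for cls in classes
        if any(cls in superclasses(other, subclass_of) for other in classes - {cls})
    }


print(redundant_instance_of({"leap year", "calendar year"}, SUBCLASS_OF))
```

With this table, declaring an item an instance of both leap year and calendar year would flag calendar year as redundant, because it is reachable from leap year.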

As we have seen in this discussion, the superstructure can be quite convoluted and subject to frequent changes. So generally it's easier to figure which class(es) an item should be an instance of and then work from there upwards.
--- Jura 17:39, 30 April 2018 (UTC)
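The question above, whether an "instance of" value is redundant when it can already be reached through a chain of subclass links, can be checked mechanically. The sketch below assumes a toy hierarchy written with labels instead of item IDs; the hierarchy and the function names are hypothetical and do not reflect the actual state of the items.

```python
from typing import Dict, Set


def superclasses(cls: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return every class reachable from cls by following subclass-of links."""
    seen: Set[str] = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:  # guards against cycles in the hierarchy
            seen.add(parent)
            stack.extend(subclass_of.get(parent, ()))
    return seen


def redundant_instance_of(instance_of: Set[str],
                          subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return the classes in instance_of that another listed class already implies."""
    implied: Set[str] = set()
    for cls in instance_of:
        implied |= superclasses(cls, subclass_of)
    return instance_of & implied


# Hypothetical toy hierarchy, for illustration only.
SUBCLASS_OF = {
    "year BC": {"calendar year"},
    "calendar year": {"year"},
}

print(redundant_instance_of({"year BC", "year"}, SUBCLASS_OF))  # {'year'}
```

As noted in the reply, the superstructure changes often, so a check like this is only as stable as the subclass chain it reads.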

@Jura1, Ghouston, ArthurPSmith, Beat Estermann, Jc3s5h: I am sorry for pinging you many times. I think the superclasses of calendar month of a given year (Q47018478), calendar month (Q47018901), calendar day of a given year (Q47150325), calendar date (Q205892), and calendar day (Q47164464) should be fixed too. And 21st century (Q6939) and 22nd century (Q11177) are not instances of century (Q578), are they? --Okkn (talk) 06:24, 4 May 2018 (UTC)

How can we document this?

This is an interesting discussion, but how can we make sure that 5, 10, 20 years from now someone can know why it was done this way? Should we consider the creation of a property to link statements to discussions like this one? Or would it be interesting to have annotations for the statements? It is not only about this particular case; on many occasions it is not clear why an editor has chosen a particular statement.--Micru (talk) 14:12, 30 April 2018 (UTC)

I think it's important to have some mention of the documentation on the user interface item page. Nobody ever reads the talk page. Even so, for the time being, we should put links to some documentation on the relevant talk pages. There should also be a link from Help:Dates.

In this case, the affected items are mostly about years and calendars, so a single page in the Help: space (or a relevant Wikiproject) would do; it would be linked to from relevant places. In the more general case, it would be good to have a way to document a rationale for a single property within a single item. Jc3s5h (talk) 15:34, 30 April 2018 (UTC)

Support @Micru: That's a good idea! By doing that, the stability of the ontology of Wikidata will be increased. Can this kind of information be added as a source of the statement? --Okkn (talk) 15:55, 30 April 2018 (UTC)

For the month-related items, maybe it is month (Q5151) that should be changed. Just remove instance of (P31) unit of time (Q1790144). Everything else in "moon" seems to be related to calendar months rather than lunar months. Jc3s5h (talk) 11:40, 4 May 2018 (UTC)
- @Jc3s5h: month (Q5151) has already been used as a unit of time (Q1790144) in many items. --Okkn (talk) 10:29, 5 May 2018 (UTC)

Merge

Merge en:Category:Israeli Air Force generals (Q8556730) with de:Kategorie:Kommandeur Luftstreitkräfte (Israel) --Isranoahsamue (talk) 21:16, 5 May 2018 (UTC)

Has been done. --Tagishsimon (talk) 21:50, 5 May 2018 (UTC)

General != Kommandeur. 80.171.248.200 05:21, 6 May 2018 (UTC)

Flood flag

If I have to make about 8500 edits using QuickStatements (8 edits per item), should I request a flood flag? I did so the last time, but I'm not sure now as I saw a few users doing many such edits without any flag. It was a bit annoying because my watchlist was littered with pseudobot edits, but maybe this is typical here and 8k edits is not big enough to bother the bureaucrats? Wostr (talk) 17:09, 4 May 2018 (UTC) I made a request (1), but I'd appreciate information on whether one should request a flood flag in such situations or not. Wostr (talk) 20:10, 4 May 2018 (UTC)

Users^[who?] mostly don't care and run QuickStatements without the flag. If you are still hesitant, run your batches in the background, i.e. via QuickStatementsBot. Matěj Suchánek (talk) 16:52, 6 May 2018 (UTC)

#AfricaGap - What % of humans is from Africa? - Politicians from Africa

The Gender gap is one of our most successful projects. The gap has been closed by more than one percent. I sincerely doubt that 1% of the humans that we know is from Africa. Africa is not my priority, that is Turkey (#WeMissTurkey) and its history. However, I have a small project where I create Listeria lists for the national politicians of African countries.

These lists show how much there is missing. The history of these lists will show the developments of the content. When there are categories for a specific type of politicians, I have included the property "category contains" so they can be queried by bots for additional properties in Wikidata.

The way Listeria lists work means that they can easily be copied to other Wikipedias. They will show the local articles, and the labels will be in the local language. This is a way to draw attention to the #AfricaGap.

Thanks, GerardM (talk) 08:46, 5 May 2018 (UTC)

Just to note, there is Wikidata:WikiProject every politician seeking to improve, at least, politicians ... not sure if you've come across it, GerardM? --Tagishsimon (talk) 14:47, 5 May 2018 (UTC)

Yes there is. This project of mine is a way to show what happens for Africa. It is where we are weakest. Thanks, GerardM (talk) 16:02, 5 May 2018 (UTC)

@GerardM, Tagishsimon: - Creation of items for humans should speed up. WD only has 4.6 million. How about 10 million by the end of the year? 77.180.81.191 17:01, 5 May 2018 (UTC)

I'd be interested to know what fraction of the humans on Wikidata are politicians, sportspeople or actors. These are the occupations where notability basically only requires employment. Ghouston (talk) 01:07, 6 May 2018 (UTC)

The tool Denelezh, even if dedicated to gender gap, provides these metrics, based on the property occupation (P106): among 4,255,732 humans in Wikidata, 655,523 are athletes (athlete (Q2066131)), 400,676 are politicians (politician (Q82955)) and 212,460 are actors (actor (Q33999)). — Envlh (talk) 10:41, 6 May 2018 (UTC)
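To turn the counts quoted above into the fractions asked about, a small calculation is enough. The numbers are the ones given in the comment; the variable and function names are only for this sketch.

```python
# Counts quoted above from Denelezh, based on occupation (P106).
TOTAL_HUMANS = 4_255_732
COUNTS = {
    "athlete (Q2066131)": 655_523,
    "politician (Q82955)": 400_676,
    "actor (Q33999)": 212_460,
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count in COUNTS.items():
    print(f"{label}: {share(count, TOTAL_HUMANS)}%")
```

An item can carry several occupation values, so the three shares may overlap and should not simply be added up.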

That's an interesting table, thanks. Ghouston (talk) 11:33, 6 May 2018 (UTC)

The tool requires a minimum number of people to even recognise the people from a country. For many current African countries they do not reach the threshold. Thanks, GerardM (talk) 12:00, 6 May 2018 (UTC)

As I mentioned here and on my blog, the Ottoman empire / Turkish history is what I concentrate on. It should be relatively easy to import all national politicians.. <grin> you could say to one of them, you are not notable when you are not on Wikidata </grin> GerardM (talk) 17:17, 5 May 2018 (UTC)

number of item mergers

I just checked and saw I did around 200 item mergers until now (mainly wiki articles in different languages, some categories etc.). How many items have been merged so far on Wikidata, and do we have any assessment of how many more mergeable items we are missing? DGtal (talk) 07:48, 6 May 2018 (UTC)

We have around 1.78M redirected items as of today [9], which to my knowledge corresponds to the number of mergers we’ve done. No idea how to estimate the amount of duplicates that require a merge, to be honest. —MisterSynergy (talk) 08:14, 6 May 2018 (UTC)

Do we know what caused most of the duplicates? DGtal (talk) 09:50, 6 May 2018 (UTC)

SELECT ?value (COUNT(?item) AS ?count)
WHERE {
  ?item owl:sameAs ?tgt .
  ?tgt wdt:P31 ?value .
}
GROUP BY ?value
ORDER BY DESC(?count)

According to this query, the most common redirects involve categories, people, taxon and disambiguation pages. Assuming that redirects correspond with mergers, then these are the most common causes for duplicates. --Shinnin (talk) 10:36, 6 May 2018 (UTC)

A weird deletion request on Wikidata:Wikidata_in_Wikimedia_projects

Hi all

I just looked at Wikidata:Wikidata_in_Wikimedia_projects and there is a suggestion that it should be deleted because of a translation issue, can someone who knows about those things take a look?

Thanks

--John Cummings (talk) 14:58, 6 May 2018 (UTC)

@SLV100: if we deleted everything that isn't finished, I'm not sure if much would remain
--- Jura 15:57, 6 May 2018 (UTC)

Obsolete language code for Belarusian (Taraškievica) edition of Wikipedia

Hello, could you please help me to change in Wikidata the language code for Belarusian (Taraškievica) edition of Wikipedia from obsolete be-x-old to current be-tarask? --Kazimier Lachnovič (talk) 15:44, 6 May 2018 (UTC)

Do you mean here? It isn't possible for me either (and I think it hasn't been since the site was moved). You can ask at WD:DEV about the plans. Matěj Suchánek (talk) 16:56, 6 May 2018 (UTC)

Merge 2x Peer Anton von Saß = Peer Anton von Sass

Q52152855 = Q52693212 92.229.132.140 17:52, 6 May 2018 (UTC)

Merged --Pasleim (talk) 18:19, 6 May 2018 (UTC)

Multi-lingual project chat links broken

The links in the panel at the top of this page, to project chat in other languages, are not clickable. Can someone fix this, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:47, 12 May 2018 (UTC)

Done There is an issue with the use of <bdi> in link text that causes said bdi tag to be rendered outside the link (the link is still there on the page, but there is no text inside said <a> element).

@Pigsonthewing: Mahir256 (talk) 16:21, 12 May 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:55, 12 May 2018 (UTC)

Award received

award received (P166) now should also have the inverse statement winner. As winner has Wikidata property related to sports events (Q28106586), how can this be corrected? I am bringing this matter up here because this will have influence on many items. Pmt (talk) 14:45, 1 May 2018 (UTC)

I think P1346 on winner was incorrect. Awards don't use P1346 in general.
--- Jura 15:53, 1 May 2018 (UTC)

generally not. Just imagine trying to add all people who were awarded Legion of Honour (Q163700)! --Hsarrazin (talk) 06:47, 7 May 2018 (UTC)

Showcasing workflows on video

Since everyone has different ways of working on Wikidata, I was thinking that it would be cool to record some videos of editors showing how they collect data, use various tools, or just edit some items. Would anyone volunteer to be interviewed via Hangouts to show their secret sauce? It doesn't need to be long/short, whatever you feel like sharing is more than enough. Just an experiment :) --Micru (talk) 20:57, 1 May 2018 (UTC)

At WikidataCon, there were a number of folks who sat down and documented their Wikidata workflow for Jan Dittrich (WMDE). Some of the results can be seen in the Commons gallery here: commons:Category:Boards_of_WikidataCon_2017. That said, I'd be happy to try out a video recorded session explaining some of the workflows I use. -- Fuzheado (talk) 21:03, 2 May 2018 (UTC)

@Fuzheado, Micru: I would be happy to take part in such an activity; working with languages I don't speak, I have developed many habits that have made handling items in said languages much easier. Mahir256 (talk) 04:47, 7 May 2018 (UTC)

How to find Wikidata entries by GPS position?

Specifically, can I open a map and zoom into a certain place and get all Wikidata items displayed that have a proper GPS coordinate?

This would be very useful to identify WD and WP items, because often the name search shows too many items or it does not give any results because of language barriers.

Cheers Ceever (talk) 13:06, 3 May 2018 (UTC)

Hello, you can try WikiShootMe. It's also connected to Commons, the blue dots you see are the pictures on Commons that have localisation but are not connected to a Wikidata item. Lea Lacroix (WMDE) (talk) 13:59, 3 May 2018 (UTC)

I think this highlights a missing aspect of Search on wikidata - our search seems to be pretty much a generic wikimedia search (though with some Wikidata-specific enhancements) based on labels and descriptions. What would be *REALLY* useful would be to add a property search where search behavior was modulated by datatype: for geographic coordinates look for things "close" to a specific point; for dates look in a time range; for external id's look for either exact matches or partial (exact) matches; for quantities search by range with unit conversion; for URL's we have the external links search (though that could be better too) etc. Do we have any phabricator tickets looking into making this easier? Yes you can do all the above with SPARQL but... ArthurPSmith (talk) 18:00, 3 May 2018 (UTC)

For coordinates, we do have these kind of maps (here: items about things around Berlin + 20 km radius). Technically the map could show all coordinates we have in Wikidata, but the Query Service might time out, and the client machine will probably not be able to render it properly due to the large amount of data. —MisterSynergy (talk) 05:51, 4 May 2018 (UTC)
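For the radius search described above (items around Berlin within 20 km), the query text can be generated from a centre point. The sketch below only builds the query string and relies on the Query Service's `wikibase:around` service with coordinate location (P625); the function name, the default limit and the approximate Berlin coordinates are assumptions made for the example.

```python
def build_around_query(lat: float, lon: float, radius_km: float,
                       limit: int = 1000) -> str:
    """Build a SPARQL query for items with coordinates near a given point."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude or longitude out of range")
    lines = [
        "SELECT ?item ?itemLabel ?coord WHERE {",
        "  SERVICE wikibase:around {",
        "    ?item wdt:P625 ?coord .",
        # WKT points are written as longitude first, then latitude.
        f'    bd:serviceParam wikibase:center "Point({lon} {lat})"^^geo:wktLiteral .',
        f'    bd:serviceParam wikibase:radius "{radius_km}" .',
        "  }",
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }',
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


# Approximate centre of Berlin, 20 km radius.
print(build_around_query(52.52, 13.405, 20))
```

Keeping a LIMIT and a modest radius is one way to stay clear of the timeout and rendering problems mentioned above.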

And, don't forget, the search doesn't even give results for simple text information stored in text fields other than labels and description (e. g. postal address, birth name etc.). That's really very annoying. --Anvilaquarius (talk) 10:19, 7 May 2018 (UTC)

Some questions about developer (P178)

Since it is a subproperty of creator (P170), developer (P178) should be used only for the software creator, not for the actual developer.

Shouldn't it be renamed "software creator" in order to avoid confusion?
Shouldn't it have an inverse property to indicate what software has been created by a particular person or organization?
How to indicate the actual developer if it is different from the software creator? With maintained by (P126)?--Malore (talk) 23:58, 3 May 2018 (UTC)

I'm not sure what you mean by "only for the software creator, not for the actual developer". Isn't a developer a creator? Is it even supposed to be used for software? It seems like programmer (P943) would be more specific. Ghouston (talk) 00:04, 4 May 2018 (UTC)

@Ghouston: You're right, programmer (P943) is more appropriate. However, both are ambiguous because:

developer is commonly associated with software: out of 29000 items using developer (P178) (Wikidata Query results), 24000 are instances/subclasses of software (Wikidata Query results);
developer and programmer are commonly used to indicate someone who writes software, not only the creator. This leads to misuse of the properties (for example, the value of the developer property of the Lastpass item is LogMeIn, though it's only the current maintainer).--Malore (talk) 22:21, 4 May 2018 (UTC)

You are suggesting that the "creator" is the person who creates the first version. I'm not sure if that's the right interpretation. If person 1 creates version 1 and person 2 modifies it to create version 2, wouldn't we say that version 2 was jointly created by person 1 and 2? developer (P178) is marked as a subproperty of creator (P170), which would mean that every developer is also a creator. Ghouston (talk) 07:30, 5 May 2018 (UTC)

@Ghouston: The creator is the person who creates the first version. The person who creates version 2 is the creator of version 2. Furthermore, IMO a programmer can contribute code without creating any version. Finally, I noted that subject items of "developer" are "software developer" and "software house", so what is the difference between "developer" and "programmer"?--Malore (talk) 14:21, 6 May 2018 (UTC)

Normally I guess there will be a single Wikidata item for software with multiple versions, so it will often have multiple developers aka creators. There are developers of things other than software, like real estate projects, but developer (P178) claims to have subject of software developer (Q183888) and software company (Q1058914), so I suppose real estate developers aren't in scope. I'm not sure if the difference between a "programmer" and a "software developer" is anything other than branding, like the way organizations have come up with numerous job titles for the field over the years. enwiki currently has separate articles for the two, but they are marked as merge candidates. Ghouston (talk) 07:33, 7 May 2018 (UTC)

Perhaps a more useful distinction would be between a software publisher (which may be an organization or a person) and a software developer / programmer as a person. Ghouston (talk) 07:39, 7 May 2018 (UTC)

@Ghouston: I think you're right about the distinction. However, I still think that developer and programmer are ambiguous and that the developer of a specific version can't be considered the creator of the software.--Malore (talk) 12:03, 7 May 2018 (UTC)

Modifiers for "Material used" and similar concepts

In my work with clothing and textiles, I frequently come across sources that say a certain garment or fabric was "traditionally" or "originally" made of wool or silk or linen, but now is also made of other fibers, synthetics, or blends. I'd like a range of qualifiers to record this information that includes "originally", "traditionally", "usually", "often", "occasionally", "sometimes", "latterly", and perhaps "nowadays" (this is similar to refine date (P4241), but for concepts other than dates). I assume there are other domains than textiles that could use a set of modifiers like this (materials used for sports equipment come immediately to mind). Does this seem like a useful set of qualifiers, and what might we call it? - PKM (talk) 21:41, 5 May 2018 (UTC)

Refine use? Qualify use? Refine use or application? Seems self-evidently useful, if there's not already a qualifier that'll serve. Support. --Tagishsimon (talk) 21:48, 5 May 2018 (UTC)

Could nature of statement (P5102) be an option? - Valentina.Anitnelav (talk) 08:25, 6 May 2018 (UTC)
- Thanks, I didn't know about nature of statement (P5102). I think it would work - "often" and "rarely" are certainly on my list, and I could add "originally" (AKA traditionally) to the one-of constraint list and I'd be good. I think "nature of statement" is a confusing label - perhaps "refine statement" would be clearer? - PKM (talk) 18:34, 6 May 2018 (UTC)
  - I agree that the label could be improved (I would lean towards "modality of statement"). This should probably be discussed at its talk page. - Valentina.Anitnelav (talk) 07:27, 7 May 2018 (UTC)

Leeuwarden's 225 names

Those of you interested in the names of places, or in Wikidata's use of aliases, may be interested in this article on Leeuwarden's 225 names. How should we reflect them, in Wikidata? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:52, 6 May 2018 (UTC)

Use 225 times native label (P1705) or official name (P1448) with qualifiers start time (P580) and end time (P582). --Pasleim (talk) 08:33, 7 May 2018 (UTC)

Software blocks fixing - Human item - Latvian name in label copied to several other language-specific labels

I tried to fix, but software blocks me:

Could not save due to an error.
The save has failed.
As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes.

Fixed some on each of the following items, but not all:

Wikidata weekly summary #311

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- New request for comments:
  - Sort identifier statements on items that are instances of human
  - Make "developer" and "programmer" properties clearer

Events
- Wikidata and GLAM workshop day in the context of the EuropeanaTech Conference, Rotterdam, Monday 14 May 2018
- Wikidata access methods, slides by Dan Scott

Other Noteworthy Stuff
- More than 850 living people's articles from the English Wikipedia which have date of death or place of death on their Wikidata item: manual checks needed. You can also check Category:P570 missing in Wikipedia
- Florian will be improving Wikidata support in the Wikipedia plugin for OpenStreetMap's JOSM editor, for the Google Summer of Code 2018
- Prssanna Desai will work on improvements for the Query Service during Google Summer of Code

Did you know?
- Newest properties:
  - General datatypes: Deutsche Bahn station category, has grammatical gender, has grammatical person, Wikimedia outline, assistant director, island of location, possible medical findings, suggests the existence of, has evaluation, evaluation of, greater than, less than
  - External identifiers: Dictionary of Algorithms and Data Structures ID, Behind The Voice Actors character ID, HanCinema film ID, Italian School ID, Directory of Open Access Journals ID, LGDB game ID, LGDB emulator ID, LGDB tool ID, LGDB engine ID, TFRRS athlete ID, All About Jazz musician ID, Ontario public library ID, Swedish Literature Bank book ID, WikiCFP event ID, WikiCFP conference series ID, ICAA film catalogue ID, Stepwell Atlas ID
- New property proposals to review:
  - General datatypes: IMDA rating, is program committee member of, officialized by, KAVI rating, topographic map, child monotypic taxon, Köppens klimaklassifisering, Möllendorff transliteration, attested, geographic center, season of club or team, sports competition competed at, factorizsation, coastline, forest cover, vehicles per capita, audio transcription
  - External identifiers: e-MEC entry, Norwegian war sailor register ship ID, Thesaurus For Graphic Materials, GNOME Wiki ID, Israel Film Fund ID, The New Fund for Cinema and Television (Israel) ID, Cinema Project ID, Chinese Political Elites Database ID, Israeli Movie Testimonial Database Person ID, Israeli Movie Testimonial Database Movie ID, JMA Seismic Intensity Database ID, Portale della Canzone italiana IDs, Trustpilot company ID, Ester ID, Dictionary of Swedish Translators
- Properties lacking use created > 3 months:
  - Sign@l journal ID (P4726), INRAN Italian Food ID (P4729), produced sound (P4733), IBM code page ID (P4734), IBM graphic character global ID (P4736), Line Music album ID (P4748), Line Music artist ID (P4747), is proceedings from (P4745), National Historic Ships certificate no. (P4750), Manus Online ID (P4752)
- Deleted properties: Réunion des musées nationaux ID (P5100)
- Query examples:

Development
- Allow continuing Wikidata entity dumps (phab:T193688)
- Make sure Wikidata entity dump scripts run for a short amount of time (phab:T190513)
- Add monolingual language code shy (phab:T184783)
- More work on preparing QuickStatements to be run by other parties (phab:T193606, phab:T192079, phab:T192365)
- Expand references with constraint violations (phab:T177970)
- Improving validation of Form representations on WikibaseLexeme (phab:T193011)
- Fixing a bug that replaces a representation by another (phab:T192264)
- Enable finding forms using wbsearchentities API (phab:T191981)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 14:11, 7 May 2018 (UTC)

AdvancedSearch

From May 8, AdvancedSearch will be available as a beta feature in your wiki. The feature enhances the search page through an advanced parameters form and aims to make existing search options more visible and accessible for everyone. AdvancedSearch is a project by WMDE Technical Wishes. Everyone is invited to test the feature and we hope that it will serve you well in your work!

Birgit Müller (WMDE) 14:53, 7 May 2018 (UTC)

Dataset in JSON

Hello, I am working on an English-Sanskrit translator based on deep learning and need a dataset for that. How can I download the Wikidata dataset of Sanskrit words and their English counterparts in JSON format? Capankajsmilyo (talk) 14:18, 7 May 2018 (UTC)

check out this query. The results can be downloaded in JSON. It currently only returns 1000 results but you might be able to optimize it to get more results. --Pasleim (talk) 11:37, 8 May 2018 (UTC)
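Once the query results have been downloaded as JSON, they still have to be flattened into word pairs. The sketch below assumes the standard SPARQL JSON results layout and two hypothetical variable names, `sa` and `en`; the actual query linked above may use different names, and the sample data is made up.

```python
import json
from typing import List, Tuple


def extract_pairs(sparql_json: str, source_var: str = "sa",
                  target_var: str = "en") -> List[Tuple[str, str]]:
    """Extract (source, target) label pairs from SPARQL JSON results."""
    data = json.loads(sparql_json)
    pairs = []
    for row in data["results"]["bindings"]:
        # Skip rows where one of the two labels is missing.
        if source_var in row and target_var in row:
            pairs.append((row[source_var]["value"], row[target_var]["value"]))
    return pairs


# Made-up sample in the SPARQL JSON results format, for illustration only.
SAMPLE = json.dumps({
    "head": {"vars": ["sa", "en"]},
    "results": {"bindings": [
        {"sa": {"type": "literal", "xml:lang": "sa", "value": "जल"},
         "en": {"type": "literal", "xml:lang": "en", "value": "water"}},
        {"sa": {"type": "literal", "xml:lang": "sa", "value": "जल"}},
    ]},
})

print(extract_pairs(SAMPLE))
```

Note that labels are names of concepts rather than dictionary entries, so the pairs may need filtering before they are used as training data.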

Draft for the RDF mapping of Wikibase Lexeme

Hello all,

One of the things we're expecting a lot about lexicographical data on Wikidata is the ability to run queries. As previously announced, this will not be available for the first release on May 23rd, but you can already add some ideas of queries.

One of the steps to move forward with the ability to query the data is to have an RDF mapping ready. This task has been started by Tpt (thanks!) who created a draft for RDF mapping of Wikibase Lexeme. If you have knowledge on the topic, feel free to have a look and leave comments directly on the talk page.

Cheers, Lea Lacroix (WMDE) (talk) 12:56, 8 May 2018 (UTC)

Concepts used in a process

I have a reliable source that says fulling (Q1585730) of cloth uses "moisture, heat, pressure, and friction". I have added <uses> moisture (Q217651), heat (Q44432), pressure (Q39552) and friction (Q82580), but the constraints suggest that each of these concepts should have the inverse statement <used by>. Do we really want heat (Q44432) to have a list of every process in every discipline that uses heat? Please advise before I head down that path. - PKM (talk) 18:58, 30 April 2018 (UTC)

I don't think the inverse constraint makes sense in this case. ChristianKl ❪✉❫ 14:19, 4 May 2018 (UTC)

Agreed, they can be considered inverses, but it should not be enforced by constraint. ArthurPSmith (talk) 14:49, 4 May 2018 (UTC)

Agree to removal of the inverse constraint (Q21510855) from uses (P2283) which was added by Laddo with this edit! I've also put a note at the talk page. --Marsupium (talk) 21:07, 8 May 2018 (UTC)

Indeed, the inverse constraint is not appropriate. LaddΩ chat ;) 23:05, 8 May 2018 (UTC)

Project to map the open movement on Wikidata

Hi all

I'm part of the Mozilla Open Leaders program this year and for the course I'm running a project this week to try and improve coverage of the open movement on Wikidata. I'm hoping that we can capture a lot of knowledge from different parts of the open movement including open data, open source software, open hardware, open science, OERs etc. This will both improve Wikidata and hopefully make Wikidata a more useful tool to map these different communities to help them understand and work together better. We may even get a few new people editing Wikidata.

The main thing I'd like help with is to share this tweet about the project to get contributions from a wide number of people. You are also very welcome to take part yourselves.

Whilst this is a short term project I've created Wikidata:WikiProject Open and User:I JethroBT (WMF), User:NavinoEvans and myself have created a much improved version of the Wikidata:Dataset Imports to help this work happen over the longer term.

Thanks

--John Cummings (talk) 20:56, 8 May 2018 (UTC)

Inconsistency between redlists

There’s a question for query wizards over at en.wiki at WikiProject Women in Red. Can anyone shed some light there? Thanks. NotARabbit (talk) 02:40, 9 May 2018 (UTC)

@NotARabbit: Problem identified & sorted - I left a note in the WiR thread. --Tagishsimon (talk) 03:42, 9 May 2018 (UTC)

@Tagishsimon: Thank you! NotARabbit (talk) 03:49, 9 May 2018 (UTC)

geni.com - read data from Wikidata

What tool can I use in which way, to read the Item name from Wikidata and fill it into this table? Second question, how to add there new data in the Items? Thank you very much! Regards, Conny (talk) 14:16, 6 May 2018 (UTC).

If there is more than one entry, it should be marked and manually updated. Conny (talk) 14:59, 6 May 2018 (UTC).

@Conny: You can get the query service to give you wikidata items matching the P2600 values - example report for 5 of the values - and then merge them into the table using a spreadsheet, for instance. Not sure what you mean by "how to add there new data in the Items?". I note there is no context for the table on the P2600 talkpage, so I'm not quite sure what the mission / problem is. (Although Quickstatements is probably the answer for adding data to wikidata once you have the QId.) --Tagishsimon (talk) 15:20, 6 May 2018 (UTC)
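The lookup suggested above, plus the earlier request to mark IDs that match more than one item, could be sketched as follows. The query matches a list of P2600 values with a VALUES clause; the function names and the sample IDs are hypothetical, and running the query against the query service is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def quote_literal(value: str) -> str:
    """Quote a string as a SPARQL literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_lookup_query(geni_ids: Iterable[str]) -> str:
    """Build a SPARQL query returning the items that carry the given P2600 values."""
    values = " ".join(quote_literal(g) for g in geni_ids)
    return "\n".join([
        "SELECT ?geni ?item WHERE {",
        "  VALUES ?geni { " + values + " }",
        "  ?item wdt:P2600 ?geni .",
        "}",
    ])


def split_matches(rows: Iterable[Tuple[str, str]]
                  ) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Split (geni_id, qid) rows into unique matches and ambiguous ones."""
    by_id = defaultdict(set)
    for geni_id, qid in rows:
        by_id[geni_id].add(qid)
    unique = {g: next(iter(q)) for g, q in by_id.items() if len(q) == 1}
    ambiguous = {g: sorted(q) for g, q in by_id.items() if len(q) > 1}
    return unique, ambiguous


# Hypothetical IDs, for illustration only.
print(build_lookup_query(["1000001", "1000002"]))
print(split_matches([("1000001", "Q1"), ("1000002", "Q2"), ("1000002", "Q3")]))
```

The ambiguous group is the part that would need the manual check before anything is merged into the table.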

@Tagishsimon: Oh great, thank you. Quickstatements seems my answer to add claims to the Items. Happy now, Conny (talk) 15:36, 6 May 2018 (UTC).

@Conny: good; ping me if you need a hand with anything. --Tagishsimon (talk) 15:40, 6 May 2018 (UTC)

@Tagishsimon, Conny: Many easy to find errors in the list you prepared and so it is questionable how it was created. See my edits, more transparent. But that an ID is found on an article about a person does not mean it is the ID of the person described in the article. A possible next step could be to remove lines if the ID is already in Wikidata for the article in question. 92.229.132.140 19:20, 6 May 2018 (UTC)

Thanks for your work. I did some cleaning (discpages and metapages), if you think it is important - go on :) . Regards, Conny (talk) 19:28, 6 May 2018 (UTC).

@Conny: - thanks for YOUR work, even if I was not satisfied with the result :-) Look for Kaufhaus in your list - this is not a person. A Mix'n'match catalog would help, maybe loading all IDs found in Wikimedia projects. @Tagishsimon:, could you create one? enwiki also has more than 1000 IDs https://en.wikipedia.org/w/index.php?title=Special:LinkSearch&limit=5000&offset=0&target=http%3A%2F%2F%2A.geni.com . User:Edgars2007 did some querying of the WMF servers for BBLd https://quarry.wmflabs.org/query/7718, maybe this can be done for geni.com too. 78.55.177.69 11:33, 7 May 2018 (UTC)

There is one. Can later (probably not today) do a scan for all Wikipedias. If you want me to include only those items, that haven't got Geni already, say so. --Edgars2007 (talk) 12:01, 7 May 2018 (UTC)

Edgars2007, could you collect just the IDs? On which page they appear is less relevant, since a geni.com person link can exist on wiki pages not about a single person and on pages of persons related to that person. For dewiki an SQL query is not that relevant anymore, they deleted hundreds of geni links already. Can different wikis be combined in Quarry, e.g. enwiki, etwiki, lvwiki, ltwiki, ruwiki, plwiki? 78.55.254.253 13:25, 9 May 2018 (UTC)

Was it a good idea to create items for these authors?

I'm interested in creating Wikidata items related to scientific publications in paleontology. Today I made items for each of the authors of a paper that I did not realize had a pre-existing item. This item used some kind of "string" format rather than referring to items for the authors. Was it a bad idea for me to have created those items for the authors or should the current item for the paper be reformatted to refer to them for authorship data? I'm not very experienced here on Wikidata and I was hoping you could offer some guidance on how to handle situations like this and about handling authorship more generally. Abyssal (talk) 01:47, 8 May 2018 (UTC)

Items like Maria Luísa Morais (Q52782144) are likely to be nominated for deletion for not meeting Wikidata:Notability. Basically they need either a sitelink or "serious and publicly available references". Ghouston (talk) 02:25, 8 May 2018 (UTC)

I'm not sure if it's possible to save that item or not. They have 33 publications listed at [10], but I can't see any online information that could be put into the Wikidata item. If their CV was online somewhere it would give some information, but I'm not sure if a CV counts as a serious reference. Ghouston (talk) 02:42, 8 May 2018 (UTC)

@Ghouston : The paper referenced in the OP described the genus and species Cardiocorax mukulu. Is there any reason why its authors couldn't be given Wikispecies pages and the items kept on that ground? Abyssal (talk) 02:47, 8 May 2018 (UTC)

Yes, if they are in scope for Wikispecies they can have a sitelink, and that's good enough for Wikidata. Ghouston (talk) 02:49, 8 May 2018 (UTC)

@Ghouston : Now that we've confirmed that these author items deserve to exist, how do we use them in the item on the paper itself? Abyssal (talk) 03:37, 8 May 2018 (UTC)

Create author (P50) statements and then delete the author name string (P2093) statements. Ghouston (talk) 03:46, 8 May 2018 (UTC)
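For a paper with many authors, the two steps above could be prepared as a QuickStatements batch. The sketch assumes QuickStatements version 1 syntax as I understand it (tab-separated item, property and value, strings in double quotes, a leading "-" to remove a statement); the function name is made up, and the name string used in the example is hypothetical and would have to match the existing P2093 value exactly.

```python
from typing import List


def author_replacement_commands(paper_qid: str, author_qid: str,
                                name_string: str) -> List[str]:
    """Return two QuickStatements V1 lines: add author (P50), remove the name string (P2093)."""
    if '"' in name_string or "\t" in name_string:
        raise ValueError("name_string must not contain quotes or tabs")
    add = "\t".join([paper_qid, "P50", author_qid])
    # Assumption: a leading "-" asks QuickStatements to remove the matching statement.
    remove = "-" + "\t".join([paper_qid, "P2093", f'"{name_string}"'])
    return [add, remove]


# Item IDs taken from this thread; the name string itself is a guess.
for line in author_replacement_commands("Q29037679", "Q52782144",
                                        "Maria Luísa Morais"):
    print(line)
```

It is probably safer to run the additions first and check them before running the removals.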

@Abyssal: If the authors have ORCID iDs, and their papers DOIs, and the latter are listed on the former's ORCID record(s), then ORCIDator will do that for you. See also m:WikiCite. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:06, 8 May 2018 (UTC)

@Abyssal: I believe we have generally concluded that authors of scientific articles, if they can be identified as individuals in some fashion (for example with an ORCID identifier, or a sufficiently unique name and affiliation in their published works) do meet wikidata's notability criteria via the structural need purpose (to interlink their authorship records) and you don't need further evidence of notability here. @Daniel Mietchen, Fnielsen, Egon Willighagen: among others have done some work on changing author name strings to author items in our data. ArthurPSmith (talk) 13:45, 8 May 2018 (UTC)

ORCID is as trustable as IMDb as people can create their IDs themselves. I highly doubt that only some ORCID ID is enough to be notable. Sjoerd de Bruin (talk) 14:30, 8 May 2018 (UTC)

And I didn't say it was. ORCID combined with authorship of a paper with a wikidata item, however, is sufficient to identify a person uniquely and create an item for them. ArthurPSmith (talk) 17:48, 8 May 2018 (UTC)

The paper is cited to support a claim in Cardiocorax mukulu (Q20722001). This creates a structural need to create items for A new elasmosaurid from the early Maastrichtian of Angola and the implications of girdle morphology on swimming style in plesiosaurs (Q29037679), its authors, and its publisher. Jc3s5h (talk) 16:50, 8 May 2018 (UTC)

Keep Yes keep making items for authors. Yes add the author property to items with author name strings. This author, Morais, now has more properties. In general I advocate that authors of publications which Wikidata indexes have their own Wikidata items. I think that the quality of ORCID and IMDb and Wikidata content is currently comparable. Misinformation and hoaxes are not common on any of these. Since Wikidata is open, cross checks with other databases will make misinformation more identifiable, but for now I think we should cross import data from databases like this. I wish we could import the entirety of ORCID and IMDb, or at least have these staged for consideration in a Wikibase instance.

A person named as author of any publication with a Wikidata item meets Wikidata:Notability. When any individual is the author of multiple publications in Wikidata then an item for that person becomes very useful. Blue Rasberry (talk) 17:51, 8 May 2018 (UTC)

There's actually no structural need in this case, because author name string (P2093) allows the authors to be named without creating items for them. I had the impression that the articles in Wikidata are generally here because they are used on a Wikimedia project somewhere; or is it possible to create items for articles at random? Ghouston (talk) 03:15, 9 May 2018 (UTC)

I don't think "structural need" should be interpreted to mean "there is no way to represent this information at all unless an item is created". I think, rather, if there is a structural need for the information, and creating an item is the normal and preferred method for representing that type of information, then the item should be created. Also, if an author is cited to support one claim, there is a good chance other works by the same author will be cited to support other claims, and having an item will allow the various works to be linked.

In addition, creating an item for the author is called for by Help:Sources. Jc3s5h (talk) 12:06, 9 May 2018 (UTC)

So, any article can be added provided it's used to source a statement somewhere, and all of its authors can also be created. That brings a lot of scientists, and other writers like journalists, into notability, which doesn't actually bother me. Creating items for the 5154 authors of Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q21558717) would be fun. Ghouston (talk) 12:51, 9 May 2018 (UTC)

If you have enough information to say that multiple papers are written by the same author there's a structural need for the author item as "author name string" doesn't hold the information that multiple papers are written by the same person. ChristianKl ❪✉❫ 13:04, 9 May 2018 (UTC)

Yes, it could be useful in this case, assuming they can be reliably linked as the same person, whereas making one for somebody hypothetically named "S. Brown" who we only know works at CERN, would be basically a waste of an item. Ghouston (talk) 13:17, 9 May 2018 (UTC)

We don't know him or her as just S. Brown of CERN, we know him or her as S. Brown of CERN, the author of Qnnnn. Later, Brown might write another article that explicitly mentions Qnnnn as being his or her work, in which case we would be able to expand on the structure that was started upon citing Qnnnn. Jc3s5h (talk) 16:21, 9 May 2018 (UTC)

Pinging more than 50 participants in a given WikiProject

WikiProject India has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

So WikiProject India had 82 members before I culled from that list those who lacked more than 50 contributions on Wikidata, those who didn't edit (or edited three times or less) in the year 2018, and those with less than 50 edits in 2018 but none in the last month (bringing the number of participants down to 48 members). This filtration made the ping I added above work properly, but it wouldn't have worked before this filtration. Apparently there is a limit on the number of users one can ping in a single edit, about which Zolo informed me on {{Ping project}}'s talk page, which makes it difficult to call on the members of a large WikiProject. Is there any good way to get around this to make {{Ping project}} work in the event a single WikiProject has more than 50 participants? Mahir256 (talk) 02:10, 8 May 2018 (UTC)

I don't think it will be possible to mention more than 50 users in one edit, but you may file a request on phabricator if you have some arguments for raising the limit. Maybe you could just split the list into two lists and modify template:ping project so that you could call the second list, e.g. {{ping project|India|2}} → Wikidata:WikiProject_India/Participants/2. But this would require making two signed edits to call every member of the wikiproject, i.e.:

{{ping project|India}} ~~~~
{{ping project|India|2}} ~~~~

Originally the limit was higher, I think, but the current limit of 50 was introduced to prevent situations where someone accidentally types {{Wikidata:Project Chat}} instead of [[Wikidata:Project Chat]]. Spammers are the second reason: I remember that we had a similar problem with 'thanks' notifications on pl.wiki, where a dynamic-IP spammer was sending a few hundred thanks notifications per user. So if this can be achieved another way, I'd suggest leaving the limit as it is. Wostr (talk) 18:01, 9 May 2018 (UTC)
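The splitting idea can be sketched in Python as follows. This is only an illustration: the function names are invented here, the 82 user names are placeholders, and the second-page template call is the hypothetical form suggested in this thread, not an existing feature of {{Ping project}}.

```python
def split_participants(participants, limit=50):
    """Split a participant list into consecutive pages of at most `limit` names."""
    return [participants[i:i + limit] for i in range(0, len(participants), limit)]


def ping_calls(project, page_count):
    """Build one signed template call per page; page 1 takes no extra argument."""
    calls = []
    for page in range(1, page_count + 1):
        suffix = "" if page == 1 else f"|{page}"
        calls.append("{{ping project|" + project + suffix + "}} ~~~~")
    return calls


members = [f"User{n}" for n in range(1, 83)]  # 82 placeholder names
pages = split_participants(members)
print([len(page) for page in pages])
for call in ping_calls("India", len(pages)):
    print(call)
```

Each generated call would still have to be saved in its own signed edit, since the limit applies per edit.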

Request for mass correction of items with "race horse" in P31

Hello, a fairly large number of items about horses have the value "race horse" set in P31. This is an error: no horse is a "race horse" by nature, only if its owner decides to race it (a "race horse" may in fact never have raced and simply have had a career as a breeding animal). Is it possible to correct this property value to plain "horse" with a bot? And to create a new property, specific to animals, that would be the equivalent of "occupation" (reserved for humans), but for animals? Thanks in advance! --Tsaag Valren (talk) 15:59, 8 May 2018 (UTC)

@Tsaag Valren: Yes, a bot can do the work, or you can do it yourself if you want. You can prepare the data in an Excel sheet following a defined structure and run a script that 1) removes the statement P31: race horse, 2) creates the statement P31: horse and 3) creates a statement occupation: race horse. The tool for this is QuickStatements 2, see Help:QuickStatements/fr. However, before launching a mass modification, you should get a data-modelling concept for horses on Wikidata validated, because if your idea is not clearly presented and accepted, your work can easily be changed again by the same means. That is the risk with WD: automated tools make large-scale changes to the data structure easy.

I saw that you are part of Wikidata:WikiProject Equines, so I suggest you open a section or a "Data modelling" subpage there and prepare a description of the model, clearly separating the model for breeds from the model for horses as individuals.

At first glance, the items about breeds should be classified as subclass of (P279): horse (Q726), and each individual horse should be instance of (P31): a horse breed.

Example: Freiberger (Q673441) should have the statement subclass of (P279): horse (Q726), and Vaillant, a breeding stallion of that breed, should have the statement instance of (P31): Freiberger (Q673441).

The idea of specifying more precisely the different roles an individual horse can have is interesting, and we should discuss extending the property occupation (P106) to categories other than humans. It is enough to modify the property's usage constraints, after an announcement/discussion on the property's talk page.

But to get this kind of change accepted, you need to present a clear concept for the use of the property, that is, which type of items the property will be used on, and list the main values associated with it: horse racing, or even sub-types of racing such as harness racing, hurdle races, ..., breeding, or other occupations such as use in the mounted police (I am sure one can find horses decorated for their service in the police), use as military draught animals, ...

In short, you need to analyse all the possible and known situations for horses and propose a general model able to store this information. Once that model is defined, you can launch all the large-scale modifications you want, justifying them as the result of a decision taken by the Equines project.

That's my comment. Snipre (talk) 01:24, 9 May 2018 (UTC)
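The three-step batch described in this comment could be prepared from a spreadsheet export along these lines. This is a sketch under assumptions: the CSV is assumed to have a `qid` column, the function name is invented, the pipe-separated QuickStatements V1 form with a leading "-" for removals is assumed, and the QID for "race horse" is passed in as a parameter because the thread does not give it. Using P106 on horses would also need the constraint change discussed above.

```python
import csv
import io

HORSE_QID = "Q726"  # horse, as cited in this thread


def reclassify_commands(csv_text, race_horse_qid):
    """Turn a CSV with a `qid` column into QuickStatements V1 lines.

    For each horse: remove P31 = race horse, add P31 = horse,
    and add occupation (P106) = race horse.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        qid = row["qid"].strip()
        lines.append(f"-{qid}|P31|{race_horse_qid}")  # step 1: remove
        lines.append(f"{qid}|P31|{HORSE_QID}")        # step 2: add "horse"
        lines.append(f"{qid}|P106|{race_horse_qid}")  # step 3: add the role
    return lines


# Placeholder data: Q4115189 is the Wikidata Sandbox item, and
# "Q_RACE_HORSE" stands in for the real QID, which must be looked up.
sample = "qid,label\nQ4115189,sandbox\n"
print("\n".join(reclassify_commands(sample, "Q_RACE_HORSE")))
```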

For this kind of change, I use PetScan rather than QuickStatements (but it doesn't matter).

As for the breed, putting it in P31 seems like a bad idea to me (I believe there have already been discussions on the subject, notably about crossbreeds and horses with no clearly defined breed).

And yes, this discussion should be continued on the corresponding project. PS: otherwise there is Wikidata:Bistro for general discussions in French ;)

Regards, VIGNERON (talk) 10:07, 9 May 2018 (UTC)

Thank you for participating in the global Wikimedia survey!

Hello!

I would like to share my deepest gratitude for everyone who responded to the Wikimedia Communities and Contributors Survey. The survey has closed for this year.

The quality of the results has improved because more people responded this year. We are working on analyzing the data already and hope to have something published on meta in a couple months. Be sure to watch Community Engagement Insights for when we publish the reports.

We will also message those individuals who signed up on the Thank you page or sent us an email to receive updates about the report.

Feel free to reach out to me directly at egalvez@wikimedia.org or at my talk page on meta.

Thank you again to everyone for sharing your opinions with us! EGalvez (WMF) (talk) (by way of Johan (WMF) (talk)) 09:04, 9 May 2018 (UTC)

Thank you for spamming my watchlist three times. Sjoerd de Bruin (talk) 09:50, 9 May 2018 (UTC)

Shape files for countries

Hi all

I recently discovered that the Wikidata Query Service could in theory create heatmaps and other kinds of maps using areas (like the one below) for countries. However, this is currently not possible because it does not hold the shape files for the countries. To me this is quite a large missing piece for visualisations. Does anyone know where these could be obtained from and what the process would be to import them?

Thanks

--John Cummings (talk) 15:33, 7 May 2018 (UTC)

I want this too. At the bottom of this page there is a map where the user can click a country and get the right website for that country. There are 1000s of cases where Wikidata could be doing this sort of thing with a world map. I have had people ask me about this. Blue Rasberry (talk) 15:51, 7 May 2018 (UTC)

As I understand it, the big problem is licensing. Shapefiles can now be stored on Commons (in the Data: namespace, with extension .map) -- however, WMF are insisting that only CC0 data is permitted within this namespace.

Very many sources of shapefiles (eg OpenStreetMap, UK Government) require, at the very least, attribution of authorship -- and are therefore excluded from use.

This appears to be an ideological position -- there doesn't seem to be any great technical issue involved, apart from making sure uses (eg WDQS) display attribution and licensing texts. Jheald (talk) 16:29, 7 May 2018 (UTC)

Thanks @Jheald:, can you point to where WMF say they require the shape files to be CC0? @Bluerasberry: I wonder if the US Federal agency produce these files? They would be CC0, any ideas? Maybe NOAA or NASA? --John Cummings (talk) 17:00, 7 May 2018 (UTC)

See mw:Help:Map Data, talk page, and linked phabricator thread. Jheald (talk) 17:08, 7 May 2018 (UTC)

... which leads to phab:T178210, which contains what seems to be WMF's lead statement on this so far (requested in connection with c:Commons:Deletion_requests/Data_talk:Kuala_Lumpur_Districts.map):

"Will the tabular and map data features support non-CC0 datasets?"

Currently, the tabular and map data features require a license field, that supports SPDX codes to identify the dataset's license. The feature currently supports CC0. In the future, it may support additional Free Licenses, including CC BY-SA or ODbL.

Before additional licenses can be allowed, the Wikimedia projects should (1) support attribution and other obligations contained in the license (such as when displayed in the Graph extension and other consumers of tabular and map data), and (2) provide users with appropriate community guidelines on what material and license is acceptable. This support may require additional feature development that is not currently planned, but open for future open source contributions.

Personally, I'd like to see Commons take a lead on this, and tell WMF that the community *will* accept map files under open licenses, with workarounds to get round the current technical limitations (eg comment lines in the file saying the stated licence is incorrect, the true licence is ...), and tell the WMF that since Commons *will* be accepting these files, it is therefore now a technical priority to make sure open licences that are not CC0 are accurately presented.

But I don't know whether Commons has the chutzpah and self-confidence in itself as a community to go for a stand like that, to force the point. Jheald (talk) 17:48, 7 May 2018 (UTC)

For reference c:Commons:Village_pump/Proposals/Archive/2017/10#Proposal_to_include_non-CC0_licenses_for_the_Data_namespace is probably the most extensive discussion of this so far on Commons, to date. Jheald (talk) 21:52, 7 May 2018 (UTC)

Please only CC0. Wikidata is CC0, and using simply shapefiles, especially for countries on a world map, should not be bundled with other licences. KISS. 78.55.177.69 17:53, 7 May 2018 (UTC)

Why not? I presume you wouldn't object to an SVG file stored on Commons being used as a backdrop, or returned as one of the images by a WDQS query. So why object to Commons providing the same information in a shapefile format? Jheald (talk) 18:15, 7 May 2018 (UTC)

OK, so the question seems to be where can we find CC0/public domain shape files for countries? (I also asked here) --John Cummings (talk) 18:10, 7 May 2018 (UTC)

@Jheald: @Bluerasberry:, I think I found some :) Natural Earth Data, it clearly states PD/CC0 on its terms of use page. Does anyone know anything about maps who would be able to see if they are importable and would be able to upload some? I'm very happy to do grunt work but no idea how to do it. --John Cummings (talk) 18:24, 7 May 2018 (UTC)

Hi, I am a contributor/member of the "NaturalEarth-Wikidata concordances" project. The next version, Natural Earth v4.1, is very close to release and will contain the "wikidataid" for a lot of tables (like admin_0_countries, rivers, lakes, airports). Current status: https://github.com/nvkelso/natural-earth-vector/pull/249 ; https://github.com/nvkelso/natural-earth-vector/issues/224 ; for example, my matching sheets: naturalearth-wikidata-20180208-admin_0_countries --ImreSamu (talk) 09:37, 8 May 2018 (UTC)

@ImreSamu:, fantastic news, if there's anything people can do to help please let us know :). --John Cummings (talk) 09:40, 8 May 2018 (UTC)

I am not sure if this is the best way. We have the Kartographer extension, which uses geoshapes straight from OSM (see it in use for example in cawiki templates example or commons template infobox wikidata example). There is no need to upload shapes to Commons (the current implementation is very clumsy: impossible to use the upload form, no categorization, very slow (see my example), etc.); just mark the node/way/relation with a wikidata tag (currently 527 K nodes, 180 K ways and 282 K relations are tagged). What about WD+OSM federated queries? --Jklamo (talk) 09:56, 8 May 2018 (UTC)

@Jklamo: both are useful in different ways I think, eg your suggestion would not work for areas that do not exist in OSM like historic regions, distribution areas for species or any other non geographical or political area. Your suggestion would be great to have as an option, do you know how we get from where we are now to this being a functional option which works inside the Wikidata Query Service? Thanks John Cummings (talk) 12:34, 9 May 2018 (UTC)

@Sic19: You have made shapes at User:Sic19#GeoShapes. Do you have any guidance for this conversation? Is there documentation published somewhere? Blue Rasberry (talk) 14:25, 9 May 2018 (UTC)

The most useful documentation I've found is the Kartographer extension page that Jklamo referred to above, which covers the creation of maps using Commons map data and external OSM data. Very basic licensing information is given on Help:Map Data. I've experimented with both geoshapes stored on Commons and those imported from OSM - there are benefits and drawbacks to both approaches - and I would suggest gaining familiarity with the datatype is necessary before planning to use it at scale. For example, on my userpage there are two very similar looking examples of the National Library of Wales collections - on the left is from Commons and the right is OSM data - I like the option to combine geoshapes on Commons to show the Library's galleries but being able to link from the OSM data to a SPARQL query is nice too. Shame that they can't be combined though. Commons geoshape data is held back by licensing and OSM by the relative lack of Wikidata tagged objects. There is loads of potential to do interesting work with this datatype and it will improve with time I suspect.

I've previously shared a few thoughts about the licensing situation on commons:Data talk:Chepstow Castle.map. One last thing, it is quite easy to produce your own geoshapes at geojson.io and then copy the data to Commons. Simon Cobb (Sic19 ; talk page) 18:27, 9 May 2018 (UTC)

@John Cummings: A world political boundaries shapefile dataset is available on a cc-0 license here: http://dx.doi.org/10.7488/ds/1789 and can be viewed/converted to geoJSON at mapshaper.org. It looks OK and we can put it on Commons - let me know if you need any help. Simon Cobb (Sic19 ; talk page) 18:59, 9 May 2018 (UTC)

@Sic19:, thanks, I notice there's also a more detailed repo at http://hdl.handle.net/10672/124, however I can't see any license information. I have no idea how to upload the shape files, if someone can tell me how to do it I'm happy to create a page with a to do list on and make a start. Thanks, --John Cummings (talk) 14:18, 10 May 2018 (UTC)

It's also cc-0 and really nice quality. The file size is a problem - Commons geoshapes are limited to just over 2,000kb and quite a few individual countries in the high quality dataset are well over that limit. I don't think you'd gain much from the extra detail if the primary use is visualising data at country level - I've made an example in my Sandbox of the Central African Republic from both datasets (red border = high quality; yellow = low) - it is only when you zoom in that the difference becomes obvious. Simon Cobb (Sic19 ; talk page) 18:45, 10 May 2018 (UTC)
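To see in advance whether a country's GeoJSON would fit, one could measure its serialized size before uploading. The Python sketch below is illustrative only: the limit constant is an assumption derived from the "just over 2,000kb" figure in this comment (the exact Commons limit is not stated here), and the function names are invented for the example.

```python
import json

# Assumption: taken from "just over 2,000kb" above, not an official value.
ASSUMED_LIMIT_BYTES = 2_000 * 1024


def geojson_size_bytes(geojson):
    """Size in bytes of the object serialized as compact UTF-8 JSON."""
    return len(json.dumps(geojson, separators=(",", ":")).encode("utf-8"))


def fits_assumed_limit(geojson, limit=ASSUMED_LIMIT_BYTES):
    """True if the compact serialization is no larger than the assumed limit."""
    return geojson_size_bytes(geojson) <= limit


# A tiny made-up square polygon, far below any plausible limit.
square = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]],
}
print(geojson_size_bytes(square), fits_assumed_limit(square))
```

Shapes that fail the check could be simplified first, for example with the mapshaper.org tool mentioned earlier in this thread.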

New feature for the Query Service: check the location of the browser

Hello all,

The Wikidata Query Service now offers the possibility to build queries including your current location. You can use the code [AUTO_COORDINATES] in a query to ask for the location. When running the query, the browser will ask for current location.

For example, here's a query showing the items that are located around you, with markers colored depending on P31:

#defaultView:Map
SELECT ?place ?placeLabel ?image ?coordinate_location ?dist ?instance_of ?instance_ofLabel ?layer WHERE {
  SERVICE wikibase:around {
    ?place wdt:P625 ?location.
    bd:serviceParam wikibase:center "[AUTO_COORDINATES]".
    bd:serviceParam wikibase:radius "1".
    bd:serviceParam wikibase:distance ?dist.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  OPTIONAL { ?place wdt:P18 ?image. }
  OPTIONAL { ?place wdt:P625 ?coordinate_location. }
  OPTIONAL { ?place wdt:P31 ?layer. }
}

Try it!

Related to this feature, two new improvements have been made to the interface, on any query displayed as a map:

When the map is displayed, a "marker" button is included on the left side to show your current location
A mini-map is displayed on the top-down corner of the map to show a bigger view of the location

Feel free to test it with your favorite queries and let us know if you encounter any problem. Lea Lacroix (WMDE) (talk) 14:08, 8 May 2018 (UTC)

@Lea Lacroix (WMDE): The browser did not ask for my location, it just showed the map of Berlin without any prompt.--Micru (talk) 14:18, 8 May 2018 (UTC)

The browser (Firefox) asked permission to show location. Did you allow to remember your choice earlier for Wikidata from the same browser for "nearby" or something else? --Titodutta (talk) 17:58, 8 May 2018 (UTC)

👍, --John Cummings (talk) 20:58, 8 May 2018 (UTC)

I got a location in Berlin too, although I'm in Australia. I agreed to the browser location request, but I don't think my browser knows my location. Ghouston (talk) 03:03, 9 May 2018 (UTC)

Thanks for your feedback. Indeed, when you decline to share your location or the browser is not able to detect it, the current setup displays some coordinates in Berlin instead. Lea Lacroix (WMDE) (talk) 06:09, 9 May 2018 (UTC)

Great new feature! It works for me, but it is slower than my existing queries using BIND(geof:distance(?coord1, ?coord2) AS ?distance) combined with filter(?distance < 1000) where I define coord1 as the center of the circle. I get timeouts modifying your example query when I go beyond 1000 km even though there are fewer than 1000 matching places.--37.201.98.147 11:58, 10 May 2018 (UTC)
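For readers unfamiliar with the distance filter mentioned here, the following Python sketch shows the kind of great-circle calculation such a filter relies on. It uses the haversine formula with a mean Earth radius of 6371 km; whether the query service computes distance in exactly this way is an assumption, and the function names are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(center, places, radius_km):
    """Keep the (name, lat, lon) places no farther than radius_km from center."""
    lat0, lon0 = center
    return [p for p in places if haversine_km(lat0, lon0, p[1], p[2]) <= radius_km]


# Made-up sample points around the origin.
places = [("near", 0.0, 0.5), ("far", 0.0, 20.0)]
print(within_radius((0.0, 0.0), places, 1000))
```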

Creating a list of Q values

I've been trying to populate the 349 NCAA (DI) women's basketball teams with several properties. So far so good (except for main category) but exceedingly boring to do manually. I took a look at Quickstatements and think that may be the way to go, but I don't know how to populate a spreadsheet with the Q-values for each team. Obviously, it can be done manually, but there must be an easier way.

There's a table on this page; the first column contains a link to each Wikipedia article, but I don't know how to use that to automatically populate a spreadsheet with the q values. I'm hoping this is basic, and someone can tell me how to do it (or tell me if my general approach is wrong).--Sphilbrick (talk) 17:56, 9 May 2018 (UTC)

@Sphilbrick: - this report gives you the QIDs of 339 of the teams. Missing are the QIDs of

all of which are, in the templates, piped links, to redirects. You'll need to visit each of these articles and click thru to wikidata to get the QID. hth. Ping me if you have issues with any of the above or with quickstatements. --18:49, 9 May 2018 (UTC)

Thanks, I may have some questions, but let me start with that.--Sphilbrick (talk) 19:18, 9 May 2018 (UTC)
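Another possible route, sketched here in Python, is to ask the Wikipedia API for the `wikibase_item` page property of each article title and let it follow redirects. The sketch only builds the request URL and parses a response, so it runs without network access; the sample response uses made-up titles and QIDs, the function names are invented, and the 50-titles-per-request batching is stated from memory of the API's usual limit for ordinary accounts.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_qid_query(titles):
    """Build an API URL asking for the Wikidata item of each title (max ~50 per call)."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": 1,  # resolve redirects so piped links still match
        "titles": "|".join(titles),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_qids(response):
    """Map page title -> QID (or None if the page has no linked item)."""
    pages = response.get("query", {}).get("pages", {})
    return {
        page["title"]: page.get("pageprops", {}).get("wikibase_item")
        for page in pages.values()
    }


# Made-up response in the shape returned with format=json.
sample = {
    "query": {
        "pages": {
            "1": {"pageid": 1, "title": "Team A", "pageprops": {"wikibase_item": "Q100"}},
            "2": {"pageid": 2, "title": "Team B"},
        }
    }
}
print(build_qid_query(["Team A", "Team B"]))
print(extract_qids(sample))
```

The resulting title-to-QID pairs can be pasted into a spreadsheet column and used as the first field of each QuickStatements line.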

Merge 2x Aderkas

Q53580264 = Q4057501 85.182.13.41 21:19, 15 May 2018 (UTC)

Merged --Pasleim (talk) 21:22, 15 May 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 15:42, 16 May 2018 (UTC)

Persistent vandalism by User:ديفيد عادل وهبة خليل 2

https://www.wikidata.org/w/index.php?title=Property_talk:P2600&action=history , he is deleting content, switching "URL" to "ULR" ... nasty user 78.55.254.253 13:27, 9 May 2018 (UTC)

Several complaints about reverts and other edits on the talk page of that user:

Not showing any understanding of the problems:

"I use Google Translate.The phrase means "Musician of Iran"."
"You are the one who is vandalising by adding wrong descriptions"
"It makes no sense for the novice to give advice to the expert because the expert knows the correct"

78.55.254.253 13:37, 9 May 2018 (UTC)

While the user you cite here seems to have made some mistakes, at least he (David) is consistently editing with the same account. Your IP address (if you are a single person) seems to change by the minute, so if you make mistakes of that sort, it is impossible for anybody to track any pattern. Can you explain why you cannot edit with a regular account? Or at least sign your edits in a way that identifies you as an individual? ArthurPSmith (talk) 13:51, 9 May 2018 (UTC)

If you don't want IPs on wikidata then shut them down. "seems to change by the minute" - already from the page history alone one can see that is a false claim made by you. And calling repeated section removal "mistake" and not "vandalism" is a mistake in itself. In this version there is no section Mix'n'match, no section Quarry, and the former was linked to in user_talk and causes trouble because some users may have seen the vandalized version. 77.179.86.244 11:16, 10 May 2018 (UTC)

Interesting that 77.179.86.244 is responding as if they were the same person as 78.55.254.253. That's quite a difference in IP address values, about 8.5 million potential IPv4 addresses separate the two. Can I suggest, if you really cannot or refuse to set up your own account, you keep some wikitext of at least some sort of self-identifying signature or contact so your comments and edits can be identified? How does one notify you of a problem with something you have done? If I comment on the talk page of one of the IP addresses you use, will you see the comment? I noticed you (I assume it was you) mass-editing the formatter URL's of many properties a few days ago, and I was curious at the purpose, but there didn't seem to be any way to ask. While I haven't particularly noticed you making problematic edits, your criticism of others is sometimes harsh and unwarranted, but there's no way to follow up about that with you other than in a public location like this. It's a bit frustrating - all the rest of us are quite identifiable and addressable, but you are not. It has been suggested to shut down IP edits, but I know there's a lot of reluctance here on that and up to now at least it's not something I've been in favor of. But I do wish habitual IP users would do something to be a little more identifiable, or perhaps just more restrained. ArthurPSmith (talk) 19:02, 10 May 2018 (UTC)
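The gap between the two addresses can be checked with Python's standard `ipaddress` module, treating each IPv4 address as a 32-bit integer. The function name is invented for this example.

```python
import ipaddress


def ipv4_gap(first, second):
    """Absolute difference between two IPv4 addresses read as 32-bit integers."""
    return abs(int(ipaddress.IPv4Address(first)) - int(ipaddress.IPv4Address(second)))


print(ipv4_gap("77.179.86.244", "78.55.254.253"))  # 8693769
```

That is roughly 8.7 million addresses, in line with the rough estimate given above.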

@ArthurPSmith: please have a look at the block log. The user has contacted me at dewiki afterwards, and I had a short discussion in German with them. I have no intention to change or lift the block. —MisterSynergy (talk) 19:08, 10 May 2018 (UTC)

Why are you deleting a useful table? And how do you behave as an expert? David (talk) 13:56, 9 May 2018 (UTC)
- Useful for what? You are not involved with the item, and only reverted because you are a reverter. The user that added it had deleted data. It is all in the history of the talk page. Stop your nasty edits. GO AWAY! 77.179.86.244 11:16, 10 May 2018 (UTC)

Q1043197: "The save has failed"

According to this request, in order to carve out the Italian pages Divinazione and Mantica (more equivalent to Fortune-telling), I've tried to add the page Divinazione to the Q1043197 element, but I've received the error: The save has failed. The link itwiki:Divinazione is already used by item Q8011595. You may remove it from Q8011595 if it does not belong there or merge the items if they are about the exact same topic. But I prefer to leave the item Q8011595 as it is now. --Skyfall (talk) 08:41, 11 May 2018 (UTC)

Same person

This, Johann Leisentritt (Q1695275), is the same person as this, Leisentrit, Johann (ADB) (Q27584192). --Gerda Arendt (talk) 12:27, 16 May 2018 (UTC)

Hi Gerda Arendt, the first item is a person, the second item is not a person but an article in the ADB (itself about the person). Everything seems ok as it is. Cdlt, VIGNERON (talk) 13:46, 16 May 2018 (UTC)

Fine, I will possibly never fully understand. What confused me was that searching for the person ONLY brought me to the ADB. --Gerda Arendt (talk) 13:48, 16 May 2018 (UTC)

Maybe an analogy could help: merging Johann Leisentrit (Q1695275) and Leisentrit, Johann (Q27584192) would be like merging God in Christianity (Q825) and Bible (Q1845). The Bible is about God but it's not the same thing as God. Johann Leisentrit (Q1695275) and Leisentrit, Johann (Q27584192) have different data that belong to different items; for instance, the first begins to exist in 1527 (date of birth) and the second in 1883 (date of publication).

For the search problem, as the publication has the same name as the person, I can understand the confusion. I've added aliases to help find both.

Cdlt, VIGNERON (talk) 14:27, 16 May 2018 (UTC)

(The one with Bible (Q1845) is a very good example, VIGNERON. I hope you wouldn't mind if I used it whenever explaining a similar problem.) Matěj Suchánek (talk) 17:55, 17 May 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 17:55, 17 May 2018 (UTC)

What's the point of having two equivalent properties?

Why are there both radius (P2120) and diameter (P2386) properties? Aren't they equivalent? --Malore (talk) 01:09, 7 May 2018 (UTC)

I agree that it would make sense to delete one of them. ChristianKl ❪✉❫ 08:07, 7 May 2018 (UTC)

Some measures are traditionally given as radius, others as diameter. Even if technically they are the same, it makes sense to keep both properties. Same as with different units, in theory they could be converted to SI, but we keep them as they are originally listed.--Micru (talk) 08:27, 7 May 2018 (UTC)

We don't have two different properties for different units and practically for the sake of the query service different units do get converted to SI. ChristianKl ❪✉❫ 11:09, 7 May 2018 (UTC)

But still we store them as they appear in the source. --Micru (talk) 11:30, 7 May 2018 (UTC)

If we maintain both, there should be a property constraint that warns if an item has only one of them or if the value of the diameter is not twice the radius.--Malore (talk) 11:42, 7 May 2018 (UTC)

Malore, "twice the radius" - could there be rounding errors? 78.55.177.69 11:48, 7 May 2018 (UTC)
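The rounding concern could be handled by comparing with a tolerance rather than demanding exact equality. Here is a minimal Python sketch of such a check; the 1% default tolerance and the function name are arbitrary choices for illustration, not part of any existing Wikidata constraint.

```python
import math


def diameter_matches_radius(radius, diameter, rel_tol=0.01):
    """True if diameter is twice the radius, within a relative tolerance.

    The tolerance absorbs rounding in sourced values, e.g. a radius rounded
    to the nearest kilometre next to a more precise diameter.
    """
    return math.isclose(diameter, 2 * radius, rel_tol=rel_tol)


print(diameter_matches_radius(6371.0, 12742.0))  # exact: True
print(diameter_matches_radius(1.0, 2.004))       # rounding noise: True
print(diameter_matches_radius(1.0, 2.5))         # real mismatch: False
```

Both values would need to be in the same unit before comparing, which the sketch assumes.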

Micru, for places (e.g. of birth) the sources often indicate a name, but in WD this is matched to an item that can have several names, i.e. it is not stored as in the source. Sometimes I used "stated as" to indicate the name in the source. Maybe there could be a property "diameter or radius" and the editor has to indicate what of the two it is? But queries that do calculation then also have to work on that. There might be new problems with a merged property. 78.55.177.69 11:48, 7 May 2018 (UTC)

I think radius and diameter should be kept as two different properties. In graph theory, the radius of a graph is the minimum over all vertices u of the maximum distance from u to any other vertex of the graph. And one speaks of the diameter rather than a diameter (which refers to the line itself). Also, for a convex shape in the plane, the diameter is defined to be the largest distance that can be formed between two opposite parallel lines tangent to its boundary. So in math, radius and diameter may have quite different uses. Pmt (talk) 17:40, 7 May 2018 (UTC)

@Pmt: Actually, radius is defined as "distance between the center and the surface of a circle or sphere" and diameter as "the diameter of a circular or spherical object", so I think that they shouldn't be used in cases such as your examples.--Malore (talk) 12:54, 9 May 2018 (UTC)

For information, diameter (P2386) is used 10 358 times and radius (P2120) only 355 times (and only 4 items using both properties). Cdlt, VIGNERON (talk) 21:01, 11 May 2018 (UTC)

Is it possible to mark a statement as problematic?

I often see statements that are probably wrong, but I don't have the time or the competence to correct them. Is there a way to manually mark them as problematic?--Malore (talk) 01:23, 11 May 2018 (UTC)

Deprecate with a reason for deprecation. Ghouston (talk) 03:43, 11 May 2018 (UTC)

Yes, but... Deprecation suggests a statement is definitely false.

But often one is not in a place to make such a call. Instead one may want to indicate that "this statement seems odd to me. I don't have evidence that it is false. But I would like to see more or better evidence that it is true."

On text wikis, one can typically indicate this by tagging a statement with {{Citation needed}}.

It would be nice to have a similar mechanism for statements on Wikidata -- perhaps by setting a sourcing circumstances (P1480) or nature of statement (P5102) qualifier with some appropriate value. Jheald (talk) 18:10, 11 May 2018 (UTC)

I think adding nature of statement (P5102) with citation needed (Q3544030) would be a neat way to indicate this in general. Do we need an RFC to decide if this is the right way to do it? ArthurPSmith (talk) 18:25, 11 May 2018 (UTC)

Support this solution. - PKM (talk) 21:02, 11 May 2018 (UTC)

I still think deprecation is good. If a statement can't be verified from sources, you shouldn't need to prove it wrong before deprecating it, because proving a negative is often impossible. citation needed (Q3544030) statements will just stay there forever, since there's nothing more you can do with them, and the probably false data will be in the database forever. Ghouston (talk) 23:32, 11 May 2018 (UTC)

Also, deprecation doesn't have to mean "definitely false", since using not been able to confirm this claim (Q21655367) is an option. It's just a value that shouldn't be added until it can be verified. Ghouston (talk) 23:36, 11 May 2018 (UTC)

That kind of usage should be added to Help:Deprecation though, since it currently says the value is superseded or wrong. Ghouston (talk) 23:40, 11 May 2018 (UTC)

Is it allowed/possible to create two "same" claims for a given entity?

Do you have examples of entities that have claims of wikibase-item type with the same target entity two or more times? In pseudo-triples (P31 or any other property, as long as it points to another entity):

S1 P31 O again . S1 P31 O again .

Ideally without qualifiers/references - but would love to see examples of all cases.

Thank you! – The preceding unsigned comment was added by 88.111.72.93 (talk • contribs) at 17:45, 11 May 2018‎ (UTC).

No, duplicate claims are collapsed into a single claim in wikidata (and regarding triples, this is forced in RDF). ArthurPSmith (talk) 18:23, 11 May 2018 (UTC)

It's certainly possible to have two identical claims for a single entity if they have different qualifiers (although QuickStatements cannot create them). See eg for David Davis (Q300023) the two different claims for position held (P39) member of the 54th Parliament of the United Kingdom (Q35647955).

I am not sure if such multiple claims can be created without differing qualifiers, but I would imagine that they can.

In the RDF model, I presume there would only be one triple of the form wd:Qxxx wdt:Pyyy wd:Qzzz, but there could be multiple sets of triples of the form wd:Qxxx p:Pyyy ?stmt1 . ?stmt1 ps:Pyyy wd:Qzzz, wd:Qxxx p:Pyyy ?stmt2 . ?stmt2 ps:Pyyy wd:Qzzz, etc. Jheald (talk) 19:01, 11 May 2018 (UTC)

It is possible. I tested it in the sandbox (diff). But in what case is this useful? --Was a bee (talk) 21:49, 11 May 2018 (UTC)
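The distinction drawn above between the single "truthy" triple and the multiple statement-level triples can be illustrated with a small model. This is only a sketch using plain Python sets rather than a real RDF store; the statement identifiers `stmt-1` and `stmt-2` are made up for illustration.

```python
from collections import namedtuple

# A statement has its own identity, separate from the value it asserts.
Statement = namedtuple("Statement", ["statement_id", "subject", "prop", "value"])

# Two claims with the same subject, property and value (hypothetical IDs).
statements = [
    Statement("stmt-1", "Q300023", "P39", "Q35647955"),
    Statement("stmt-2", "Q300023", "P39", "Q35647955"),
]


def truthy_triples(stmts):
    """Direct triples (wdt:): identical ones collapse, as a graph is a set."""
    return {(s.subject, "wdt:" + s.prop, s.value) for s in stmts}


def full_triples(stmts):
    """Statement-level triples (p:/ps:): each statement node stays distinct."""
    triples = set()
    for s in stmts:
        triples.add((s.subject, "p:" + s.prop, s.statement_id))
        triples.add((s.statement_id, "ps:" + s.prop, s.value))
    return triples


print(len(truthy_triples(statements)))  # one direct triple
print(len(full_triples(statements)))    # two triples per statement
```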

Igbo language main page

Could an admin please change the Igbo home page from Ihü mbu to ihu m̀bụ here: https://www.wikidata.org/wiki/Q5296, also is this the right place to ask about the actual display name of Igbo, which should be Ìgbò as is written in Igbo language publications and in dictionaries, sources can be given. Thanks. Ukabia (talk) 18:50, 11 May 2018 (UTC)

@Ukabia:

Done for the link, for the record: you could have done it yourself.

For the name of the languages, I'm not sure where to do the change (but this is probably the best place to find an answer), by the way, is it « Ìgbò » or « ìgbò »? and some source could be good, yes (the english Wikipedia article, en:Igbo language, says « Ị̀gbò »).

Cdlt, VIGNERON (talk) 20:47, 11 May 2018 (UTC)

@VIGNERON: Thanks, this is the BBC news site in Igbo with Ìgbò written (capitalised first letter). There are some other sources, like this; to keep it short, the name is written Ìgbò in a number of published sources. Ukabia (talk) 20:54, 11 May 2018 (UTC)

Bot-populating family names?

We currently have a significant number of entries for human (Q5) that don't have family name (P734) (out of the 100,000 current uses of Wikidata infoboxes on Commons on all topics, 27,000 are humans without family names). Hopefully that will improve now that it's higher up in the suggested properties, but they're quite important for sorting people categories on Commons, so I'm wondering if there's a good way to populate those by bot. Maybe by:

- Looking for instance of (P31)=human (Q5) with a value for given name (P735) but not for family name (P734), where the label minus given name (P735) has no spaces after pre/post whitespace-trimming (and equals the label of another property that has a description like 'family name'). This might work for simple names, like Ad Wouters (Q15917124), but not for complex ones, and maybe could be expanded to at least handle Western middle names.
- Doing something similar with DEFAULTSORT parameters in Commons categories, since they're mostly of the form "Last, First" (only works for entries that have Commons categories, maybe 0.5 million? Might be able to cross-check against the Commons category name and the Wikidata labels)
- Ditto by extracting the information from commons:Template:PeopleByName (but that's only used in ~30k categories)
- Possibly also auto-creating new 'family name' items where they don't already exist
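The first two heuristics in the list above could be sketched roughly as follows. This is an illustration under simplifying assumptions, not a bot implementation: both function names are hypothetical, the label is assumed to start with the given name, and the result would still need to be matched against an existing family name item before anything is written.

```python
def guess_family_name(label, given_name):
    """Return the label minus the given name if what remains is one word."""
    label = label.strip()
    given_name = given_name.strip()
    if not given_name or not label.startswith(given_name + " "):
        return None
    rest = label[len(given_name):].strip()
    # Reject complex names: anything with a space left needs human review.
    if not rest or " " in rest:
        return None
    return rest


def family_name_from_defaultsort(sort_key):
    """Extract the part before the comma from a 'Last, First' sort key."""
    if "," not in sort_key:
        return None
    family = sort_key.split(",", 1)[0].strip()
    return family or None


print(guess_family_name("Ad Wouters", "Ad"))
print(guess_family_name("Johann Sebastian Bach", "Johann"))
print(family_name_from_defaultsort("Wouters, Ad"))
```

Cross-checking the two results against each other, as suggested in the list, would give a stronger signal than either heuristic alone.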

Any thoughts? Is this worth looking into further, or is it best left to human edits? Thanks. Mike Peel (talk) 23:52, 7 May 2018 (UTC)

For given names, sometimes I filter by nationality.
--- Jura 05:54, 8 May 2018 (UTC)
Same as #Request for a bot operator to run a bot: Would you add this to WD:Bot requests? Matěj Suchánek (talk) 08:46, 12 May 2018 (UTC)

Request for a bot operator to run a bot

Many categories including "National politicians in Africa" include the "category contains" "human" followed by what the category is about. For instance: "Position held" "President of Chad". My request is for a bot operator to run a bot for all the categories marked in this way in every Wikipedia. Thanks, GerardM (talk) 12:08, 10 May 2018 (UTC)

Would you put this request to WD:Bot requests? I can't spend time working on this now but could do in summer. Matěj Suchánek (talk) 08:44, 12 May 2018 (UTC)

Making a query that takes all events with location and page views from Wikipedia

Hi, my friend and I want to build an app that shows events from a given period of time on a map. But we are struggling with SPARQL, so I came here for help.

We want the query to return all events, including instances of subclasses of event, that have coordinates. It should then get the date when each event happened, or the dates between which it took place, and sort the results by the page views of each event's Wikipedia page.

But we can't find a way to add page views to the results. Also, we are not sure that this query gets all events.

Now we have https://query.wikidata.org/#PREFIX%20xsd%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0A%0A%23%20limit%20to%2010%20results%20so%20we%20don%27t%20timeout%0ASELECT%20%3Fevent%20%3FeventLabel%20%3Fdate%20%3Fcoordinate_location%20WHERE%20%7B%0A%20%20%3Fevent%20%28wdt%3AP31%2Fwdt%3AP279%2a%29%20wd%3AQ1656682.%0A%20%20OPTIONAL%20%7B%20%3Fevent%20wdt%3AP585%20%3Fdate.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fevent%20wdt%3AP580%20%3Fdate.%20%7D%0A%20%20BIND%28%28NOW%28%29%29%20-%20%3Fdate%20AS%20%3Fdistance%29%0A%20%20OPTIONAL%20%7B%0A%20%20%20%20%3Fevent%20rdfs%3Alabel%20%3FeventLabel.%0A%20%20%20%20FILTER%28%28LANG%28%3FeventLabel%29%29%20%3D%20%22en%22%29%0A%20%20%7D%0A%20%20FILTER%28%28BOUND%28%3Fdate%29%29%20%26%26%20%28%28DATATYPE%28%3Fdate%29%29%20%3D%20xsd%3AdateTime%29%29%0A%20%20FILTER%28%280%20%3C%3D%20%3Fdistance%29%20%26%26%20%28%3Fdistance%20%3C%20365%29%29%0A%20%20%20%7B%20%3Fevent%20wdt%3AP625%20%3Fcoordinate_location.%20%7D%0A%7D%0ALIMIT%2010

I only found the Wikimedia API that answers questions about page views. What are we missing here?

Thanks for help in advance. Wojtek, Poland – The preceding unsigned comment was added by 77.252.198.66 (talk • contribs) at 10. 5. 2018, 21:37‎ (UTC).

This is not possible yet. Matěj Suchánek (talk) 08:42, 12 May 2018 (UTC)
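Since page views are not available inside the query service, one workaround is to run the SPARQL query first and then look up page views for each result through the Wikimedia pageviews REST API that the original poster found. The sketch below only builds the request URL and sums a response; the helper names are hypothetical, and the response shape (an `items` list whose entries carry a `views` count) is stated as I understand the API, so it should be checked against the API documentation.

```python
from urllib.parse import quote

PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="en.wikipedia.org"):
    """Build a per-article pageviews URL; dates use the YYYYMMDD format."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{PAGEVIEWS_BASE}/{project}/all-access/user/{article}/daily/{start}/{end}"


def total_views(response_json):
    """Sum the daily view counts of an already decoded JSON response."""
    return sum(item["views"] for item in response_json.get("items", []))


print(pageviews_url("Battle of Waterloo", "20180501", "20180510"))
```

The totals obtained this way can then be used to sort the query results on the client side.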

Tricky conditions for a CIL property

Hi,

I'm hesitating to propose a new property for the CIL identifier. The Corpus Inscriptionum Latinarum (Q691007) is a collection of old Latin inscriptions. There is a template on multiple Wikimedia projects: Template:CIL (Q6737381) (and almost no copyright problem).

There are two tricky points that might block the proposal:

the template converts Roman numerals to Arabic numerals; is it possible for a property to do that? (linking to db.edcs.eu seems perhaps the most important to me)
the identifier is made of several parts (at least two: number of the volume and number of the inscription in the volume, but up to four parts)
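The two points above (Roman-to-Arabic conversion and a multi-part identifier) can be sketched outside of Wikidata as follows. This does not answer whether a property or its formatter URL can do the conversion; it only illustrates the transformation the template performs. The function `normalize_cil` and its tuple output are assumptions made for illustration, and special volume designations are not handled.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'XIII' to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def normalize_cil(volume, *parts):
    """Return the identifier parts with the volume as an Arabic number."""
    return (roman_to_int(volume),) + tuple(int(p) for p in parts)


print(normalize_cil("XIII", "1234"))
print(normalize_cil("IV", "10", "2"))
```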

Can someone answer these two points?

Any other ideas and suggestions are obviously welcome.

PS: the situation is almost the same for L'Année épigraphique (Q1399128) and Template:AE (Q6663771) and other inscription registers.

@El Caro:

Cdlt, VIGNERON (talk) 07:08, 12 May 2018 (UTC)

I am searching for an article I submitted several years ago but put on hold for reference validations

Dear Editor,

I submitted my article for Lillian Juhlin Ripley-Mandl a few years ago but put the project on hold. Now I wish to reopen that work for my deceased artist mother. I have all her references in my possession, as they are brochures and articles from show catalogues and pamphlets that have not made it to the internet. Juhlin R. is how she signed her work. She was listed in the University of Arizona retrospect showing of Tucson's Early Modern Artists and worked at her art for forty years before her death in 2011. I am hoping to reopen her file, as it is blocked at this time.

Sincerely, Susan R. Keippel – The preceding unsigned comment was added by 162.220.220.139 (talk • contribs) at 13:24, 12 May 2018‎ (UTC).

The article is here --Tagishsimon (talk) 14:30, 12 May 2018 (UTC)

Wikidata - Data Resolver

Hi,

I am currently diving into the Wikidata project and am mainly interested in information on persons. While getting the JSON data is not a problem at all, parsing the information for further analysis is rather challenging. I am looking for a way to simplify the retrieval of information. For example, I would like to have specific functions like:

gender = person.get_gender()

For testing purposes I have written a simple Python base class and, deriving from that, a more specific person class. Here, for example, is the gender method of the person class mentioned above:

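   def get_gender(self):
       """Return a readable label for the first sex or gender (P21) claim, or None."""
       # Q-ids of common values of sex or gender (P21)
       GENDER_DICT = {
           "Q6581097": "male",
           "Q6581072": "female",
           "Q1097630": "intersex",
           "Q1052281": "transgender female",
           "Q2449503": "transgender male"
       }
       if "P21" in self.json_item["claims"]:
           try:
               # "novalue"/"somevalue" snaks carry no "datavalue" key
               id_val = self.json_item["claims"]["P21"][0]["mainsnak"]["datavalue"]["value"]["id"]
               return GENDER_DICT[id_val]
           except (KeyError, IndexError) as e:
               # Unknown Q-id or unexpected claim structure
               print(e)
       return None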

Does a library like this (also for languages other than Python) exist? I'd rather use and contribute to an existing library than duplicate it.

Thanks and Greetings,

Niklas – The preceding unsigned comment was added by OSHist (talk • contribs) at 18:35, 10 May 2018‎ (UTC).

Hello @OSHist: I'm just a copy-and-paste level programmer, so I don't know this well. But I know that there is a Lua library which provides generic functions (Extension:Wikibase_Client/Lua). --Was a bee (talk) 05:41, 13 May 2018 (UTC)

Try Pywikibot. Matěj Suchánek (talk) 10:31, 13 May 2018 (UTC)
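Independently of any particular library, the per-property method in the question can be generalised into one helper that works for every item-valued property. The following is a minimal sketch operating on the entity JSON as shown in the original snippet; the helper name `first_item_id` is hypothetical and is not part of Pywikibot or any other library.

```python
def first_item_id(entity, prop):
    """Return the Q-id of the first item-valued claim for prop, or None."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        # Skip "novalue"/"somevalue" snaks, which have no datavalue.
        if snak.get("snaktype") != "value":
            continue
        value = snak.get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            return value["id"]
    return None


# Hand-written sample in the shape of the entity JSON (not real data).
sample = {
    "claims": {
        "P21": [
            {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": {"id": "Q6581072"}}}}
        ]
    }
}
print(first_item_id(sample, "P21"))
print(first_item_id(sample, "P734"))
```

A gender lookup then becomes a dictionary lookup on the returned Q-id, and the same helper serves other properties without new parsing code.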

Link back to relevant Wikipedia article?

It's easy to get to Wikidata from Wikipedia, but not the other way around. It would be nice, after making a quick Wikidata edit, to be able to return to the article that I had been on — without having to fiddle with my browser's Back button. 1980fast (talk) 18:50, 12 May 2018 (UTC)

The links are at the bottom of the Wikidata items. I'm not sure why they aren't put in the sidebar too: it would be more familiar to people who don't use Wikidata often. Ghouston (talk) 00:52, 13 May 2018 (UTC)

Wikidata linking to Wikipedia redux

An RFC at the English Wikipedia again brings up the issue of deleting any ~~link~~ reference to Wikidata for people that do not have their own entry in the English Wikipedia, but appear in Wikidata: w:Wikipedia_talk:Manual_of_Style#New_RFC_on_linking_to_Wikidata. This RFC wants to remove hidden text with a Q-number ~~links~~ to people that have been deleted from the English Wikipedia but still have entries in Wikidata. For example "William D. McDowell". This hidden text lets an editor know that if an article is recreated or a new entry is created for this person, an entry at Wikidata already exists. This will hopefully reduce duplication of Wikidata entries. It also allows Wikipedia to disambiguate people that appear in articles and lists that may never have Wikipedia entries. This way a person can know that, say, "John Smith, Mayor of Yourtown<!--Q123456-->" in an article on Yourtown is the same person as "John Smith, President of BigCompany<!--Q123456-->" that appears in the article on BigCompany. It will allow someone who creates an article in the future to search for hidden text on the string "Q123456" and find both entries and create the properly disambiguated link. Please add your thoughts at the English Wikipedia RFC no matter which way you feel about the issue. --RAN (talk) 01:52, 10 May 2018 (UTC)

@Richard Arthur Norton (1958- ): Why is this matter brought up on Wikidata? (and marked with a yellow header). Pmt (talk) 11:37, 10 May 2018 (UTC)

Because it involves Wikidata, is that not self evident by the use of the word "Wikidata" in the text? It is highlighted because it is not a "one and done" issue. Most issues brought up here only require one person to answer, then the issue is done. Once this is no longer at the bottom of the list, people still need to see it and read it. --RAN (talk) 12:46, 10 May 2018 (UTC)

Be warned that normally, mentioning enwp RfCs about Wikidata here results in en:WP:CANVAS accusations. Thanks. Mike Peel (talk) 13:40, 10 May 2018 (UTC)

@Mike Peel: Can you quote me the passage in en:WP:CANVAS you are referring to, and you think I am violating? The word "Wikidata" does not appear in :en:WP:CANVAS and of course it says to add a notice on "the talk page or noticeboard of one or more WikiProjects or other Wikipedia collaborations which may have interest in the topic under discussion [and/or] a central location (such as the Village pump or other relevant noticeboards) for discussions that have a wider impact such as policy or guideline discussions." I believe that is exactly what I did. --RAN (talk) 14:19, 10 May 2018 (UTC)

@Richard Arthur Norton (1958- ): I'm just saying what tends to happen, see [11] and [12] from the infobox RfC. Thanks. Mike Peel (talk) 15:24, 10 May 2018 (UTC)

Oh, ok. I understand, thanks. I see the infobox wars are also continuing. I am amazed that it was a non issue at Wiki Commons recently. --RAN (talk) 15:27, 10 May 2018 (UTC)

Removed the highlighting. I find it annoying, and no issue ever required it to be brought into attention.--Micru (talk) 13:53, 10 May 2018 (UTC)

@Richard Arthur Norton (1958- ) : Following your link I end up on English Wikipedia, reading a header New RFC on linking to Wikidata and a subheader RFC question Should we ban links to wikidata within the body of an article? In which way does this affect a user on Wikidata? Pmt (talk) 15:36, 10 May 2018 (UTC)

I think you are asking why I am bringing up the topic since it involves English Wikipedia → Wikidata linking (more of interest to English Wikipedia users and less of interest to Wikidata users) and not Wikidata → English Wikipedia linking (more of interest to Wikidata users and less of interest to English Wikipedia users). I also mentioned that "This will hopefully reduce duplication of Wikidata entries", as I have multiple times recreated Wikipedia articles that were not notable at the time of deletion, and made a new Wikidata entry because I did not realize that a "Jimmy Smith" redlinked in an article was the same person as "Jim A. Smith", who had an entry already. You do not have to respond if the topic does not interest you. About 80% of all topics brought up here have no interest to me, so I ignore them. You can do the same thing, it will save us both lots of time to devote to topics that do matter to us. RAN (talk) 16:01, 10 May 2018 (UTC)

@Richard Arthur Norton (1958- ): What I am reading is: Please add your thoughts at the English Wikipedia RFC no matter which way you feel about the issue. and ...why I am bringing up the topic since it involves English Wikipedia → Wikidata linking... I still think that things that specifically concern English Wikipedia should not be discussed on Wikidata. So far you have involved 5 users at Wikidata, using their time. And you are further asking Wikidata users to add their thoughts to English Wikipedia. Are the users of English Wikipedia aware of your call for comments here at Wikidata? Pmt (talk) 16:56, 10 May 2018 (UTC)

I can only say it one more time, if you have no interest in this topic, please move on to the next topic. The "5 users" you mentioned can decide how best to use their time themselves. You are spending a lot of time writing about wasted time, which is the definition of irony. I already addressed why it was posted in this venue. Please do not ask me the same question again, the answer will still be the same. Takk skal du ha. --RAN (talk) 17:08, 10 May 2018 (UTC)

I don't see any linking, just some obscure way to include Wikidata-identifiers in the source code. Are HTML comments even searchable? Sjoerd de Bruin (talk) 16:04, 10 May 2018 (UTC)

Yes, they are searchable with "insource:". It searches the raw unformatted text. I changed the wording here and the RFD, calling it a link was incorrect, it really is a hidden annotation. RAN (talk) 16:09, 10 May 2018 (UTC)
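For anyone processing raw wikitext offline rather than through the on-wiki "insource:" search, the hidden annotations can be extracted with a regular expression. This is a small sketch that assumes the annotations are well-formed HTML comments of the form `<!--Q123456-->`; the function name is hypothetical.

```python
import re

# An HTML comment containing only a Q-id, with optional surrounding spaces.
HIDDEN_QID = re.compile(r"<!--\s*(Q[1-9]\d*)\s*-->")


def hidden_qids(wikitext):
    """Return all Q-ids found in hidden comment annotations, in order."""
    return HIDDEN_QID.findall(wikitext)


text = (
    "John Smith, Mayor of Yourtown<!--Q123456--> met "
    "Jane Doe<!-- Q654321 --> in 2018. <!-- ordinary comment -->"
)
print(hidden_qids(text))
```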

It seems to be about both linking and showing the QIDs in the code - the latter is a fallback that some people have been using to preserve the info in the case that the links aren't allowed. Thanks. Mike Peel (talk) 17:05, 10 May 2018 (UTC)

As a Wikipedia editor first and Wikidata editor second, I am glad for this tip. I don't follow MOS talks as much as I used to do, so I would have missed this discussion. Syced (talk) 06:32, 11 May 2018 (UTC)

@Richard Arthur Norton (1958- ): Did you check Wikidata:WikiProject every politician? There may be something related to your theme. --Was a bee (talk) 15:47, 13 May 2018 (UTC)

Did you see the WSJ article that mentioned the deletion of local politicians from Wikipedia? Sometimes the article is visible sometimes behind paywall, depending how you get to it. --RAN (talk) 19:15, 13 May 2018 (UTC)

Trachytes

I normally edit on Wikispecies, but create pages on Wikidata to enter taxon authors, and sometimes to create needed links on disambiguation pages and repositories. I do not know how to create a disambiguation page on Wikidata, but one is necessary for Trachytes on this wiki. Here, Trachytes refers to a particular igneous rock type. On Wikispecies, it would refer to a genus of Acari mites. These need to be differentiated. Can anyone help? Neferkheperre (talk) 19:19, 13 May 2018 (UTC)

@Neferkheperre: in essence, this was a bad label on one item. We now have Trachytes (Q7831431) and trachyte (Q332748) and I think all is good. --Tagishsimon (talk) 19:30, 13 May 2018 (UTC)

But to answer the actual question - there are no disambiguation pages on Wikidata. To distinguish two items with the same label the description should give the necessary information which of the two is the one to choose. Ahoerstemeier (talk) 19:37, 13 May 2018 (UTC)

All is well. I checked it out. Rock name is correct as trachyte. I am now remembering my igneous petrology class from 1972. Thank you both. Neferkheperre (talk) 20:16, 13 May 2018 (UTC)

Armenian chat page

For the benefit of our Armenian-speaking colleagues, with whom I have had the pleasure of working for the last three days, I have created Wikidata:Խորհրդարան as an equivalent of this page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:55, 12 May 2018 (UTC)

I think it would be helpful if an admin were to edit Mediawiki:Villagepump-url/hy to point to that page, so that the sidebar links there. --Yair rand (talk) 07:58, 14 May 2018 (UTC)

Done Also, if I were told the translation for the link text, I could add it to MediaWiki:Villagepump/hy. Matěj Suchánek (talk) 12:19, 14 May 2018 (UTC)

Expressing an ambiguous match to two items

I have an interesting problem with Ralph Rokeby (Q18530701) and Ralph Rokeby (Q16863491). One of these two men was a Member of Parliament, and can be matched to History of Parliament ID (P1614):1558-1603/member/rokeby-ralph. However, we have no way of telling which - both the History of Parliament article and the Oxford DNB articles on them seem to agree that it's unclear. I don't really just want to pick one, and we can't easily put the data on both.

Is there a good way to represent this? I was thinking of creating a third item with the political data and said to be the same as (P460), but not sure if this is the best way to go about it. Andrew Gray (talk) 14:37, 13 May 2018 (UTC)

@Andrew Gray: I agree with a new item, and would suggest some sourcing circumstances (P1480) qualifier on the two said to be the same as (P460) statements. Perhaps also add different from (P1889) between the two existing items? --Lucas Werkmeister (talk) 14:45, 13 May 2018 (UTC)

I agree with using both different from (P1889) and said to be the same as (P460) to show that they are "inherently ambiguous". It lets the next person to come across the two entries know it has been looked at already and not resolved. This had been discussed previously about coming up with a new "inherently ambiguous" tag to show that they have already been looked at, but not been resolved. different from (P1889) and said to be the same as (P460) was the compromise over creating something new. I am still for having a new tag, so we can easily find all the ones with a "What links here" single click. --RAN (talk) 18:57, 13 May 2018 (UTC)

Creating a new item was what I did with John Bradley (Q21543222) (permalink). It looks like the ambiguity has now been resolved. Jheald (talk) 18:50, 13 May 2018 (UTC)

The text is a great idea, but links to the others would be nice too, so you easily click between them for comparison. --RAN (talk) 19:01, 13 May 2018 (UTC)

@Lucas Werkmeister, Richard Arthur Norton (1958- ), Jheald: Thanks all. I've created a new item and done the interlinking from Ralph Rokeby (Q53492592). Andrew Gray (talk) 21:10, 14 May 2018 (UTC)

outdated descriptions still describing things as "upcoming"

I've searched for the term "upcoming" to check for items of upcoming films and found quite a few outdated items of films, tv shows, games and other things that have actually already been released (sometimes 3-5 years ago), but were still described as "upcoming" in their descriptions. I've updated the English and German descriptions on those items, but the problem likely also exists in other languages that I don't speak well enough. Maybe users who are proficient in other languages could do a similar search and update other outdated descriptions. --Kam Solusar (talk) 10:28, 14 May 2018 (UTC)

Not sure if "upcoming" should be in the description at all. Personally, I'd remove it from all films.
--- Jura 10:32, 14 May 2018 (UTC)

Agree with Jura - and I am pretty sure this is the policy on most Wikipedias as well, for obvious reasons. Jane023 (talk) 12:44, 14 May 2018 (UTC)

+1. VIGNERON (talk) 14:59, 15 May 2018 (UTC)

Wikidata weekly summary #312

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- Open request for adminship: Adminor

Events
- Past: GLAM forum in Yerevan, Armenia, 10-12 May 2018
  - slides for 'What is Wikidata: How can GLAMs work with Wikidata?' presentation by Andy Mabbett
  - A Wikidata workshop was given by Liam Wyatt and appeared on Armenian national television
- Wikidata workshop day at GLAMwiki conference in Rotterdam, May 14th
- Wikidata workshop in Paris, May 18th
- Next Wikidata IRC office hour: May 29th at 18:00 (UTC+2, Berlin time) on the channel #wikimedia-office

Press, articles, blog posts
- Wikidata: a platform for your library’s linked open data by Stacy Allison-Cassin & Dan Scott, in the journal Code4Lib (also posted on Reddit)
- Enriching Reconciled Data with OpenRefine, by Karen Hwang

Other Noteworthy Stuff
- New feature for the Query Service: check the location of the browser
- New monolingual code available: shy (Shawiya)
- You can have a look at the draft for the RDF mapping of Wikibase Lexeme

Did you know?
- Newest properties:
  - General datatypes: item for this sense, season of club or team, Möllendorff transliteration, geographic center, coastline
  - External identifiers: none
- New property proposals to review:
  - General datatypes: IGAC rating, taxon described in publication, jockey, Norsk fjordkatalog-ID, Wikidata:Dataset Imports, toponym
  - External identifiers: Bugs! artist ID, Bugs! album ID, KKBOX artist ID, KKBOX album ID, Norsk pop- og rockleksikon ID, Odnoklassniki profile ID, Kunstenpunt organisations, Rockipedia artist ID, Rockipedia album ID, Rockipedia label ID, Rockipedia area ID, Norsk historisk leksikon ID, Univ-droit jurist ID, Relationship Science profile ID, D&B Hoovers company profile, Victorian Heritage Register ID, CIVICUS Monitor country entry, FloraCatalana ID
- Underused properties created >3 months:
  - Songwriters Hall of Fame ID (P4757), V Live channel ID (P4756), MONA ID (P4758), Harvard botanical journal ID (P4754), Images d'Art artwork ID (P4761), has boundary (P4777), lot number (P4775), Arcade artwork ID (P4764), biological phase (P4774), crates.io ID (P4763)
- Query examples:

Development
- Fix the bug where changes in the watchlist and Recent Changes on Wikipedia should have been shown but were not (phab:T192673)
- Clarify error message for the merge API (phab:T180296)
- Add violation type to restrict which entity types a property can be used (phab:T164744)
- Added a constraint to blacklist values for a property (phab:T183092)
- Fix a bug showing the wrong alias in the edit summary when editing an alias (phab:T190492)
- Fix some bugs related to displaying thumbnails in statements (phab:T193880, phab:T192667, phab:T193499)
- Continue looking into dispatch issues (phab:T194602)
- Working on adding an integer constraint (phab:T167989)
- Preparing to deploy WikibaseLexeme extension on Wikimedia cluster (phab:T168260)
- Making sure that the form ID counter is preserved when clearing the lexeme via the API (phab:T192264)
- Applying the same validation to the language code of the Lemma and the representation (phab:T191504)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 14:54, 14 May 2018 (UTC)

Celestial Coordinate System/Astronomical Coordinate System and Gaia

Hello All,

I am sure this has been asked before, but I could not find a recent discussion on the same topic. Sorry if this discussion is repeating any previous issues.

I was wondering why Wikidata does not yet have support for any Celestial Coordinates (Horizontal/Alt-Az, Equatorial, Ecliptic, Galactic, etc...)?

And,

Will Wikidata merge (Robotically/Automated) astronomical data from the Gaia Mission Data Release 2? (What properties will it use?)

Wallacegromit1 (talk) 00:21, 13 May 2018 (UTC)

I don't think it is possible to import data from Gaia: License --ValterVB (talk) 08:56, 13 May 2018 (UTC)

@ValterVB: that's a secondary point, and the license says « The Gaia data are open and free to use, provided credit is given to 'ESA/Gaia/DPAC'. », which seems pretty compatible with Wikidata. Anyway, there are plenty of other sources (some in the public domain for a long time, where credit might still be needed). Cdlt, VIGNERON (talk) 10:12, 13 May 2018 (UTC)

@Wallacegromit1: It isn't correct, they say "provided credit is given to 'ESA/Gaia/DPAC'". Our license is CC0, which means that people can use Wikidata without credit, so we can't provide this data; otherwise, we would be using Wikidata to "clean" the license from other data, and transform data that has "citation necessary" into CC0 data. --ValterVB (talk) 10:38, 13 May 2018 (UTC)

@ValterVB: Not exactly, the sentence is very shallow: it just says we need to provide credit, not that the re-user needs to provide credit too (and, to play devil's advocate, that is a problem for the re-user, not for us). And CC0 does not mean *at all* that credit is not required (see CC0 itself); in many countries credit is still required (including France for 100% sure, but also most civil law jurisdictions and, to a lesser extent, some common law jurisdictions too). Anyway, this is not the main point, as 1. we can always ask Gaia for clarification (and maybe get an explicit agreement, but first we need to have the technical possibility, which is the main point here) and 2. as I said, there are plenty of other sources. Cdlt, VIGNERON (talk) 10:49, 13 May 2018 (UTC)

These data are provided only by them; no doubt any use of their data requires citation. If someone publishes their data using Wikidata, they won't add credit because it isn't required by the CC0 license, so we would be authorizing someone to publish their data without citation, contrary to what they explicitly asked. Remember also that ESA is in Europe and we must follow the "sui generis database right" --ValterVB (talk) 11:26, 13 May 2018 (UTC)

@ValterVB: « These data are provided only by them » really? I was assuming that other data sets would give coordinates too (like NASA, which is usually under PD-gov, or astrometric missions other than Gaia (Q767805) like Hipparcos (Q555846), or even older sources like the Almagest (Q155952), which is clearly PD-old; not as complete nor as precise as Gaia obviously, but more than enough for most uses of these data). I was also assuming that we'd only take data for the items we already have (which is only a very small fraction of the billion stars of Gaia - we have around 12000 stars in Wikidata right now - as it's not realistic at all to plan to create 1 *billion* items; it would mean multiplying the size of Wikidata by more than 21!!), so the sui generis database right wouldn't really apply (even more so as we wouldn't take all the data for each astronomical object, not the standard error for instance).

For the CC0, first and again: it doesn't mean we authorize someone to publish data without citation, not for French citizens at least (and from what I read on it:Diritto_d'autore#Diritto_morale_d'autore it seems to be much the same for Italy). Then, there is what is legally required and what is morally required; even if credit is not legally required, it's always good moral practice to give credit. Finally, it's not uncommon for databases to require that only the first re-user gives credit; it's a bit strange but it's logical: you can't ask to control something you have no control over. Cdlt, VIGNERON (talk) 11:04, 15 May 2018 (UTC)

@VIGNERON: If the data are available from other sources compatible with CC0, we must use those sources and not a dataset with limited permissions. About "it doesn't mean we authorize someone to publish data without citation", actually we implicitly say this, because the CC0 license says: "You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission."; nothing is said about an obligation to cite. --ValterVB (talk) 18:22, 15 May 2018 (UTC)

@ValterVB: it depends a lot on what you call « the data »: is it « all the data » or « some of the data »? Almost 1800 years ago, Ptolemy (Q34943) gave coordinates for around 1000 astronomical objects which are similar to what Gaia measured (data under PD-old, which is not perfectly compatible with CC0; and BTW, putting the data in Wikidata could allow interesting queries to compare the completeness and precision of these data). The completeness, the precision and the granularity are obviously very different but, beyond the specific data, it's the same information. Anyhow, the bottom line is that we shouldn't import data from Gaia before at least asking them.

For CC0, you are quoting the « human-readable summary », which is quite different from the « full legal code », which explicitly says « the greatest extent permitted by, but not in contravention of, applicable law » (paragraph 2, and « moral rights retained by the original author(s) » in paragraph 1). So for France (and probably Italy), where citation is mandatory by law and that right can't be waived, citation remains required even under CC0 (but then there is the international issue: does an American re-user need to respect the law that applies to data released by a French citizen? It's quite tricky; the only thing that is clear is that, as a French citizen, I must respect French law and always give citation).

Cdlt, VIGNERON (talk) 21:20, 15 May 2018 (UTC)

@ValterVB: and @VIGNERON: Thanks for the Gaia Clarification. Any ideas on Celestial Coordinates? Wallacegromit1 (talk) 11:57, 13 May 2018 (UTC)

Coordinate support

The lack of support of astronomical coordinates is a long-standing issue. It's being tracked at phab:T127950. Thanks. Mike Peel (talk) 14:09, 13 May 2018 (UTC)

@Lydia Pintscher (WMDE): Maybe a suitable student project?
--- Jura 16:00, 15 May 2018 (UTC)

Ah yeah why not. I'll add it to the tracking ticket for them. --Lydia Pintscher (WMDE) (talk) 17:37, 15 May 2018 (UTC)

I suggest this is not quite so free-standing as one might think. In this old Project chat thread, two issues with geographic coordinates are discussed, but they have never been resolved:

Should the precision reflect the way the data was entered in the user interface, so if entered to the nearest arc second, the precision would be 2.7777777777778e-4 degrees, or should it reflect normal engineering and scientific practice, where the precision value only has one or two significant figures?
Should the precision reflect how accurately the value is known, or should it reflect the size of the object?

So these issues should be resolved and the user interface modified to conform to the chosen meanings before similar data types are created; new data types should adopt the same meaning as geographical coordinates.
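The two readings of "precision" in the first question above can be made concrete with a little arithmetic. The following is only a sketch in Python; the helper name `round_to_significant_figures` is made up for illustration and is not part of any Wikidata or Wikibase API.

```python
from math import floor, log10

# One arc second expressed in degrees, i.e. the value a user interface
# would store if precision simply mirrors the unit the user typed in.
ARCSECOND_IN_DEGREES = 1 / 3600


def round_to_significant_figures(value: float, figures: int) -> float:
    """Round a non-zero value to the given number of significant figures."""
    exponent = floor(log10(abs(value)))
    return round(value, figures - 1 - exponent)


# Reading 1: precision reflects how the data was entered.
as_entered = ARCSECOND_IN_DEGREES

# Reading 2: precision quoted with a single significant figure,
# as in usual engineering and scientific practice.
engineering_style = round_to_significant_figures(ARCSECOND_IN_DEGREES, 1)

print(f"as entered:        {as_entered:.13e} degrees")
print(f"one sig. figure:   {engineering_style} degrees")
```

The sketch does not address the second question (uncertainty of the value versus size of the object); that is a modelling decision which arithmetic cannot settle.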

Separate Rafi (Q19968298) into two items?

Notified participants of WikiProject Names

Rafi is both a name of Arabic origin and a Hebrew name, a nickname of Rafael. Currently, both Hebrew and Arabic people are linked to Rafi (Q19968298). Wouldn't it be better to separate Rafi (Q19968298) into two different items?--Malore (talk) 15:48, 15 May 2018 (UTC)

Two or more. There shouldn't be more than one "native label". Some additional ones were added to this item.
--- Jura 15:57, 15 May 2018 (UTC)

Ok, I'm going to split it.--Malore (talk) 16:07, 15 May 2018 (UTC)

I split the item and I linked Israelis to the new Rafi (Q53564788) item, but there are probably still Israelis linked to the Arabic name.--Malore (talk) 16:48, 15 May 2018 (UTC)

@Malore: I cleaned up labels/descriptions/aliases and properties (given names that are variations are linked together with said to be the same as (P460)). I didn't clean up the uses. --Harmonia Amanda (talk) 20:53, 15 May 2018 (UTC)

@Malore: and I merged Rafi (Q53555961) and Rafi (Q19968298), which are both about the name رفیع. We create one item per string, no matter how many different transliterations it may have. All different transliterations are added as aliases, the most frequent transliteration is the label (and of course the description clearly states what exact string the item is about). --Harmonia Amanda (talk) 21:01, 15 May 2018 (UTC)

Here is one for the Latin script version: Q53580904.
--- Jura 21:35, 15 May 2018 (UTC)

How to insert hypocorisms of given names?

Notified participants of WikiProject Names

Is there a correct way to insert hypocorisms of given names (e.g. Johnny for John)? I noticed that nickname (P1449) is sometimes used, with instance of (P31) → hypocorism (Q1130279) as a qualifier.--Malore (talk) 16:05, 15 May 2018 (UTC)

Some items have additional P31 statements, e.g. Q4166211#P31. Maybe Lexemes will be able to provide more detail.
--- Jura 16:27, 15 May 2018 (UTC)

I was talking about linking from the name to the hypocorism, which often doesn't have a Wikidata item. Should we create an item for every hypocorism?--Malore (talk) 16:44, 15 May 2018 (UTC)

« Should we create an item for every hypocorism? » Yes, we should (provided there are sources for that). As for how to link the person to the nickname, nickname (P1449) is usually the right solution (and it could solve the problem of a hypocorism that can stand for several given names: Jack could stand for John or Jackson, for example). Cdlt, VIGNERON (talk) 21:31, 15 May 2018 (UTC)

Currently we left things that weren't directly needed for people's names to possible dictionary functions.
--- Jura 21:38, 15 May 2018 (UTC)

Adding colors as CMYK, Hex, and more

I would like to add school colors to entries of athletic sports teams, e.g. Albany Great Danes women's basketball (Q29468771). I do see the property for color:

color (P462)

But my impression is that takes values such as "purple", and "gold".

While I would like to include those values, many schools specify in addition to those ordinary English words, values such as:

Pantone (PMS)
CMYK
RGB
Web/Hex

As an example see this page

I think I can express the RGB property using:

sRGB color hex triplet (P465)

But while I see Q values for:

Pantone Matching System (Q749816)
CMYK color model (Q166432)

I don't see the ability to express those as properties nor do I see any way to express the hex values. Am I missing something?--Sphilbrick (talk) 14:17, 4 May 2018 (UTC)

I suppose we could use <named as> "Pantone 19-3642" as a modifier for "purple", but perhaps we need a new property "Color specification" as a modifier for "color", with fields "color system" (Q-item) and "color id" (string). I would support such a property. - PKM (talk) 20:38, 4 May 2018 (UTC)

We have sRGB color hex triplet (P465) to indicate a color in the hexadecimal notation. --Pasleim (talk) 20:42, 4 May 2018 (UTC)

Yes, I erred in suggesting that the RGB property could be used to specify the RGB values. In fact, as you note, it is intended for the hex values.

I'll restate my comment using a specific example. One common color used by athletic teams can be expressed five common ways:

Color name: Purple
PMS: 269
CMYK: 78, 100, 0, 33
RGB: 70, 22, 107
sRGB color hex triplet: #46166b

There are properties for option 1 and option 5. I hoped I had simply overlooked properties for the other three, but if not, what's the next step to request such properties?--Sphilbrick (talk) 13:31, 5 May 2018 (UTC)

Option 3 and 4 can be computed from option 5 and vice versa. For option 2 you could request a new property here but I'm quite skeptical about the legal use of PMS in a CC0 project. --Pasleim (talk) 11:46, 7 May 2018 (UTC)

Yes, regarding Pantone, that concern occurred to me as well. Guess I'll put this on hold a bit and work on something else. Thanks.--Sphilbrick (talk) 17:46, 9 May 2018 (UTC)

Pasleim "Option 3 [...] can be computed from option 5 and vice versa." - only if the conversion is specified and lossless. 213.39.185.159 09:52, 16 May 2018 (UTC)
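The exchange above can be checked against the purple example. The sketch below, in Python, parses the hex triplet into RGB (an exact, reversible step) and then applies the common textbook RGB-to-CMYK formula. That formula is an assumption chosen for illustration, not what any school or printer necessarily uses; print workflows typically rely on colour profiles. The function names are hypothetical.

```python
def hex_to_rgb(hex_triplet: str) -> tuple:
    """Parse an sRGB hex triplet such as '#46166b' into 0-255 components."""
    digits = hex_triplet.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {hex_triplet!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_naive_cmyk(red: int, green: int, blue: int) -> tuple:
    """Textbook, profile-free conversion to CMYK percentages (an assumption)."""
    brightest = max(red, green, blue)
    if brightest == 0:
        return (0, 0, 0, 100)  # pure black
    cyan = (brightest - red) / brightest
    magenta = (brightest - green) / brightest
    yellow = (brightest - blue) / brightest
    black = 1 - brightest / 255
    return tuple(round(100 * part) for part in (cyan, magenta, yellow, black))


rgb = hex_to_rgb("#46166b")
print("RGB: ", rgb)                       # matches the RGB values quoted above
print("CMYK:", rgb_to_naive_cmyk(*rgb))   # naive formula, no colour profile
print("Published CMYK in the example: (78, 100, 0, 33)")
```

With this formula the hex-to-RGB step reproduces the quoted 70, 22, 107, while the naive CMYK result differs from the published 78, 100, 0, 33. That is consistent with the caveat that CMYK is only derivable when the conversion is specified.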

Adding to documentation

During the recent Wikimedia Conference, I had some pretty lengthy conversations with User:John Cummings and (separately) User:Jens Ohlig about ways we could improve the help for Wikidata. I believe both thought I had some good ideas, and the end result of the discussions is that I said I'd be willing to put a total of about 80 hours over the next few months into working on this on a volunteer basis (but not a lot more because I do stuff like this for a living, and there's only so much of it I'm interested in doing as a volunteer), though I think the total task is a lot larger than that. (That is, I'd be suggesting changes/additions that would go well beyond what I myself am likely to implement as a volunteer.) I gather from the exchange I had at Wikidata talk:Introduction#Editing this page that even where I have simple, concrete changes I'd like to make, it would probably be a poor idea to "be bold" and jump in and edit help pages, etc. I'm thinking that the best place to start would be for me to spend about 10 hours writing up roughly what I'd propose we add/change, my rationale for that, etc. Should I just do that in a draft in my own user space here, or what? And if I do invest that 10 or so hours in writing up my ideas, is there some process by which I can bring my ideas forward for discussion, beyond just linking that page from here? - Jmabel (talk) 03:50, 12 May 2018 (UTC)

@Jmabel: Perhaps we should start a Wikiproject for Wikidata Help, so the suggestions and comments are all in one place, even after you move on to other things, with updates here. - PKM (talk) 19:37, 13 May 2018 (UTC)

@Jmabel:, I'm really interested in finding ways to make documentation on Wikidata better and this is a great place to start. I'm planning on putting together an RFC to outline all the documentation needed for different user journeys. I think that outlining what you plan to do would be a great start (perhaps something less detailed than 10 hours' work?). I'd be really interested to know about your process for writing documentation and how you approach it. I've recently started learning about designing for learning; I wonder what you think the main outcomes should be for the reader of this page? Let me know if I can be of help, --John Cummings (talk) 22:09, 13 May 2018 (UTC)

@John Cummings: Do you think the best first step is for you to start the RFC or for me to start writing notes? (and, if the latter, where?) - Jmabel (talk) 00:21, 14 May 2018 (UTC)

Hello @Jmabel, John Cummings: two things that could be relevant for your work:

The existing WikiProjects WikiProject Documentation and WikiProject Welcome
During the Wikimedia hackathon, this weekend in Barcelona, there will be a documentation corner. I expect some discussions to take place at that moment on Wikiproject Documentation.

In general, I'd say that preparing your suggestions in draft (subpages of your userpage) then sending a message on the talk page of the Wikiproject Documentation, asking for feedback, is a good way to go.

Cheers, Lea Lacroix (WMDE) (talk) 08:47, 14 May 2018 (UTC)

FWIW, almost no chance I will have much together before this upcoming weekend. - Jmabel (talk) 15:03, 14 May 2018 (UTC)
I've started in at User:Jmabel/Documentation thoughts. Very preliminary. Given that that hackathon is this weekend, and any writing I do will presumably still be very cursory at that time, I'm not sure how best to put any of this on their radar. - Jmabel (talk) 01:05, 15 May 2018 (UTC)

Notified participants of WikiProject Documentation Feel free to have a look at the draft from Jmabel above, and leave comments :) Lea Lacroix (WMDE) (talk) 12:14, 16 May 2018 (UTC)

I started posting to WikiProject Documentation and will be inviting others to do the same. Blue Rasberry (talk) 17:12, 16 May 2018 (UTC)

Merge two IDs

Hi, how can I merge two IDs that are the same thing? The IDs are Q2697764 and Q27914557, which are about the same species, Physalaemus nattereri, but the first uses the old name, Eupemphix nattereri. Mr. Fulano (talk) 21:20, 15 May 2018 (UTC)

In Wikidata's taxonomic data model, protonyms are not merged; rather, they are just interlinked. Just take a look: Physalaemus nattereri (Q27914557) original combination (P1403) Eupemphix nattereri (Q2697764). Leave them as they are.--Pzgulyas (talk) 07:59, 16 May 2018 (UTC)

Format of mobile country code (P2258)

Please see Property talk:P2258.
--- Jura 08:15, 16 May 2018 (UTC)

Change data type of P1402 from string to external-id

Foundational Model of Anatomy ID (P1402) 213.39.185.159 08:40, 16 May 2018 (UTC)

Wikidata Zurich Datathon

Hi everyone,

We are organizing the Wikidata Zurich Datathon on June 16.

The goal of the event is to help organizations prepare their data in a Wikidata-friendly way. For example, the people from Open Data Zürich and OGD Kanton Zürich will bring some of their data sets.

We will organise teams around data sets. We are looking for further team instructors, so that we can ensure that there will be at least one knowledgeable person per team. If any Wikidata user would like to help and join us, we would be very grateful and we would love to hear from you. Feel free to send me an email via the wiki. You can also write an email to wikidatazurich@gmail.com if you prefer to do so.

This event is organized by Universität Zürich, Wikimedia Schweiz, Stadt Zürich - Open Data Zürich, Fach- und Koordinationsstelle OGD - Open Data Kanton Zürich.

Kind regards, Cristina Sarasua (talk) 10:13, 16 May 2018 (UTC)

Dispute in Embassy of the United States in Israel

Two days ago, the American embassy in Israel moved from Tel Aviv to Jerusalem, changing the title of the Tel Aviv offices to "Embassy of the United States of America: Branch Office". One user insists on having two separate items about the Tel Aviv location – one about the embassy (1966–2018), and another about the branch office (since 2018). I argue we should have it as a single item (Embassy of the United States, Tel Aviv (Q2897374)). Please leave a comment on Talk:Q2897374. --Triggerhippie4 (talk) 17:09, 16 May 2018 (UTC)

Clarify elements items

Notified participants of WikiProject Physics

IMO each element (hydrogen, helium...) has two meanings:

chemical element (Q11344): atoms with the same atomic number;
simple substance (Q2512777): any substance that is composed exclusively by atoms with the same atomic number.

Currently, hydrogen (Q556) - derived from the English Wikipedia article about hydrogen (which talks about both meanings) - seems to refer to the first meaning only. hydrogen atom (Q6643508) - derived from the Wikipedia article about the hydrogen atom - refers to the same meaning. The same is true for other elements.--Malore (talk) 20:24, 12 May 2018 (UTC)

@Malore: Wrong, chlorine (Q688) is a chemical element (Q11344) and dichlorine (Q1904422) is a simple substance (Q2512777). And please provide the definition of substance if you want a successful discussion. Snipre (talk) 20:41, 12 May 2018 (UTC)

And for me, Wikipedia article about hydrogen atom is not about the hydrogen atom but about the mathematical modeling of the hydrogen atom. Snipre (talk) 20:43, 12 May 2018 (UTC)

@Snipre: I'm pretty sure about the two meanings - "element atom" and "simple substance" - of what we commonly call an element, because it's confirmed by IUPAC.

As regards the definition of substance, it should be matter with specific chemical composition and properties. It can be a chemical compound or a simple substance.

I don't have a definite answer about the relation between chlorine and dichlorine because, before allotropes were discovered, it was thought an element could form only one simple substance. The options are:

allotropes are subclasses of simple substances;
allotropes are different simple substances and the chemical element is simply one of them (the first known and associated with that element).

In the first case:

oxygen understood as a simple substance includes both dioxygen and ozone;
chlorine as a simple substance includes dichlorine, its only stable allotrope.

In the second case:

dioxygen and ozone are two different simple substances and what we commonly call oxygen is dioxygen;
dichlorine is the only stable allotrope of chlorine and the two terms indicate the same thing.

Since isomers are considered different compounds, it would be more consistent to consider allotropes as different simple substances.--Malore (talk) 19:36, 13 May 2018 (UTC)

We have dihydrogen (Q3027893) as an instance of simple substance (Q2512777) of hydrogen (Q556). chemical element (Q11344) and simple substance (Q2512777) are clearly different concepts, so we should separate them.
An atom refers to something like one particle, in general, but a chemical element doesn't. If a chemical element referred to one particle, it couldn't have physical properties such as phase, melting point, or density. By the way, is an isotope a subclass of chemical element (Q11344) or of atom (Q9121)? Are protium (Q15406064) and deuterium (Q102296) really subclasses of hydrogen (Q556)? Aren't they subclasses of hydrogen atom (Q6643508)? --Okkn (talk) 08:03, 13 May 2018 (UTC)

@Okkn: I agree with you that chemical element (Q11344) and simple substance (Q2512777) are two different concepts. I think that chemical element can refer to both the simple substance and the atom, or to the atom only (as IUPAC confirms). In the former case, it clearly has phase, melting point, or density, while in the latter case it doesn't.

As regards isotopes, they refer to atoms. Maybe, they could refer also to substances composed only of a specific isotope (isotopically pure elements), but I don't know if such a usage exists.--Malore (talk) 13:15, 13 May 2018 (UTC)

@Malore: I know the IUPAC definitions, but the problem is that WD can't support two different concepts or definitions for the same item, because this creates problems for the WD classification, constraint system and use. Currently we have one item for each simple substance of each element. And simple substance is a subclass of chemical substance.

diamond (Q5283) is an instance of simple substance (Q2512777)

graphite (Q5309) is an instance of simple substance (Q2512777)

dihydrogen (Q3027893) is an instance of simple substance (Q2512777)

simple substance (Q2512777) is a subclass of chemical substance (Q79529)

The relation between a chemical element and the corresponding simple substance is expressed in two ways:

dioxygen (Q5203615) has part(s) (P527) oxygen (Q629)

or dioxygen (Q5203615) is subclass of allotrope of oxygen (Q428653)

So there is no need to consider that a chemical element in WD is described by "A pure chemical substance composed of atoms with the same number of protons in the atomic nucleus". We just have to use for chemical element the IUPAC definition "A species of atoms; all atoms with the same number of protons in the atomic nucleus." And that definition can't be applied to a single atom. Snipre (talk) 11:39, 14 May 2018 (UTC)
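The instance-of and subclass-of statements listed in this comment imply, by transitivity, that diamond, graphite and dihydrogen are all chemical substances, while none of them is a chemical element. A minimal Python sketch of that reasoning follows; the dictionaries are a toy in-memory model built only from the statements quoted above, not a dump of live Wikidata, and the function names are illustrative.

```python
from collections import deque

# Statements copied from the comment above; labels are kept next to the
# QIDs purely for readability.
INSTANCE_OF = {  # instance of (P31)
    "diamond (Q5283)": {"simple substance (Q2512777)"},
    "graphite (Q5309)": {"simple substance (Q2512777)"},
    "dihydrogen (Q3027893)": {"simple substance (Q2512777)"},
}
SUBCLASS_OF = {  # subclass of (P279)
    "simple substance (Q2512777)": {"chemical substance (Q79529)"},
}


def superclasses(cls):
    """Return cls plus every class reachable through subclass-of links."""
    seen = {cls}
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def is_instance_of(item, cls):
    """True if item is an instance of cls, directly or via subclass-of."""
    return any(cls in superclasses(direct) for direct in INSTANCE_OF.get(item, ()))


print(is_instance_of("diamond (Q5283)", "chemical substance (Q79529)"))
print(is_instance_of("diamond (Q5283)", "chemical element (Q11344)"))
```

In this toy model the first check succeeds and the second fails, which mirrors the point that element items and simple-substance items sit in different branches of the classification.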

@Snipre: Ok, so allotropes are different simple substances. What I'm trying to do is to have different items for different meanings, while currently element items cover different concepts. If we consider an element as "A species of atoms; all atoms with the same number of protons in the atomic nucleus." it can't have properties like density, boiling point, melting point...

Wouldn't it be better to use

⟨ dioxygen (Q5203615) ⟩ has part(s) of the class (P2670) ⟨ oxygen (Q629) ⟩

than

⟨ dioxygen (Q5203615) ⟩ has part(s) (P527) ⟨ oxygen (Q629) ⟩

?--Malore (talk) 13:57, 14 May 2018 (UTC)

Yes, using an axiom-style approach works for me: it intuitively separates the concept of an atom from that of an element. I didn't know about has part(s) of the class (P2670), thanks... --Egon Willighagen (talk) 14:46, 14 May 2018 (UTC)

Notified participants of WikiProject Chemistry

@Malore: This and related ontological issues for physics and chemistry have been extensively discussed before here, though without entirely coming to a satisfactory conclusion. The central problem I believe is that the definitions commonly used by scientists (for example by IUPAC) are not ontologically pure or clear - "chemical compound" is a bad case, which is defined as much by what it is not (i.e. it is not a simple substance) as by what it is. As you note, "chemical element" seems to encompass two contrasting definitions. Last year I made an attempt to rationalize some of the basic items for elements and isotopes etc. and we made a bit of progress. Since wikidata items for physics and chemistry are (almost?) all classes and not individual physical objects, some rather deep thinking is required to understand what the relations are actually implying - there is a lot of what might be thought of as overlap in meaning. For example, graphite is the class of arrangements of carbon atoms in a certain structure. If the item for carbon as an element is considered to be the class of all carbon atoms, then a particular physical piece of graphite would be a subclass of carbon (since it contains a subset of the carbon atoms of the universe, specifically those at a particular physical location at that time etc). But if the item for carbon as an element also includes all simple substances made of carbon, then an individual piece of graphite is also an instance of the class. So should graphite (as a class) be a subclass of carbon, or not? It gets quite confusing. Maybe we should duplicate all our chemical element entries to make this separation more explicit in wikidata, even though it differs from what IUPAC does? I would note that the ChEBI ontology seems to make this distinction, with for example "carbon atom" (CHEBI:27594) vs "elemental carbon" (CHEBI:33415). ArthurPSmith (talk) 13:32, 14 May 2018 (UTC)

@ArthurPSmith: My idea is exactly to duplicate all chemical elements similarly to what ChEBI ontology does. Actually, some chemical elements are already separated in element atom and element substance (for example, carbon (Q623) and carbon atom (Q47001846)).

As regards IUPAC, it has only one entity for both meanings but it distinguishes the two concepts.--Malore (talk) 14:09, 14 May 2018 (UTC)

@ArthurPSmith: Your work on classification is a very good summary, but one point is missing: the definitions of the different concepts. Creating links between items makes no sense without definitions: otherwise the definitions are derived from the links, so you need to handle a complete network of relations to understand one single definition. If we first fix the definitions, then the links will be obvious.

Just an example. If I take your proposition and the IUPAC definitions, we have a problem:

IUPAC definition for chemical element: "A species of atoms; all atoms with the same number of protons in the atomic nucleus."
IUPAC definition for atom:"Smallest particle still characterizing a chemical element. It consists of a nucleus of a positive charge (Z is the proton number and e the elementary charge) carrying almost all its mass (more than 99.9%) and Z electrons determining its size."

Then the following relations are wrong

⟨ neon (Q654)    ⟩ instance of (P31) ⟨ chemical element (Q11344)    ⟩
⟨ neon (Q654)    ⟩ subclass of (P279) ⟨ atom (Q9121)    ⟩

How can neon, a species of atoms, be a subclass of atom, a particle? To be correct, atom should be defined as a species of particles. This is possible, definitions can be adjusted, but in the end all definitions should be coherent with each other.

The main difficulty we have in working together, especially when defining relations, is the lack of common definitions. Snipre (talk) 20:13, 14 May 2018 (UTC)

So I think there is a further problem - it's more than just 2 different definitions involved here. If you look for example at hydrogen atom (Q6643508) it is (mostly, at least the enwiki version) about the isolated hydrogen atom. I.e. a proton (or other hydrogen nucleus) and an electron in a vacuum, what the energy levels and quantum states are, etc. So we have the concepts of "isolated hydrogen atom", "hydrogen as a substance" and "hydrogen atoms in no matter what context" (i.e. isolated, in substances, or in compounds with other elements). It's not just a problem for the elements - every chemical (whether simple or compound) has a dual or perhaps multiple meanings - the individual molecule, the molecule in a group of other identical molecules, the molecule in other contexts such as in aqueous solution, adsorbed on a surface, intercalated in a solid, ionized and combined into a salt, etc. I think the wikidata item for a chemical (including for an element) should default to meaning the widest possible meaning, i.e. the molecule or atom in any of these possible contexts. I believe that's the meaning the ChEBI ontology uses, with more specific entities created when needed. But that does exclude the "simple substance" meaning from IUPAC, as that's a specific context and instances of such a class would contain large numbers of physical atoms, not individual atoms in certain contexts. I am wondering if the "allotrope of" items can serve the purpose of meeting the "simple substance" definition of an element, without having to create new items for everything (although we only have a few "allotrope of" items right now). ArthurPSmith (talk) 12:32, 15 May 2018 (UTC)

I have a strong preference to keep them separate. Many elements have more than one simple substance, like carbon, which has graphite, diamond, buckyballs, etc. Each simple substance has different physicochemical properties. By keeping these concepts as different items, we can correctly link the properties to their matching substance. --Egon Willighagen (talk) 14:35, 14 May 2018 (UTC)

Could I ask someone to clarify what the separation would look like in the case of, let's say, sulfur? How many items would there be, and in which of them should properties like mass, element symbol, atomic number, electronegativity and oxidation state be placed? What would be the relation between these items after separation and the items about allotropes of sulfur and isotopes of sulfur? Wostr (talk) 16:24, 14 May 2018 (UTC)
- So based on my comment above (12:32 15 May 2018) I believe sulfur (Q682) should be the class of all sulfur atoms (or ions) in whatever context; allotrope of sulphur (Q1094078) is then the class of all simple substances made up of sulfur. The class of isolated sulfur atoms, or sulfur atoms in other contexts would be subclasses of sulfur (Q682). hexasulfur (Q5748936) would be an instance of allotrope of sulphur (Q1094078). Some things are not clear to me - for example sulfide (Q221205), does that count as an allotrope? The physical properties of the substance should be statements on the items for the simple substances, not on the item for the element. There should probably be a way to link from an element to its standard simple substance form at STP (or if it has multiple stable forms like carbon, to all of them). I think we should add items for allotropes of all the elements and their STP simple substance forms (if they don't already exist) and these relationships, to clarify this. Does this make sense to everybody? ArthurPSmith (talk) 12:48, 15 May 2018 (UTC)

@ArthurPSmith: If allotrope (Q21198401) is a subclass of simple substance (Q2512777), which is a subclass of chemical substance (Q79529), then all instances of allotrope (Q21198401) must follow the definition of chemical substance (Q79529). And the definition of chemical substance (Q79529) says: "Matter of constant composition best characterized by the entities (molecules, formula units, atoms) it is composed of. Physical properties such as density, refractive index, electric conductivity, melting point etc. characterize the chemical substance", so as it is not possible to measure the density or melting point of ions (we measure only the density/melting point of a salt), then sulfide (Q221205) is not an instance of chemical substance (Q79529), and therefore not an instance of allotrope (Q21198401). QED. Snipre (talk) 14:22, 15 May 2018 (UTC)

By the way about your proposition, do we really need the classification level "allotrope of X" ? Why can't we simplify the structure by defining

* hexasulfur (Q5748936) is instance of allotrope (Q21198401)

* hexasulfur (Q5748936) has part sulfur (Q682)

Retrieving all allotropes of sulfur then requires looking at all items with instance of allotrope (Q21198401) AND has part sulfur (Q682).

If we really want to keep "allotrope of X" then we have to create, for symmetry, the corresponding for all chemical elements even if only one simple substance exists for one element. I just fear the day when someone will create the item "radioactive allotrope of X having a half life greater than 1 minute discovered in USA in twentieth century" Snipre (talk) 14:30, 15 May 2018 (UTC)
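The retrieval described in this comment (instance of allotrope AND has part sulfur) maps naturally onto a SPARQL query for the Wikidata Query Service. The Python sketch below only builds the query text; it does not contact any server. It assumes the modelling proposed in the comment, so on live data it may return few or no results if items are modelled differently. The function name `allotropes_of_query` is hypothetical.

```python
import re

ALLOTROPE = "Q21198401"  # allotrope, as cited in the thread


def allotropes_of_query(element_qid: str) -> str:
    """Build a SPARQL query for items that are instances of allotrope
    and have the given element as a part (proposed modelling only)."""
    if not re.fullmatch(r"Q[1-9][0-9]*", element_qid):
        raise ValueError(f"not a valid item ID: {element_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{ALLOTROPE} ;\n"      # instance of (P31) allotrope
        f"        wdt:P527 wd:{element_qid} .\n"   # has part(s) (P527) the element
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Sulfur (Q682), the example used in the discussion.
print(allotropes_of_query("Q682"))
```

Under the alternative modelling with a dedicated "allotrope of X" class, the two triple patterns would collapse into a single instance-of (or subclass-of) pattern against that class, which is the trade-off being weighed here.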

That could work. However, now I'm a bit confused again - are hexasulfur (Q5748936) and octasulfur (Q7076759) items about the molecules (in any context) or about the substances under normal conditions (orange or yellow solid respectively)? If we decide these are allotropes, that would need to mean the items refer to the substances and not the molecules? ArthurPSmith (talk) 14:41, 15 May 2018 (UTC)

Ooh, and then there's clinosulphur (Q4024650) (clearly a substance - common solid crystal form of octasulfur (Q7076759)) - and also alpha, gamma, lambda and other forms for which we don't seem to have wikidata items. I'm tending to think octasulfur (Q7076759) is therefore NOT an allotrope, rather clinosulphur (Q4024650) and friends would be. So what exactly is octasulfur (Q7076759)? Do we have terminology for it (molecule made purely from one element)? ArthurPSmith (talk) 14:49, 15 May 2018 (UTC)

@ArthurPSmith: octasulfur (Q7076759) is a subclass of allotrope (Q21198401) (or of allotrope of sulphur (Q1094078) if we want to keep that classification) and clinosulphur (Q4024650) is an instance of octasulfur (Q7076759).

And hexasulfur (Q5748936) and octasulfur (Q7076759) are about the substance, as they are an instance or subclass of chemical substance through allotrope and simple substance. If we wanted to link them to molecule, then there should be a link to the molecule item somewhere higher up in the classification, but this is not the case. Snipre (talk) 18:52, 15 May 2018 (UTC)

@Snipre: Ok, yes, I agree that is how they are currently defined in wikidata. And the enwiki article on octasulfur (Q7076759) is almost entirely about the substance, not the molecule. However, it does display 3 models of the isolated molecule at the top of the infobox; also I think the ChEBI identifier (and possibly some of the others) should be linked to the molecule, not the substance. It does seem like the molecule deserves an item of its own, at least to tie to the molecular models and associated identifiers; and separate items for each allotrope of the substance as well I guess. So we would go from 2 items to around 5? I'd like to hear opinions from others on this... ArthurPSmith (talk) 19:10, 15 May 2018 (UTC)

Do we really need to distinguish the atom from the substance ? If we do that for chemical element, we will have to apply the same distinction to chemicals, once as substance and once as molecule. If we really want to do so I think that we should create a dedicated wikidata for chemicals because the classification/ontology starts to be so complex that few contributors will handle it. Snipre (talk) 03:50, 17 May 2018 (UTC)

@Snipre: I don't think it would be so difficult if we point out in the description that we are talking about the atom or the substance. IMO, having a more precise ontology can actually help to sort out someone's doubts in the same way chEBI and IUPAC ontologies do. However, I think the current state is simply wrong because Wikidata shouldn't have an item that describes two different concepts with properties that applies only to one of them. – The preceding unsigned comment was added by Malore (talk • contribs) at 12:58, May 17, 2018‎ (UTC).

@Snipre: Actually, I believe it would be best to (by default) follow ChEBI - the wikidata item refers to the atom or molecule (in all contexts), as that is the more fundamental entity. While we might think the standard form of a chemical at STP is the standard "substance" it should represent, even for some elements there are multiple such forms (diamond and graphite and all the carbon nanotubes, buckyballs etc, and the sulfur examples we just saw). And even for molecules, there can be multiple stable crystal forms. So if we *need* an item for the substance, create it, otherwise just stick with an item for the fundamental unit. ArthurPSmith (talk) 17:12, 17 May 2018 (UTC)

type of business entity (Q1269299) and list of legal entity types by country (Q53400657)

Actually, type of business entity (Q1269299) has a lot of interlanguage links (but not an English Wikipedia one) and is not considered a Wikimedia list article, while list of legal entity types by country (Q53400657) has only the English Wikipedia interlanguage link and is considered a Wikimedia list article. Is this correct?--Malore (talk) 16:48, 14 May 2018 (UTC)

Yes, it seems correct to me. The first item is the concept itself and the second item is about a list of example of the concept. Cdlt, VIGNERON (talk) 10:13, 15 May 2018 (UTC)

@VIGNERON: But the English Wikipedia article linked to list of legal entity types by country (Q53400657) and the articles linked to type of business entity (Q1269299) talk about the same thing, types of business entities.--Malore (talk) 12:17, 15 May 2018 (UTC)

Malore mhhh, I'm sorry, I'm not sure I get what you want to say. I see Q1269299 and Q53400657, which is a list of Q1269299. This is perfectly normal; we have hundreds of such pairs of items, for example Roman bridge (Q1223230) and list of Roman bridges (Q1862851). To me, this is clearly not the same thing. Cdlt, VIGNERON (talk) 13:20, 15 May 2018 (UTC)

@VIGNERON: Actually, there isn't a "business entity" item. What I'm trying to say is that the Wikipedia pages linked to type of business entity (Q1269299) seem to cover the same concept as the Wikipedia page linked to list of legal entity types by country (Q53400657). Since I don't know the topic well, I don't know what to do.--Malore (talk) 15:38, 15 May 2018 (UTC)

(ec) We have business (Q4830453), which is likely to be the ultimate parent class for most companies or businesses. The instance of (P31) statements on type of business entity (Q1269299) look very wrong (at least in their current English translations) -- an item should almost never be instance of (P31) abstract noun (Q2712963), and saying something is "an abstract noun of jurisprudence" makes very little sense. An alternative identification for type of business entity (Q1269299) might be instance of (P31) classification scheme (Q5962346) 'of' business (Q4830453), and subclass of (P279) business (Q4830453). Jheald (talk) 15:58, 15 May 2018 (UTC)

Ok, now I see. But is there a real difference between "type of business entity" and "business entity"? Q1269299 uses the first as label and the second as alias; maybe we can switch them to make it clearer (or even change to "legal form", which is another alias but also the label of the corresponding property legal form (P1454)). I'm not a specialist of this subject either, can someone else pitch in? Cdlt, VIGNERON (talk) 15:51, 15 May 2018 (UTC)

Since I can't find a WikiProject Law, I ping WikiProject Companies so maybe they could help us solve this issue

Notified participants of WikiProject Companies--Malore (talk) 17:08, 17 May 2018 (UTC)

Quote P1683

When using the quote feature in reference are we supposed to use quotation marks or not? --RAN (talk) 17:59, 16 May 2018 (UTC)

@Richard Arthur Norton (1958- ): no. If someone really needs quotation marks, they can be added afterwards (in the style of the language and with the preferred style: "xxx" or “xxx”, or « xxx » or » xxx « or „xxx“ etc.). Cdlt, VIGNERON (talk) 07:18, 17 May 2018 (UTC)

Awards with warning about single value constraint

Skylar Diggins-Smith (Q2113195) won the Nancy Lieberman Award (Q3870098) in 2012 and 2013. When I add both dates as a point in time qualifier, the warning appears "single value constraint This property is generally expected to contain only a single value." The help link suggests that one should "Define a separator for the property". I don't know what that means, and I note the page includes a warning, which I don't understand.

I'm illustrating this with a specific example, but I've run into this multiple times today.--Sphilbrick (talk) 18:54, 16 May 2018 (UTC)

You should add two separate statements, one for each year they were awarded. See Bo Carpelan (Q318198) for an example. He has won Finlandia Award (Q50432647) in 1993 and 2005, therefore Q50432647 is listed twice. Shinnin (talk) 19:02, 16 May 2018 (UTC)

OK, thanks.--Sphilbrick (talk) 20:28, 16 May 2018 (UTC)

@Sphilbrick, Shinnin: I’ve updated the help page to include this case, and tried to clarify the “separators” instructions as well. --Lucas Werkmeister (WMDE) (talk) 11:07, 17 May 2018 (UTC)
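The advice above (one statement per year rather than one statement carrying two dates) can be illustrated with a minimal sketch. It models statements as plain Python dictionaries and does not use any real Wikidata API; the function name and the default property IDs (P166 for the award property, P585 for the point in time qualifier) are assumptions made for this illustration.

```python
def award_statements(award, years, award_prop="P166", time_prop="P585"):
    """Return one statement per year, each with a single point-in-time qualifier.

    award_prop and time_prop are assumed property IDs used for illustration.
    """
    return [
        {"property": award_prop, "value": award, "qualifiers": {time_prop: [str(year)]}}
        for year in years
    ]


# Nancy Lieberman Award (Q3870098), won in 2012 and 2013:
statements = award_statements("Q3870098", [2012, 2013])
for statement in statements:
    print(statement)
```

Each resulting statement has exactly one date qualifier, which is the shape shown on Bo Carpelan (Q318198).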

Adding value <somevalue> with QuickStatements ?

Does anyone know if it is possible to set a value <somevalue> for a statement with Quick Statements ?

There's no indication of how to do it at Help:QuickStatements, but sometimes Magnus moves faster than the speed of documentation.

Alternatively, as a workaround, is there (or could there be) an approved "placeholder item" Q-number, that could then be turned into <somevalue> by a bot? Jheald (talk) 15:41, 15 May 2018 (UTC)

I have created placeholder for "somevalue" (Q53569537) Jheald (talk) 17:55, 15 May 2018 (UTC)

I have set up a bot to replace placeholder for "somevalue" (Q53569537) with <somevalue>. --Pasleim (talk) 21:03, 15 May 2018 (UTC)

Thanks!!! Jheald (talk) 05:14, 16 May 2018 (UTC)

Now it's documented on Help:QuickStatements. --Marsupium (talk) 10:24, 18 May 2018 (UTC)
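As a rough sketch of the workaround described in this thread, the helper below builds a tab-separated QuickStatements-style line and substitutes the placeholder item Q53569537 when no concrete value is known, leaving the conversion to <somevalue> to the bot mentioned above. The function name and the example item and property IDs are hypothetical; check Help:QuickStatements for the exact input syntax.

```python
# Placeholder item from this discussion; a bot later replaces it with <somevalue>.
SOMEVALUE_PLACEHOLDER = "Q53569537"


def quickstatements_line(item, prop, value=None):
    """Build a tab-separated item/property/value line.

    When value is None, the placeholder item is used instead.
    """
    target = SOMEVALUE_PLACEHOLDER if value is None else value
    return "\t".join([item, prop, target])


# Example IDs are illustrative only.
print(quickstatements_line("Q4115189", "P50"))
print(quickstatements_line("Q4115189", "P50", "Q42"))
```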

Is there a 'short history of Wikidata' available?

Hi all

I want to provide a few-line summary of the history of Wikidata for something: things like the growth in number of items, where Wikidata is being used in and outside Wikimedia projects, number of regular contributors, percentage of statements that are unsourced or are sourced with Wikipedia etc. Is there something like this already available somewhere? The Wikipedia article doesn't really provide this.

Thanks

--John Cummings (talk) 13:37, 17 May 2018 (UTC)

Regarding growth, you will find quite an exhaustive list on WD:News. You will also find there information about newly supported sister projects and some new features. Matěj Suchánek (talk) 17:52, 17 May 2018 (UTC)

en:Wikidata, but it's incomplete. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:00, 17 May 2018 (UTC)

thanks @Matěj Suchánek: and @Pigsonthewing:, the Wikipedia article is surprisingly not that great.... I'll add it to my list of things to try and collect some info on. --John Cummings (talk) 08:35, 18 May 2018 (UTC)

The dates when new datatypes were available at Wikidata can be found at Wikidata:WikiProject_Properties/Reports/Datatypes.
--- Jura 08:44, 18 May 2018 (UTC)

Help

Hey people, I have something to show you all that is bugging me. See Vicente de Carvalho (Q45224262). It is a monument in São Paulo city; under the property made from material (P186) there are two duplicates, bronze and granite, one with applies to part and the other without. What is happening? I thought that it was a Quick Statements thing, but I tested, and it happens manually too. Ederporto (talk) 18:00, 17 May 2018 (UTC)

@Ederporto: The two claims which do not have applies to part were added in this merge. I'm not sure what the problem you're seeing is ... delete the two redundant claims and all is good? --Tagishsimon (talk) 18:35, 17 May 2018 (UTC)

@Tagishsimon: The problem is that this shouldn't happen, neither with Merge nor manually, right? I don't remember this being allowed, and I could have been wrong, had you not suggested that I delete the duplicates (so I'm assuming you agree this shouldn't happen). This is only one item (I've created hundreds); how many are there that have duplicate statements? It is easier to not allow Merge or manual edits to create duplicates than to have to delete them afterwards, right? Ederporto (talk) 18:45, 17 May 2018 (UTC)

@Ederporto: A material used, which has a qualifier, is not the same as a material used which does not have a qualifier. So the merge is fine. I agree that the end result is wrong and two things should be deleted in this case. But the general principle that there is a distinction between statement with qualifier and statement without seems sound to me. It does follow that when editing blind, through quickstatements, you need to know whether you're going to leave the record in the state this one is in, with unforeseen near duplicate information; just as after a merge it's wise to check the result and delete different renderings of the same basic information. I'm a tiny bit more surprised by this edit, which does introduce a duplicate; possibly there should be a constraint added to the property to highlight that as an issue. --Tagishsimon (talk) 19:29, 17 May 2018 (UTC)

@Tagishsimon: Quick Statements doesn't add duplicate statements; it shows something like "This statement already exists" and doesn't add the duplicate (if the Quick Statements sequence has a qualifier, it adds it to the already existing statement), so Quick Statements doesn't make that mistake. Why is an error (or something that causes a wrong end result) allowed both by merge and by adding manually (like my edit you linked, where I was testing if it would make the same mistake)? Ederporto (talk) 19:38, 17 May 2018 (UTC)

@Tagishsimon: Also, I think the majority of the properties don't allow duplicate values, right? This should be helpful in position held (P39) for politicians, but it's the only example where I think it works. Ederporto (talk) 19:42, 17 May 2018 (UTC)

@Ederporto: I don't know the ratio of properties that do & do not allow duplicate values. I guess most of the external ID properties have such a constraint ~~- distinct-values constraint (Q21502410) presumably~~. P39 allows duplicates, AFAIK, but has a mandatory constraint of start-time. I don't think merge did anything wrong in this instance, although I grant that if what you say about QS is true, then merge is behaving differently to QS. ~~Nor do I know if Q21502410 - presuming that is a 'no-duplicates' constraint - factors in qualifiers.~~ We'd expect duplicated materials used, with differing qualifiers, for many objects; so a no-duplicates constraint which ignored qualifiers would not help. --Tagishsimon (talk) 20:15, 17 May 2018 (UTC)

Meanwhile, Help:Property constraints portal. --Tagishsimon (talk) 21:02, 17 May 2018 (UTC)

Wikidata:Requests for comment/Improving Wikidata documentation for different types of user

Dear all

Over the past year or so I've been working quite a lot on Wikidata documentation and have been thinking more about how these individual documents link together to allow different kinds of people to understand and contribute to Wikidata. To help with the community's understanding and planning of what resources are needed I've started an RFC to try and collate this information together.

Wikidata:Requests for comment/Improving Wikidata documentation for different types of user

Thanks

John Cummings (talk) 12:40, 18 May 2018 (UTC)

Notified participants of WikiProject Documentation

"Disallowed qualifiers" constraint?

Would there be value to a negative of the "allowed qualifiers" constraint? That is, a list of qualifiers that should not appear with a given property? I'm thinking of properties like has part(s) (P527), which currently uses allowed qualifiers constraint (Q21510851) – the number of qualifiers that could conceivably make sense with the property is extremely large, and adding them one at a time is not efficient. Would it make more sense to track the qualifiers that should not be used instead? Swpb (talk) 15:31, 17 May 2018 (UTC)

@Swpb: A new constraint was recently created (Help:Property constraints portal/None of). In short, it seems to work as a blacklist. So I suppose a combination of "None of" and constraint scope (P4680) might work the way you have in mind (a qualifier blacklist). --Was a bee (talk) 17:19, 17 May 2018 (UTC)

@Was a bee: How would that work? has part(s) (P527) → property constraint (P2302) → none-of constraint (Q52558054), with constraint scope (P4680) → constraint checked on qualifiers (Q46466783), and property (P2306) → [qualifier(s) to be avoided]? That doesn't really look like what none-of constraint (Q52558054) was made for, and it seems very awkward. Why not have a separate constraint? Swpb (talk) 19:31, 17 May 2018 (UTC)

@Swpb: Um.. Yes. The "None of" constraint seems to be applied only to values, not properties. Although {{Complex constraint}} is available, basically this requires requesting a new constraint. --Was a bee (talk) 06:39, 19 May 2018 (UTC)

Wouldn't the exclusion list be the same for many properties? Maybe a constraint that lists the properties a qualifier can be used with would work better.
--- Jura 06:45, 19 May 2018 (UTC)

Can replaces (P1365) and replaced by (P1366) be used for software versions?

I noticed that in such cases follows (P155) and followed by (P156) are mostly used, but wouldn't it be more appropriate to use replaces (P1365) and replaced by (P1366) instead?--Malore (talk) 21:14, 18 May 2018 (UTC)

At least with open source software, previous versions don't usually disappear, but can still be downloaded or extracted from repositories. I think follows (P155)/followed by (P156) is good. It's similar to other kinds of publications, such as book editions. Ghouston (talk) 05:07, 19 May 2018 (UTC)

Harvest ethnic group Baltic German - estimated 1600 new statements

Only 16 items have property:P172 (ethnic group) = Baltic German (Q157139). Such information could be imported from

It should only be applied to instances of Q5 (human). 213.39.185.159 09:05, 16 May 2018 (UTC)

Ethnic group shouldn't be imported and be only supported by trustable sources. Sjoerd de Bruin (talk) 09:45, 16 May 2018 (UTC)

Where is the constraint violation to check that you are right? 213.39.185.159 09:57, 16 May 2018 (UTC)

It's clearly present in the description and talk page. Sjoerd de Bruin (talk) 16:46, 19 May 2018 (UTC)

Epistemology

I'd like to get some help resolving Talk:Q336#What_is_science_(Q336)?. This is (to oversimplify) basically a decades-long German–English translation problem. Here are a few salient facts:

These three Wikipedia articles exist:

This much is generally agreed for the "English" concept:

Activity
- Knowledge
  - Science
    - Natural sciences (e.g., physics)
    - Formal sciences (e.g., mathematics)
    - Social sciences (e.g., sociology or economics)
    - (but NOT humanities, fine art, theology, etc.)

This much is also generally agreed for the "German" concept:

Activity
- Knowledge
  - Wissenschaft
    - Natural sciences (e.g., physics)
    - Formal sciences (e.g., mathematics)
    - Social sciences (e.g., sociology or economics)
    - AND Humanities (e.g., literature or theology)
    - AND Art

So it's obvious from the list that Science ≠ Wissenschaft, but the common lay person "translation" from German "Wissenschaft" into English is "science". (Experts who are writing in English and who need to be precise more usually translate Wissenschaft as something like "knowledge-generating activity" or "systematic knowledge", which leaves less doubt about whether their statements encompass literature and theology as well as physics and chemistry.)

I'd like to have the two Wissenschaft articles directly connected (both ways) with interlanguage links, but since most editors aren't experts in epistemology, I think that it might be disruptive at the Wikipedias. On the other hand, stuffing the broader concept of Wissenschaft into the narrower, English-oriented item is objectively wrong, which is bad for Wikidata.

What do you advise? WhatamIdoing (talk) 18:58, 17 May 2018 (UTC)

Not sure if it's worth saying, as it's a fairly general point: I don't see a problem with having two different items labeled the same in English. Stuffing partially related sitelinks onto the same item isn't really one of Wikidata's objectives. It can be done with local interwikis where desired.
--- Jura 19:52, 17 May 2018 (UTC)
- So if I've understood this correctly, you would recommend
  1. removing the site link to w:de:Wissenschaft from Q336,
  2. adding a site link to w:de:Wissenschaft into Q8027727, and
  3. locally overriding the removal of the sitelink to the article from Q336 by manually adding it back in to the Wikipedia article? WhatamIdoing (talk) 21:03, 17 May 2018 (UTC)

I would support a split into one item including the study of human society and culture and one item excluding it, but I would rather create a new item than use Wissenschaft (Q8027727); Wissenschaft (Q8027727) is specifically about Wissenschaft in Germany (e.g. German university and education, approaches to translate the term into English), while w:de:Wissenschaft is about the general concept of systematic acquisition and preservation of knowledge. There are other articles that seem to include the study of human society and culture (e.g. w:cs:Věda, w:da:Videnskab, w:la:Scientia_(ratio)) and thus seem to be about a concept closer to the German Wissenschaft than to the English science. Those should be moved to the new item, too. - Valentina.Anitnelav (talk) 09:58, 19 May 2018 (UTC)

How have German followers of logical positivism (Q193627) or Karl Popper (Q81244) and falsifiability (Q220888) talked about these ideas? Would they have Wissenschaft without literature and theology or would they use some other term? Ghouston (talk) 04:40, 18 May 2018 (UTC)
- Ghouston, I believe that they solve this difficulty by talking specifically about the natural science (Q7991), formal science (Q816264), or social science (Q34749) that interests them. From a philosophical POV, there are some very reasonable criticisms of the "English" model of science. Combining experimental work with the processes for naming natural objects and with exercises in pure logic about the universe – but excluding, e.g., the very similar processes for naming cultural objects and exercises in pure logic about people – does not always result in a natural category. So they get more specific when it matters. WhatamIdoing (talk) 15:31, 18 May 2018 (UTC)
As a native German speaker I'd like to clarify one point in the description of "Wissenschaft":
- humanities (literary studies (Q208217), musicology (Q164204), art history (Q50637)) fall under the concept of "Wissenschaft"
- art (Q735) (visual arts (Q36649), music (Q638), literature (Q8242)) does NOT fall under the concept of "Wissenschaft"

Currently art (Q735) is a subclass of science (Q336) (via arts (Q2018526)) which is wrong even from a German perspective. -Valentina.Anitnelav (talk) 09:50, 18 May 2018 (UTC)

Thanks for this comment, Valentina.Anitnelav. Some people classify the arts under the humanities, e.g., "the literary arts". IMO it makes sense to separate "the systematic study of a symphony" from "composing a symphony", but other people do not always agree with me. :-)

We probably need to have ways of communicating multiple, mutually exclusive classification systems (because that's how current scholarship on the subject works). For example, both of these are valid:

Knowledge > Non-science > Humanities > Language > Human languages > Spanish language
Knowledge > Wissenschaft > Humanities > Language > Human languages > Spanish language

But the different items are not the same, because this is also valid:

Knowledge > Non-science > Anti-science > Pseudoscience

while Wissenschaft never includes any type of anti-science. I assume that this is possible somehow, but I don't know how to do that myself. WhatamIdoing (talk) 15:20, 18 May 2018 (UTC)

In this case humanities (Q80083) should rather not be a subclass of Wissenschaft and we would need to identify those subfields that are considered a Wissenschaft (or Geisteswissenschaften (Q944537)) individually. - Valentina.Anitnelav (talk) 10:03, 19 May 2018 (UTC)

New constraint types

Hello all,

Constraints can be added to properties in order to check the possible inconsistencies in the data. They are reported in the item pages, on the related statement, for all logged-in users (that’s the small “!” icon that you see in the statements).

Over the past months, several new constraint types have been integrated to the system. Here’s the list of what you can now use:

“allowed units” (example: length (P2043) should have some unit that can be converted to metre (Q11573); Erdős number (P2021) should not have any unit)
“no bounds” (example: Elo rating (P1087) should not have any bounds (±anything))
“allowed entity types” (example: item for this sense (P5137) should only be used on lexeme entities)
“none of” (example: website account on (P553) should not contain any of the values for which a dedicated property exists (e. g. Wikimedia username (P4174)))
“single best value” (example: population (P1082) may have a list of values at different points in time, but should only have a single current value)

And here is what will come in the next weeks:

“integer” (example: number of children (P1971) should not contain fractional values)
“separators” parameter for “single value” and “single best value” (example: an item may have more than one sex or gender (P21) statement as long as the statements are separated by different start time (P580) or end time (P582) qualifiers)
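The “separators” idea in the last item can be pictured with a small sketch. This is only an illustration of the rule as described in this announcement, not the actual constraint-checking code: statements are modelled as plain dictionaries, the function name is hypothetical, and statements are treated as conflicting only when they agree on every separator qualifier.

```python
from collections import defaultdict


def single_value_violations(statements, separators=()):
    """Group statements by their separator qualifier values.

    Any group holding more than one statement is reported as a violation.
    With no separators, all statements fall into a single group.
    """
    groups = defaultdict(list)
    for statement in statements:
        qualifiers = statement.get("qualifiers", {})
        key = tuple(qualifiers.get(separator) for separator in separators)
        groups[key].append(statement)
    return [group for group in groups.values() if len(group) > 1]


# Two sex or gender (P21) statements; the values and dates are made up.
p21_statements = [
    {"value": "value A", "qualifiers": {"P582": "2010"}},
    {"value": "value B", "qualifiers": {"P580": "2010"}},
]

print(single_value_violations(p21_statements))
print(single_value_violations(p21_statements, separators=("P580", "P582")))
```

Without separators the two statements are flagged together; with start time (P580) and end time (P582) as separators they are kept apart and nothing is flagged.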

Did you know that you can search for constraint definitions (though not yet for individual constraint violations) with the query service? For example, this query finds all properties with no-bounds constraint (Q51723761), and this one lists constraints that have separator (P4155) parameters.
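A query of the first kind might look roughly like the one built below. This is a sketch and not necessarily the linked query: it assumes that constraint definitions are stored as property constraint (P2302) statements on the property and that the query service's standard `wd:`, `p:` and `ps:` prefixes are available; the Python helper name is hypothetical.

```python
def constraint_query(constraint_type_qid):
    """Build a SPARQL query listing properties that carry the given constraint type."""
    return (
        "SELECT DISTINCT ?property WHERE {\n"
        f"  ?property p:P2302/ps:P2302 wd:{constraint_type_qid} .\n"
        "}"
    )


# Properties with a no-bounds constraint (Q51723761):
print(constraint_query("Q51723761"))
```

The resulting text can be pasted into the query service.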

For more information, you can check the documentation portal. You can also join the WikiProject Property Constraints where the constraint system is discussed in more detail. If you’re interested in monitoring the constraints, you can also check this tool that shows the number of constraint violations for each property and constraint type over time.

Cheers, Lea Lacroix (WMDE) (talk) 20:49, 17 May 2018 (UTC)

@Lea Lacroix (WMDE): Thank you. A "can't have the same value as this other property" constraint would be nice. Thierry Caro (talk) 09:29, 18 May 2018 (UTC)

@Thierry Caro: Which properties are you thinking of? --Lucas Werkmeister (WMDE) (talk) 10:50, 18 May 2018 (UTC)

@Lucas Werkmeister (WMDE): This would be useful for properties with subproperties. located in/on physical feature (P706) shouldn't have a given value if used on an item where drainage basin (P4614) has that same value, for instance. Thierry Caro (talk) 10:55, 18 May 2018 (UTC)

@Thierry Caro: I see… but doesn’t that mean that drainage basin (P4614) generally conflicts with located in/on physical feature (P706)? --Lucas Werkmeister (WMDE) (talk) 12:46, 18 May 2018 (UTC)

@Lucas Werkmeister (WMDE): Well, an item with drainage basin (P4614) can still have a located in/on physical feature (P706) statement, if the latter is a mountain or a peninsula for example. The conflicts come from the two having the same value. Thierry Caro (talk) 17:35, 18 May 2018 (UTC)
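The check being proposed here could be sketched as follows. No such constraint type exists in this discussion; the function name, the dictionary layout and the item IDs are all hypothetical, and the sketch only shows the intended rule: flag values that appear under both properties on the same item.

```python
def shared_values(claims, prop_a="P706", prop_b="P4614"):
    """Return the values that appear under both properties of one item."""
    return sorted(set(claims.get(prop_a, [])) & set(claims.get(prop_b, [])))


# Hypothetical item: Q900001 stands for a river basin, Q900002 for a peninsula.
claims = {
    "P706": ["Q900001", "Q900002"],
    "P4614": ["Q900001"],
}

print(shared_values(claims))
```

Here only the basin would be flagged; the peninsula under located in/on physical feature (P706) is left alone, as in the example given above.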

Lazy approach: how about "minimum number of statements on item"?
--- Jura 11:34, 18 May 2018 (UTC)

Was also suggested on Help_talk:Property_constraints_portal#Improvements_for_2018, we will try to discuss this with a bunch of people in Barcelona. Sjoerd de Bruin (talk) 07:48, 19 May 2018 (UTC)

Here is a sample chart for a property: Property_talk:P5138#Number_of_statements.
--- Jura 08:05, 19 May 2018 (UTC)

Completeness of items about US states (and territories)

At Wikidata:Lists/US states, there is now a (somewhat trivial) list of US states.

Completeness of these items is evaluated at Wikidata:Database_reports/Constraint_violations/P5086. When created, many were missing a few key and a couple of random statements. In the meantime, these are fairly complete. Obviously, the inclusion of the item for "United States Minor Outlying Islands" is somehow a nuisance.

If you use "Recoin" on these items, many are only evaluated as "this page provides basic information" or "this page provides a fair amount of information". Marginally relevant (IMHO) "Commons gallery", "sister city", "topic's main template", "exclave of", and "motto text" are frequently listed as missing. (Disclaimer: I checked 10 states before writing this).
Content for some lists at w:Category:Lists of states of the United States is available. Ideally we would probably have a Wikidata query or even list for each of them.
For those interested in identifiers, the identifier count on Wikidata:Lists/US states can be used (column "ids"). The count should probably be similar for all states (currently 25 to 41).

I would be interested in more groups of missing statements, even if they need new properties.
--- Jura 09:58, 19 May 2018 (UTC)

Located in the administrative territorial entity

When I fill in "located in the administrative territorial entity" as a location, is there a bot to fill in "country" automatically? It seems like a task easily automated. --RAN (talk) 17:11, 19 May 2018 (UTC)

change software version identifier (P348) name to "software version name string" and add a "software version" property with item datatype

There are items (like Windows 10 (Q18168774) and Firefox 4 (Q1950119)) that represent software versions. In order to link to them from the software item we have to add statement is subject of (P805) as qualifier of software version identifier (P348). Wouldn't it be better to have two separate properties like author (P50) and author name string (P2093), one for software versions without an item of their own (the current software version identifier (P348)) and one for software versions with an item of their own?--Malore (talk) 17:11, 18 May 2018 (UTC)

Support Software versions should be a separate item, in a similar way to how version, edition or translation (Q3331189) would be treated. Continued use of software version identifier (P348) (but only one statement allowed per item) and a new property equivalent to edition or translation of (P629), or just reuse of edition or translation of (P629) would be my proposal. Dhx1 (talk) 05:02, 19 May 2018 (UTC)
- Would there be much to be gained by having every release of something like Linux kernel (Q14579) as a separate item? There are hundreds of official releases by now, not to mention unofficial versions. Perhaps it would be useful for something, I don't know. Ghouston (talk) 05:24, 19 May 2018 (UTC)
  - My proposal was simply to create a property that could link a software item to an already existing software version item and to rename software version identifier (P348) to "software version name string". I'd like to have an item for every version but I don't know if it's worth it--Malore (talk) 12:04, 19 May 2018 (UTC)
  - @Ghouston, Dhx1: On second thought, I think we need items for at least major and minor versions because the item about the software as a whole should list only major versions, the items about major versions of the software should list minor versions and the items about minor versions should list releases/builds. In the case of Mozilla Firefox (Q698), for example, software version identifier (P348) has both minor versions and releases as values.--Malore (talk) 17:15, 20 May 2018 (UTC)

"Member of" or "Part of"

Harpo Marx (Q317237) is "Part of" the Marx brothers and some of the other siblings are "Member of" the Marx brothers, and one is listed as both. This is seen in other human groups. What is the standard to harmonize on? Is "Member of" reserved for humans and "Part of" for non-humans, or even restricted to inanimate objects? --RAN (talk) 22:11, 19 May 2018 (UTC)

part of (P361) can definitely be used with humans as well (e.g., members of the Beatles), and i think it might even be a little more apt than member of (P463) because that seems to be reserved for 'organizations or clubs', so maybe more for things like companies or non-profits than comedy groups. Husky (talk) 00:34, 20 May 2018 (UTC)

For brothers and sisters, I think we generally use "part of". For bands and other groups, "member of". Obviously, some are both. Wikidata:WikiProject Q5/lists/duos has a list of duos.
--- Jura 08:53, 20 May 2018 (UTC)

Wikidata weekly summary #313

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- Open request for adminship: Addshore, Pintoch
- New request for comments: How to manage software versions, Improving Wikidata documentation for different types of user
- Closed request for comments: Privacy and Living People

Events
- Past: Europeana Tech Conference (including a lot of Wikidata workshops and discussions)
- Past: Wikimedia Hackathon 2018, 18-20 May in Barcelona
  - Check the hashtag #wmhack on Twitter to see what has been worked on regarding Wikidata
  - List of the projects that have been demoed during the showcase
- Next Wikidata IRC office hour: May 29th at 18:00 (UTC+2, Berlin time) on the channel #wikimedia-office

Press, articles, blog posts
- A look back at the first Federated-Wikibase-Workshop
- Martin Poulter gave a TEDxBathUniversity talk about Wikidata
- The Joint Roadmap for Open Science Tools (JROST) has been launched; Wikidata is represented by the Wikimedia Foundation
- "Translating a blog post into structured data" Martin Poulter, the Bodleian Digital Library blog

Other Noteworthy Stuff
- A bunch of new constraint types were recently added
- A Request for comments on improving Wikidata documentation for different types of user
- We have deprecated units used for this property (P2237). Please update any tool which uses this property to use the new API before it gets deleted.
- EditGroups is a new tool that lets you review, discuss and revert entire edit groups made by various tools.
- PictureThis! Is a new tool integrated to WikiShootMe! that allows you to choose an item and upload a picture of it

Did you know?

Development
- Hackathon!
- Work on adding “integer” constraint (phab:T167989)
- Fix a bug expanding the references with a constraint violation (phab:T193669)
- Add a "mis" code language to enable uncoded languages in Wikibase Lexeme (phab:T194754)
- Improve the different language fields in the interface of editing a Lexeme (phab:T191504)
- Finish work to edit Forms via the web API (phab:T190906)
- Include special Lexeme IDs (phab:T187060)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 06:56, 22 May 2018 (UTC)

Vision Éternel

I'm concerned about many of the statements on Vision Eternel (Q29287816); please see Talk:Q29287816. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:01, 22 May 2018 (UTC)

Sports teams and home venues

When I add the home venue of a team to the team page—for example, see the addition of Reese Court (Q7306996) to Eastern Washington Eagles women's basketball (Q30325908), I get a warning. The warning is understandable - it notes that the venue doesn't have the team as an occupant.

However, the venue does include Eastern Washington Eagles (Q3496221) as an occupant. For those not familiar with US University sports teams, the term "Eastern Washington Eagles" is the term that includes all of the sports teams associated with the school while "Eastern Washington Eagles women's basketball" is one specific example of such a sports team.

I'm not sure whether I'm supposed to cure the error by simply adding the individual sports teams as occupants, or if a better approach is to go to the "umbrella" entry and add some "has part" statements. I'll also note I haven't checked to see if this would work.

I poked around some other entries to see how it should be handled, but I haven't come across anything so far.--Sphilbrick (talk) 13:58, 20 May 2018 (UTC)

Eastern Washington Eagles (Q3496221) cries out for has part(s) (P527), and Eastern Washington Eagles women's basketball (Q30325908) for part of (P361) irrespective of the inverse constraint issue with home venue. Given the hypothetical that a club might have a home venue, but employ a second location for the matches of one of its teams, it seems safer / more complete to me to include each distinct team, and the club, as occupants of Reese Court (Q7306996). --Tagishsimon (talk) 14:19, 20 May 2018 (UTC)

I tried adding the "part of" and "has part" statements. That did not cure the warning on the venue, but in fairness, if I read your comment correctly, you weren't suggesting that it would solve the problem; I believe you are suggesting it ought to be done anyway. I haven't yet decided whether that's worth it. It seems like a lot of work and I'm not sure I follow the benefit. I have added the specific team as an occupant.

It still seems odd to me to have both Eastern Washington Eagles and Eastern Washington Eagles women's basketball team as occupants. However, my goal is to take care of the women's basketball teams so I leave it to someone else to sort out what looks like an anomaly to me.

Thanks for your help. --Sphilbrick (talk) 00:15, 21 May 2018 (UTC)

owned by (P127) — home venue (P115) pair seems to be the best for sports organization (Q4438121) or any other proprietor (Q16869121) — sports venue (Q1076486). For sports club (Q847017)/sports team (Q12973014) I suggest occupant (P466) — home venue (P115). - Kareyac (talk) 14:34, 20 May 2018 (UTC)

I'm not quite sure I am following your advice. It sounds like you are disagreeing that the home venue ought to be associated with a sports team. But I may be misreading your comment.--Sphilbrick (talk) 00:17, 21 May 2018 (UTC)

My brevity really misleads, sorry. I'm for the smallest unit that has an item in WD, therefore "team", not "club".

If team plays in stadium, university owns stadium and club operates stadium, can we state Reese Court (Q7306996) → operator (P137) → Eastern Washington Eagles (Q3496221) ? - Kareyac (talk) 03:44, 21 May 2018 (UTC)

This is part of a wider team/club discussion. At some point we'll be more proactive in distinguishing the two. For the moment, confusion is tolerated and thus the described problem will stay. Thierry Caro (talk) 00:24, 21 May 2018 (UTC)

This sheds some light on something that was puzzling me. When I initially started work on women's basketball teams I noticed that the "instance of" was often blank, while some had basketball team (Q13393265), some had university and college sports club (Q2367225) and some had both. I opted to include both, but if this gets sorted out maybe some change will be warranted.--Sphilbrick (talk) 19:16, 22 May 2018 (UTC)

Why not add references to Wikidata (property proposals or discussions) in property statements?

Currently, it's difficult to say where a particular property statement comes from. Most of them come from the property proposal, but some come from Wikidata discussions and others are simply added without any previous agreement.--Malore (talk) 16:07, 20 May 2018 (UTC)

I agree. In fact, this point was briefly discussed during the Wikimedia Hackathon 2018. Like Wikiprovenance (which is currently tracking the number of reference statements on items), we may also add support on toollabs:wdprop to track the number of references on properties. This may encourage property creators (and editors) to add support statements on properties as well. John Samuel 17:36, 20 May 2018 (UTC)

@Jsamwrites: Maybe property creators should do it when they create new properties--Malore (talk) 21:36, 20 May 2018 (UTC)

I have been doing this for a while with a script, see FloraCatalana ID (P5179) for instance. If you have suggestions about how to improve these references let me know! − Pintoch (talk) 05:54, 21 May 2018 (UTC)

I guess the easiest solution would be to incorporate this link into the documentation template, which is always located at the talk page of the property.--Ymblanter (talk) 08:30, 21 May 2018 (UTC)

I am not sure what you mean - are you suggesting to change the Lua code that generates the documentation on the talk page so that it also renders references of statements on the property? Or are you suggesting to put a link to the proposal in that documentation (which is already the case for most properties)? − Pintoch (talk) 10:01, 21 May 2018 (UTC)

I was suggesting to put a link in the proposal to the documentation.--Ymblanter (talk) 15:27, 21 May 2018 (UTC)

@Malore: Now toollabs:wdprop also lets you navigate the properties with references. There are currently 772 properties with references (or 'equivalent properties' link). For example, sex or gender (P21) has 1 reference and 4 equivalent properties. John Samuel 21:42, 22 May 2018 (UTC)

Observatories

Hello. There are still a few hundred astronomical observatory (Q1254933) items stored as instances of observatory (Q62832). Is there someone here nice enough to move them to astronomical observatory (Q1254933) or to its subclass of (P279)? I'm not a specialist. Thierry Caro (talk) 06:47, 22 May 2018 (UTC)

Comment. I have already moved all that have a Minor Planet Center observatory code (P717) statement. Thierry Caro (talk) 06:58, 22 May 2018 (UTC)

@Thierry Caro: Nice spot - I hadn't realised that we had a separate item for astronomical observatory. Basically, most of the ones that are observatory (Q62832) are actually astronomical observatory (Q1254933). I'm working through them, tracking lists are at en:User:Mike Peel/Astronomical Observatories and en:User:Mike Peel/Observatories. Thanks. Mike Peel (talk) 22:13, 22 May 2018 (UTC)

@Mike Peel: Thank you. I'll try to add nighttime view (P3451) statements to them later. Thierry Caro (talk) 02:02, 23 May 2018 (UTC)

Merge or not? High Wycombe railway station building (Q26668593) and High Wycombe railway station (Q1851135)

Can someone living in High Wycombe tell me about both? --Liuxinyu970226 (talk) 01:08, 19 May 2018 (UTC)

I don't live there, but it looks like High Wycombe railway station building (Q26668593) was created as an import of heritage buildings, while High Wycombe railway station (Q1851135) is linked with other projects but describes the same station. So merge. Ghouston (talk) 05:12, 19 May 2018 (UTC)

No, the station building is somewhat different from the actual station. It could be non-functional for example, the monument classification doesn't apply for everything. Sjoerd de Bruin (talk) 07:46, 19 May 2018 (UTC)

I agree, keep both items and link the building to the station with a part of (P361) - has part(s) (P527) relationship. Simon Cobb (Sic19 ; talk page) 18:48, 19 May 2018 (UTC)

I've done that, and tweaked the labels and descriptions; all of which makes Liuxinyu970226's header, written before the changes, look a little nonsensical. Sorry about that, Liuxinyu970226 :) --Tagishsimon (talk) 18:56, 19 May 2018 (UTC)

Per the list entry: "Railway terminus station and engine shed, later goods shed, now in commercial use. " The listed building is not all within the current station. See en:High Wycombe railway station for more. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:42, 23 May 2018 (UTC)

Adding multiple values for a property

Not sure if this is the right place to drop this, I'm kinda new here, but please how do I go about adding multiple values of equal rank to a single property, like the way the property "subclass" in Golden Globe Awards (Q1011547) has two values (film and tv awards). Obviously this is different from qualifiers, which I believe I understand already. Thanks for the reply in anticipation. HandsomeBoy (talk) 21:01, 22 May 2018 (UTC)

I've seen it, don't mind me. There is another "add value" underneath. HandsomeBoy (talk) 21:04, 22 May 2018 (UTC)

But why do I need to add one value before adding another within the same statement? Is there a way to add multiple values at once? It seems like I need to click "publish" before the "add value" shows again. HandsomeBoy (talk) 21:13, 22 May 2018 (UTC)

One statement can only hold a single value, you probably mean to add a new statement to a "statement group". And no, it has never been possible to add/modify more than one statement at once. Matěj Suchánek (talk) 15:05, 23 May 2018 (UTC)

Thanks, you're right. I will add "statement group" to my Wikidata terminologies.HandsomeBoy (talk) 17:38, 23 May 2018 (UTC)

A Nigerian film award that takes place in United States annually

Please, I need a property for Golden Icons Academy Movie Awards (Q18345192) to show that the ceremony is mainly for Nollywood films, even though it is held in the US. The property "country", from my understanding, should be US, since that is where the awards have been held since inception. If this is not the right place for this sort of question, please let me know. Thanks. HandsomeBoy (talk) 21:27, 22 May 2018 (UTC)

You should request it at WD:PP. Mbch331 (talk) 06:51, 23 May 2018 (UTC)

@Mbch331: That's not what was requested; see below. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:26, 23 May 2018 (UTC)

@HandsomeBoy: I've added field of work (P101) and main subject (P921). Others may be able to argue why one of those is preferable to the other. This is the right place for such discussions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:26, 23 May 2018 (UTC)

Thanks Mbch331 and Pigsonthewing, I'm liking this community better than Wikipedia lol.HandsomeBoy (talk) 17:40, 23 May 2018 (UTC)

direction (P560)

direction (P560) appears to be used in (at least) two very different ways (in both cases, typically as a qualifier). For example South America (Q18) uses it to say that Central America (Q27611) and North America (Q49) are north of South America. Central Station (Q15166) also uses it in this sense. However, Washington Union Station (Q3570) (like Central Station (Q15166) a rail station) uses it to say that Silver Spring station (Q7516347) is in the direction Martinsburg (Q1929803), that is to say along the railway line that leads there. Watford Junction railway station (Q10698) uses it similarly to Washington Union Station (Q3570). I would think these should be two separate properties. - Jmabel (talk) 05:25, 23 May 2018 (UTC)

We have that property already - towards (P5051) which should point to the terminus of the linear feature. adjacent station (P197) actually lists this qualifier as mandatory, whereas the direction is only optional. I have fixed Q7516347 already. Ahoerstemeier (talk) 08:46, 23 May 2018 (UTC)

What about "left" and "right" as used for the tributaries in Po (Q643)? - Jmabel (talk) 15:13, 23 May 2018 (UTC)

how to model references for number of subscribers used as a qualifier

Hi, I have been invited by @Pigsonthewing: to request your help on finding an elegant solution to this issue: I'm adding a number of subscribers (P3744) to X username (P2002) and subreddit (P3984) statements, because I find it informative (see import scripts), but those are very time-sensitive values, so we need a way to add a timestamp. The solution I found was to add a retrieved (P813) in references, but this choice was considered a bad option by @Pigsonthewing:. @VIGNERON: would prefer using point in time (P585). What do you think? Would adding the implicit URL of the account being checked to the reference address the concern? Any better option? Or should those data just be deleted? cc @Envlh: -- Maxlath (talk) 09:15, 22 May 2018 (UTC)

@Maxlath: point in time (P585) is the normal way to timestamp time-sensitive claims - see, for instance, population (P1082) in United Kingdom (Q145), or any other country. References are good, and retrieved (P813) good also; and I do appreciate that the P813 value is likely to be much the same as the P585. However the P585 value is much more likely to be relied on by anyone seeking to access the data programmatically, since for most time-sensitive data, P813 will not be expected to be identical to P585. --Tagishsimon (talk) 12:08, 22 May 2018 (UTC)

@Maxlath: point in time (P585) would be good to have as a qualifier. I'd also add a reference with reference URL (P854) and retrieved (P813), though. What's your plan for keeping the numbers up-to-date in the future - will you add new numbers every so often in addition to the existing values? Thanks. Mike Peel (talk) 22:01, 22 May 2018 (UTC)

I was thinking maybe to keep reference URL (P854) for Wayback Machine (Q648266) URLs where one is available, but that would require to use those as source instead of the live pages, and would only work for the accounts that are archived there -- Maxlath (talk) 19:36, 23 May 2018 (UTC)

@Maxlath: To be honest, I'm not entirely convinced by point in time (P585) as a qualifier (ideally it should be a qualifier of a qualifier, which is not possible) but I think it seems better than retrieved (P813) in the reference. Cdlt, VIGNERON (talk) 07:26, 23 May 2018 (UTC)
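To illustrate why a point in time (P585) qualifier is convenient for anyone reading these values programmatically, here is a minimal Python sketch. It assumes the statements have already been fetched into a local structure; the `SubscriberClaim` class, the `latest_claim` helper, the account name and the URL are all hypothetical and only mirror the properties discussed above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SubscriberClaim:
    """Hypothetical local view of one X username (P2002) statement."""
    username: str                         # main value of P2002
    subscribers: int                      # qualifier: number of subscribers (P3744)
    point_in_time: date                   # qualifier: point in time (P585)
    reference_url: Optional[str] = None   # reference: reference URL (P854)
    retrieved: Optional[date] = None      # reference: retrieved (P813)


def latest_claim(claims):
    """Return the claim with the most recent P585 value, or None if empty."""
    return max(claims, key=lambda c: c.point_in_time, default=None)


# Made-up example data: two snapshots of the same account.
claims = [
    SubscriberClaim("example_account", 1200, date(2018, 1, 15)),
    SubscriberClaim(
        "example_account", 1350, date(2018, 5, 22),
        reference_url="https://example.org/profile",
        retrieved=date(2018, 5, 23),
    ),
]

newest = latest_claim(claims)
```

Because the date sits on the statement itself, a consumer can pick the newest count without opening the references, while P813 still records when the source was actually checked.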

How do I enter a profile on Wikipedia.

I would like to upload a footballer's profile on Wikipedia and need a guide on how to go about it.

This is Wikidata, not Wikipedia. You may want to go to https://en.wikipedia.org/wiki/Help:Getting_started --Anvilaquarius (talk) 09:46, 24 May 2018 (UTC)

sandbox is your friend; you can ask more specific questions at w:Wikipedia:Teahouse. Slowking4 (talk) 10:28, 24 May 2018 (UTC)

Property for nomination venue

Africa Movie Academy Awards (Q384139) usually takes place in Nigeria annually, but the nomination venue is usually rotated across many African countries. Is there a property for award ceremonies with a venue for nomination party, because I am assuming AMAA is not the only award in the world that adopts such practice? HandsomeBoy (talk) 17:58, 23 May 2018 (UTC)

@HandsomeBoy: Use significant event (P793), qualified with location (P276). You'll need to create a new, generic "nomination party" item, and to use its QID as the value for the former property; or create an item for each party, in which case the location can go on that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:04, 24 May 2018 (UTC)

Thanks, I took your first suggestion. HandsomeBoy (talk) 17:23, 24 May 2018 (UTC)

IP editor adds qualifier entries and uses the "instance of" property for them – in my understanding an incorrectly modeled data structure

Please check [13], [14], [15]. I presume this is incorrectly modeled.

Do we have a valid qualifier which can be used to achieve the intended result?
Is there any possibility to contact an anonymous editor on WD (or Commons) without blocking him?

--Zaccarias (talk) 01:51, 24 May 2018 (UTC)

@Zaccarias: yes, P31 should not be used as a qualifier like that. If the statement has a statement is subject of (P805) qualifier then the P31 of that value is essentially filling that role, and there's really no need for this sort of qualifier at all. In other cases, object has role (P3831) might be appropriate. You can try a message on the IP talk page but the user may not see it; better might be just to undo these edits with a note in the undo message about why. ArthurPSmith (talk) 13:56, 24 May 2018 (UTC)

Duplicate Lexeme entities, and what to do before merge & redirect

Hi all. I spotted my first duplicate lexeme entities this morning Lexeme:L503 and Lexeme:L506 created one after another with the exact same Lemma. Really we should delete one and redirect it, but that is currently not technically possible. Should we come up with some sort of system? A Template on the talk page perhaps? Or would all of this be overkill? I don't imagine too many people care that much right now as the data is young and not used. Thoughts? ·addshore· ^{talk to me!} 09:01, 24 May 2018 (UTC)

This seems important to get right soon so the backlog doesn't pile up. Is it possible/sensible to just use the same merge options as with Q numbers? It seems sensible to maintain a unified user experience across multiple types of items. John Cummings (talk) 13:37, 24 May 2018 (UTC)

@Addshore: it was pointed out to me that you can do a search for existing lexemes (to avoid duplicates) by picking a property that is lexeme-valued and doing the autofill search there. Anyway, to "fix" the present case I figured it wouldn't hurt at this point to change Lexeme:L506 to the French word with the same meaning... hope that's ok! ArthurPSmith (talk) 14:13, 24 May 2018 (UTC)

How to update Template:Discussion navigation?

I would have liked to add Wikidata talk:Lexicographical data to {{Discussion navigation}}, however I have seen that there are subpages for every language... Is there any way to update all subpages at once?--Micru (talk) 12:01, 24 May 2018 (UTC)

You should update Template:Discussion navigation/text. Sjoerd de Bruin (talk) 14:17, 24 May 2018 (UTC)

Thanks!

Done --Micru (talk) 16:25, 24 May 2018 (UTC)

Reference url or official website url?

I'm trying to add references, in the form of urls, to statements.

I've been using reference URL (P854), but now I'm wondering if I should be using official website (P856)

Here is an example: Quick Facts

I originally chose not to use the official url property because this isn't THE official website, but now I think I may have been too anal – the link example is clearly created by the university and part of the overall website. I'm now thinking that I should be using the official website (P856) property.--Sphilbrick (talk) 15:48, 24 May 2018 (UTC)

In references, use reference URL (P854) whether or not it's an "official" source. official website (P856) should be reserved for the main website of the organization or other entity, as a property directly on that item. ArthurPSmith (talk) 17:53, 24 May 2018 (UTC)

Thanks, glad to hear I don't have to make a lot of changes.--Sphilbrick (talk) 17:59, 24 May 2018 (UTC)

QuickStatements is down ?

Is anybody else having a problem with QuickStatements at the moment ?

A QS run I was doing halted at 16:42 (UTC), and doesn't seem to want to start again. Jheald (talk) 19:06, 28 May 2018 (UTC)

Resolved -- There seems to have been one edit in the set that QS really didn't want to make. Having now done that manually, the rest have gone through with no bother. Jheald (talk) 13:22, 29 May 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 06:37, 30 May 2018 (UTC)

"The SPARQL query resulted in an error."

There are several errors in Montana State Bobcats women's basketball (Q30325919)

"The SPARQL query resulted in an error."

Not sure what is the problem.--Sphilbrick (talk) 19:22, 29 May 2018 (UTC)

@Sphilbrick: seems to have gone away; I did some playing with the record, but I don't think that's connected to the disappearance of the issue. --Tagishsimon (talk) 19:57, 29 May 2018 (UTC)

I'm not far away. My guess is that it is unrelated to my specific edits, which are virtually identical to many similar ones.--Sphilbrick (talk) 20:20, 29 May 2018 (UTC)

And now things are fine. I'm guessing there was a database burp, now recovered.--Sphilbrick (talk) 20:22, 29 May 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 06:37, 30 May 2018 (UTC)

Merge

Julie Boettcher Q54506663 = Q54513511 85.182.28.185 20:38, 30 May 2018 (UTC)
- Thank you. Done. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:35, 30 May 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:35, 30 May 2018 (UTC)

Are there any qualifiers that map progress of something?

Hi all

I need to map the progress of dataset imports using qualifiers for the new property Wikidata Dataset Imports URL (P5195), mainly just 'in progress' and 'complete' (although something to say 'needs updating on x date' would also be super helpful). Is this possible? Or do I need to create a new property/s?

Thanks

--John Cummings (talk) 12:08, 24 May 2018 (UTC)

We have expected completeness (P2429) and {{User:Pintoch/coverage}}, mostly for identifiers. They seem related but might not completely fulfill your needs? − Pintoch (talk) 11:00, 25 May 2018 (UTC)

Thanks @Pintoch:, hmmm, that's not quite right for what I need. I think I'll need to request a new property. --John Cummings (talk) 15:48, 25 May 2018 (UTC)

Archive of conversations missing Overview

For some reason the Overview of archived conversations has not been created on Wikidata talk:Lexicographical data/Archive, although the conversations have been archived. Do I need to add anything else? --Micru (talk) 19:24, 24 May 2018 (UTC)

Adding Statements with QS which already exists

Dear all, I have prepared an import to add archives at (P485) to all authors archived at the Austrian National Library (Q304037). I also have the specific inventory number (P217) and its described at URL (P973). Each author has at least one inventory number, or a small number of them (<=3). If I add all inventory numbers and their URLs in one statement, it is hard to write a query that shows each inventory number with its corresponding URL. So I suggest modelling the information as in Peter Altenberg (Q44972), but then an import via QuickStatements is not possible, because the tool refuses to create a statement that appears to exist already (the difference is only in the qualifiers). Does anyone have any suggestions? --Mfchris84 (talk) 09:52, 25 May 2018 (UTC)

Would you consider trying out OpenRefine for that? The upcoming editing features of OpenRefine could help you, because OpenRefine matches statements by looking at qualifiers too (not just on the main values). See the wiki to run the development version (if that is too complicated, a new beta release should be out in the coming week.) − Pintoch (talk) 10:56, 25 May 2018 (UTC)

Announcement: Natural Earth v4.1.0 released ( with better Wikidata integration )

NaturalEarthData.com : "A global, public domain map dataset available at three scales and featuring tightly integrated vector and raster data."

What's new in v4.1.0:

"Expands the name localization added in v4.0 to 21 languages (up from 7) and several dozen themes expanding from populated places to include all admin-0, admin-1, rivers, lakes, playas, geographic lines, physical labels, parks, airports, ports, and more. As part of this work a new unique and stable “ne_id” has been added for any feature with a name translation &/or a Wikidata ID concordance. The full list of languages is: name_ar*, name_bn*, name_de, name_en, name_es, name_fr, name_el*, name_hi*, name_hu*, name_id*, name_it*, name_ja*, name_ko*, name_nl*, name_pl*, name_pt, name_ru, name_sv*, name_tr*, name_vi*, and name_zh. (Names with * indicate new language in v4.1 series.) A 2-character language code decoder ring. Props to Wikidata for their CC0 license. Want to see more name translations in the next Natural Earth release? Go edit Wikidata!"

see more:

--ImreSamu (talk) 13:07, 25 May 2018 (UTC)

Checking my work, and more

With the help of Tagishsimon, I've been working on adding data to articles about NCAA Division I women's basketball teams in the US.

I've reached the point where I'd like to do a couple things that I think are related. I'd like to learn how to run a report to output the results, for example create a list of all nicknames. Before I do that, I'd like to double check my work by running a report to identify the elements I have added to see which ones I'm missing and/or made a mistake.

I'm not quite sure where to start. Do I create a report or a query, or are these the same thing?

Tagishsimon created a report for me Wikidata:Project_chat/Archive/2018/05#Creating_a_list_of_Q_values to generate Q values from the article titles (as an aside, that report had a small glitch arising from redirects which I believe I fixed).

I'm not sure whether I start with that report and ask for different output, whether I create a report that starts with the Q values (which I have in a spreadsheet somewhere) or do something else.

As a specific example, some of the teams have a description and in some cases it is blank. I want to fill in the blank ones with quick statements (which I know how to do), but I'd like to make a list of all the Q values with blank descriptions. However, rather than simply figure out how to do a report that lists only the blank descriptions I'd like a report that lists all descriptions so I can inspect the existing ones to see if there are any that need correcting, then sort the list to get all the blank ones and then extract the Q values and do quick statements.

Does this approach make sense?

Then I'd like to rinse and repeat for home venues (I filled in most but I know there are a few that are missing), nicknames, and other properties.--Sphilbrick (talk) 17:28, 25 May 2018 (UTC)
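The "list all descriptions, then sort so the blank ones group together" step could be sketched as a query for the Wikidata Query Service. The following Python helper only builds the query text; the function name is hypothetical, the two QIDs in the usage line are simply items mentioned elsewhere on this page, and it assumes the service's predefined `wd:`, `schema:` and `rdfs:` prefixes.

```python
import re


def description_report_query(qids, lang="en"):
    """Build a SPARQL query listing each item with its label and description.

    The description is OPTIONAL, so items without one still appear,
    with ?description left unbound.
    """
    for qid in qids:
        # Reject anything that is not a plain item ID such as "Q42".
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a valid QID: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?itemLabel ?description WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  OPTIONAL {\n"
        "    ?item schema:description ?description .\n"
        f"    FILTER(LANG(?description) = \"{lang}\")\n"
        "  }\n"
        "  OPTIONAL {\n"
        "    ?item rdfs:label ?itemLabel .\n"
        f"    FILTER(LANG(?itemLabel) = \"{lang}\")\n"
        "  }\n"
        "}\n"
        "ORDER BY ?description"
    )


query = description_report_query(["Q30325908", "Q30325919"])
```

Pasting the resulting text into the query service should give one row per team. Under SPARQL's ordering rules unbound values sort lowest, so rows with no description are expected to come first, which makes it easy to copy out their Q values for QuickStatements.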

Answered at my talk page. --Tagishsimon (talk) 17:55, 25 May 2018 (UTC)

Just in case you may be interested, NCAA.com team ID (P3692) has yet to be deployed on relevant items. Thierry Caro (talk) 11:39, 26 May 2018 (UTC)

Invaluable Artists in Mix'n'Match

Looking for something relaxing yet productive to do this weekend? There are 20K automated matches remaining in the 'Invaluable Artists' Mix'n'Match catalogue, that need checking and accepting or rejecting. Can you help? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:52, 25 May 2018 (UTC)

Suggestions of properties

Hi, since last week's breakdown I don't see suggestions of properties when creating or editing an item. When, for example, I add property instance of (P31) = human (Q5), usually the next "add statement" shows properties like place of birth (P19), place of death (P20), date of birth (P569) etc. Is anyone having the same problem? Ederporto (talk) 15:43, 28 May 2018 (UTC)

@Ederporto: See #Some features temporarily disabled, above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:47, 28 May 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 14:11, 1 June 2018 (UTC)

Box set of compact discs

I've been trying (and failing) to come up with a way to say that The Firesign Theatre's Box of Danger (Q7026985) is a box set of compact discs. I'd expect to be able to do this with some sort of qualifier on ⟨ The Firesign Theatre's Box of Danger (Q7026985) ⟩ instance of (P31) ⟨ box set (Q394970) ⟩, but I can't work out what would be acceptable. - Jmabel (talk) 00:44, 31 May 2018 (UTC)

Instance of box set (Q394970); has part "CD", quantity "N". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:43, 31 May 2018 (UTC)

This section was archived on a request by: Jmabel (talk) 00:05, 2 June 2018 (UTC)

Wikipedia Glossary entries

Now that we have the new property glossary entry at Wikipedia URL (P5178), I think it would be great to have a tool or script that could make a Wikipedia Glossary like en:Glossary of sewing terms into a Mix'n'Match dataset. Do others think this is a good idea? There are hundreds of Glossaries in ENwiki alone, so I am not certain this is the best way to proceed... PKM (talk) 20:30, 24 May 2018 (UTC)

Glossaries generally have three or four types of entries:

1. entries that define a term
2. entries that note that a term is a synonym for another
3. entries that link an article in Wikipedia for the definition
4. (maybe) entries that note a term is used for several concepts

Personally, I think #1 is suitable for item creation, #2 should be an alias of an item for another entry, #3 would generally already have an item, #4 would need several items.

For #1, one needs to find a suitable P279 (or P31) value and create the entry. I'm not sure if Mix'n'Match is the best tool for that.
--- Jura 07:54, 25 May 2018 (UTC)

Because Wikidata is more granular than Wikipedia, I would expect many Glossary entries that do not have their own Wikipedia articles to already have a Wikidata entry. I'd prefer a tool that offered the opportunity to match before creating new items. I agree that a separate Glossary tool might be better than clogging up Mix'n'Match with hundreds of new lists. - PKM (talk) 21:09, 25 May 2018 (UTC)

You could try that one glossary and see how it goes. I think it's worth giving some thought to the P279/P31 values you will be using when creating the items. Wikipedia outlines might help you with this. For a sport one, I think the structure of w:Glossary of rowing terms could be used for these terms. (You will notice that that glossary is very different from most, much closer to an outline).
--- Jura 10:12, 27 May 2018 (UTC)

CPDL ID (P2000)

Hello. This should be an external ID, not a string-based property. Can we get it moved faster than through the full deletion/recreation process? Thierry Caro (talk) 05:26, 25 May 2018 (UTC)

There is Wikidata:Identifier migration, and it’s already listed and discussed here. —MisterSynergy (talk) 05:38, 25 May 2018 (UTC)

OK. Thank you. Good to know! Thierry Caro (talk) 05:59, 25 May 2018 (UTC)

Hmm, it might be good to do one more run through analyzing our remaining string properties that could be identifiers - it sounds like some of them have been sufficiently cleaned up or are well behaved enough that that assessment from two years ago is no longer valid. ArthurPSmith (talk) 14:01, 25 May 2018 (UTC)

Yes, please. Thierry Caro (talk) 07:39, 27 May 2018 (UTC)

Original title of a book

Twenty Thousand Leagues Under the Sea (Q183565)

How can I show that the French title was the original title? Xaris333 (talk) 23:40, 26 May 2018 (UTC)

@Xaris333: I get the impression the best advice is, don't start from here. Wikidata:WikiProject Books#Bibliographic properties sets out a plan of attack for books, and, for instance, Diary of Anne Frank (Q6911) is used as an example. The major trick is to keep separate the original work, and subsequent editions. In the original work, a single title (P1476) identifies the original title. Editions, such as Journal de Anne Frank (Q14624843), get their own item, which'll have an edition or translation of (P629) pointing back to the original work. Set against that model, the Twenty Thousand Leagues Under the Sea (Q183565) record is a dog's breakfast requiring a root & branch rejig to make it useful. --Tagishsimon (talk) 00:00, 27 May 2018 (UTC)

Yes, thanks. I forgot that we have separate items for editions. Xaris333 (talk) 00:20, 27 May 2018 (UTC)

Ancestors & Relatives

Hi,

I want to start a project that holds the names and relations of living humans and their parents, grandparents, great-grandparents and so on.

In today's world humans are moving too fast and too far away from their roots, and are losing their identity and relationships.

This information on Wikidata will be editable by individuals for themselves, their family, relatives and ancestors, and that may help them discover a lot of things.

I know that in India there is a system, a set of people who record this information at specific places like Haridwar when somebody dies and their family members go there to dispose of the remains of the dead. – The preceding unsigned comment was added by Maandalbir (talk • contribs) at 21:58, 25 May 2018 (UTC).

@Maandalbir: Please review Wikidata:Notability and Wikidata:BLP. There are external sites already doing such things, to which we link via relevant external ID properties, for people who meet our notability criteria. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:34, 26 May 2018 (UTC)

@Maandalbir: Wikidata stores information about entities for which there are reliable sources. To the extent that there are good sources, it would be interesting to have information about Indian heritage in our project, especially given that Indian heritage is likely not as well covered by the Western Ancestry.com empire. ChristianKl ❪✉❫ 11:59, 27 May 2018 (UTC)

Adding references (P854)

Is it possible to add URL references using quick statements?

I've been trying in the sandbox and don't quite seem to get the hang of it.

For example, I just added X username (P2002) StanfordWBB to Stanford Cardinal women's basketball (Q7598727), and I would like to add reference URL (P854) "http://gostanford.com/index.aspx?path=wbball"

Ideally, I'd like to add the twitter handle and the reference in one step. Are either possible?--Sphilbrick (talk) 11:07, 27 May 2018 (UTC)

Q7598727 P2002 "StanfordWBB" S854 "http://gostanford.com/index.aspx?path=wbball" should work. Matěj Suchánek (talk) 11:30, 27 May 2018 (UTC)

That worked, thanks.--Sphilbrick (talk) 12:14, 27 May 2018 (UTC)

Okay, I'm so close but I still have one nagging problem. This may technically be an Excel question so if it's outside the scope of this forum, so be it.

I'm trying to create my quick statements in Excel, but when I place a URL in a cell with quotes around it, Excel helpfully removes them. I haven't quite figured out how to have an entry in Excel which is a URL surrounded by double quotes.--Sphilbrick (talk) 12:25, 27 May 2018 (UTC)

Never mind figured it out (created a custom format).--Sphilbrick (talk) 12:39, 27 May 2018 (UTC)

fwiw, I use libreoffice, where I find encapsulating strings in quotes seems to work with a formula in this form: +""""&A2&"""". Not sure how that translates to excel, nor to your now cured issue. --Tagishsimon (talk) 12:49, 27 May 2018 (UTC)

Curiously, I had tried triple quotes in Excel and it didn't work. I just tried quad quotes and it did. Thanks, that may be easier than the custom formatting I was using.--Sphilbrick (talk) 14:09, 27 May 2018 (UTC)

Yes. It's four quotes in formulas in Excel. Thierry Caro (talk) 15:24, 27 May 2018 (UTC)
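The quoting problem above can also be sidestepped by generating the QuickStatements lines outside a spreadsheet. The following is a minimal sketch, assuming the tab-separated form of the QuickStatements version 1 syntax; the helper names `quote` and `statement_with_reference` are hypothetical and only illustrate the command Matěj Suchánek gave.

```python
def quote(value: str) -> str:
    """Wrap a string value in double quotes, as the command above does."""
    return '"' + value + '"'


def statement_with_reference(item: str, prop: str, value: str, ref_url: str) -> str:
    """Build one line: item, property, quoted value, then S854 plus the quoted URL.

    S854 is the source form of reference URL (P854), as used in the thread.
    Fields are joined with tabs (an assumption about the input format).
    """
    return "\t".join([item, prop, quote(value), "S854", quote(ref_url)])


line = statement_with_reference(
    "Q7598727",
    "P2002",
    "StanfordWBB",
    "http://gostanford.com/index.aspx?path=wbball",
)
print(line)
```

Building the line in code avoids the spreadsheet stripping or doubling the quotes, which is the issue the custom format and the four-quote formula work around.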

Label of language of work or name (P407)

Your input about the label of language of work or name (P407) is welcomed on Property_talk:P407#Label. --Pasleim (talk) 13:25, 27 May 2018 (UTC)

OpenRefine 3.0 beta released

Hi all,

The OpenRefine team is pleased to announce that we have a new release out, with plenty of new features to try! Given that we have changed quite a lot of things, we call this a beta for now.

OpenRefine now includes a Wikidata extension, designed to help you import datasets in Wikidata. In many cases, the tool covers all steps of the import process, from downloading a dirty CSV file to uploading statements to Wikidata items. These features are demonstrated in four different tutorials, covering various data sources and formats. We hope you will find them useful.

This release was supported by the Google News Initiative as part of a project to improve OpenRefine for data journalism. Many thanks to all our contributors, translators, testers and the developers of the libraries we rely on - especially Wikidata-Toolkit - and the Wikidata team for their support.

Feedback is very welcome, for instance here, on the tool's talk page or directly on GitHub. − Pintoch (talk) 15:04, 27 May 2018 (UTC)

Same subject?

Hello! I found the following items on User:Pasleim/projectmerge/huwiki-enwiki with basionym (P566). As far as I can see, their articles are about the same subject. The current situation is that the enwiki articles are alone (they have no interwikis), which is bad. I am not sure myself and I would not like to do something wrong, so could you take a look?

Thanks! Bencemac (talk) 17:00, 31 May 2018 (UTC)

When taxons have their own literature, they are from a taxonomy point of view different. This difference means that the precise definition differs and consequently they do not describe the exact same thing. Thanks, GerardM (talk) 17:16, 31 May 2018 (UTC)

Ideally, the software would link the Wikipedia pages in such cases, but this is a long way off, so a (manual) edit is necessary. For homotypic names, all sitelinks should be put together in one item. In itself, this choice does not mean that any one name is incorrect. - Brya (talk) 17:25, 31 May 2018 (UTC)

The relevant paper that outlines the current taxonomic viewpoint is A revised generic classification for Aloe (Xanthorrhoeaceae subfam. Asphodeloideae) (Q22829422). Most Wikipedias are probably not aware of this paper. Wikidata is. I moved the en-sitelinks back to the Aloe species. --Succu (talk) 18:33, 31 May 2018 (UTC)

@Succu: Thank you! Could you tell me which tool you used (the one that makes these summaries)? Bencemac (talk) 07:06, 1 June 2018 (UTC)

@Bencemac: that's the move gadget. It can be enabled in your preferences. Mbch331 (talk) 06:29, 2 June 2018 (UTC)

I found it, thanks! Bencemac (talk) 09:19, 2 June 2018 (UTC)

This section was archived on a request by: Bencemac (talk) 09:19, 2 June 2018 (UTC)

Limit page creation and edit rate

Tracked in Phabricator
Task T184948

Tracked in Phabricator
Task T192690

Hello all,

Following the dispatch lag problems that we encountered over these past months, we decided to set up a limit for the speed of edits and page creation. These limits are now enforced for all accounts (including bots):

page creation: max 40 per minute per account
edit: max 80 per minute per account

This means that some bots and automated tools have to reduce the speed of their edits in order to not reach these limits.

We will continue monitoring the situation carefully, and see how this has an impact on the projects. If you want to learn more about the reasons for this change, you can have a look at the ticket.

Let us know if you encounter any problem. Lea Lacroix (WMDE) (talk) 08:29, 19 April 2018 (UTC)

Looks like it's working: Special:Log/massmessage.
What are the plans to develop capacity to raise it again?
--- Jura 03:06, 20 April 2018 (UTC)
Doesn't seem convenient for regular editors. Didn't we have problems mainly due to bots creating new items at steady speed for several days in a row?
--- Jura 12:31, 21 April 2018 (UTC)
- In the past weeks, we have repeatedly seen individual edit rates up to at least 400/min, often ongoing over several hours. Apparently editors can’t overcome the temptation of starting several QuickStatements batches in parallel. (I don’t like the limitation as well, but I have no idea how to otherwise approach the problem.) —MisterSynergy (talk) 13:05, 21 April 2018 (UTC)
  - It's hard to add a second batch after the first one. Besides, this also affects PetScan. With PetScan, the normal thing to do is to start one, prepare another one, start that, etc. Then wait till they all complete. Some might just do 10 or 20 edits at each run, but the peak adds up. Looking at the edits of WMF staff on Special:Log/massmessage, it seems that the peak rate you mention isn't a problem as such. I think Wikidata's dispatcher even groups them together ..
    --- Jura 05:20, 22 April 2018 (UTC)
    - Would it be feasible to have limits like 5000/hr or 2500/30min, which would be quantitatively very similar to 80/min? This would allow to exceed the limit for shorter times, but in the long run one would not be able to edit faster than ~80/min. —MisterSynergy (talk) 07:01, 22 April 2018 (UTC)
      - Yeah it looks like the current state isn't what we were trying to achieve so we'll tweak it more. We'll start with an increase of the timespan to 3 minutes and see how that goes. I am hesitant to go to 30 mins or an hour because then we'll again make it easy to do 400 edits in a minute causing all the problems we've seen in the past :/ --Lydia Pintscher (WMDE) (talk) 08:01, 23 April 2018 (UTC)
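The trade-off discussed above, where a longer window gives a similar average rate but permits larger bursts, can be illustrated with a small sliding-window limiter. This is only a sketch: the class and function names are hypothetical, it is not the limiter MediaWiki uses, and the figure of 240 edits per 3 minutes is an assumption (80/min scaled to the 3-minute timespan mentioned, which the thread does not state explicitly).

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_edits within any window_seconds interval."""

    def __init__(self, max_edits: int, window_seconds: float):
        self.max_edits = max_edits
        self.window_seconds = window_seconds
        self._times = deque()

    def allow(self, now: float) -> bool:
        # Forget edits that have fallen out of the window.
        while self._times and now - self._times[0] >= self.window_seconds:
            self._times.popleft()
        if len(self._times) < self.max_edits:
            self._times.append(now)
            return True
        return False


def burst_allowed(limiter: SlidingWindowLimiter, attempts: int, at: float = 0.0) -> int:
    """Count how many of `attempts` simultaneous edits the limiter lets through."""
    return sum(limiter.allow(at) for _ in range(attempts))


# Average rates per minute implied by the limits proposed in the thread.
avg_5000_per_hour = round(5000 / 60, 1)
avg_2500_per_30min = round(2500 / 30, 1)

# How much of a 400-edit burst each window lets through.
results = {
    "80 per 1 min": burst_allowed(SlidingWindowLimiter(80, 60), 400),
    "240 per 3 min": burst_allowed(SlidingWindowLimiter(240, 180), 400),
    "5000 per 60 min": burst_allowed(SlidingWindowLimiter(5000, 3600), 400),
}
print(avg_5000_per_hour, avg_2500_per_30min, results)
```

Under these assumptions the hourly window passes the whole 400-edit burst, which matches the concern that a long window would again make 400 edits in a minute easy.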

"Delivery of 'Wikidata weekly summary #309' to User talk:*Youngjin failed with the error code ratelimited". Hm, we are blocking our workbase ;) . Regards, Conny (talk) 09:22, 24 April 2018 (UTC).

What's the medium term (3-6 months) plan for the root problem? It seems that various bot operators get requests to slow down even when they stay within the new rate.
--- Jura 06:23, 30 April 2018 (UTC)
- Marius looked into it more and enabled HHVM's JIT for Wikidata's dispatchers. This should solve the problems we've been seeing over the last few days. --Lydia Pintscher (WMDE) (talk) 16:20, 30 April 2018 (UTC)
How about tweaking the feature in a way that the edit limit only kicks in if the server is actually overloaded? ChristianKl ❪✉❫ 12:43, 7 May 2018 (UTC)
- I was going to suggest that this be lifted for periods of low edit rates (one could imagine some bots just upload 100000 additions in a low traffic period).
  Looking at some of the grafana reports, either I didn't look at the right report or I had to conclude that edits are already somewhat spread out. In the latter case, Wikidata might just almost always be at its maximum capacity.
  --- Jura 07:10, 8 May 2018 (UTC)
  - What's the progress on this? (other than that staff's mass messages don't get blocked any more?).
    --- Jura 17:26, 13 May 2018 (UTC)
    - As a next step I'll go over the remaining issues with multichill at the hackathon this weekend to fully understand them and prioritize them. --Lydia Pintscher (WMDE) (talk) 17:04, 14 May 2018 (UTC)
      - Currently edits with QuickStatements can end up broken as the tool can skip random steps, e.g. moving to the next item. When investigating why QuickStatementBot worked very slowly, I found that a blocked user had found a way around the rate limitation/reallocation by running batches with multiple accounts (see Wikidata:Administrators'_noticeboard#QuickStatementsBot). I think you should find a solution that works for users other than Léa sending out mass messages.
        --- Jura 17:29, 14 May 2018 (UTC)
Hope the discussion was productive. Which solution will be implemented ?
--- Jura 07:57, 22 May 2018 (UTC)
- We agreed to lift the limit from admins and bot accounts. I will insist on a stricter application of the rules wrt respecting max lag in the API by these groups. We will adjust max lag to also include a measure for dispatch lag. --Lydia Pintscher (WMDE) (talk) 21:58, 22 May 2018 (UTC)
  - @Lydia Pintscher (WMDE): Can you please recommend a threshold value for the median lag at WD:Bots? It currently says “If needed they should check before editing entities (but only every 60s) if the 'Median' on Special:DispatchStats (also available via API as median) is 60 or higher and not edit.”, but I think we haven’t seen a median lag of 60 (seconds) for months now. It typically is around 3 minutes or even more when everything runs smoothly. It is difficult to apply blocks based on that policy.
    Apart from that, it would be great if the rate limits could be increased with time, in order to make quicker editing possible without stressing the servers. —MisterSynergy (talk) 05:17, 23 May 2018 (UTC)
    - @MisterSynergy: Could you have a look at phabricator:T194950 and see if that would work? --Lydia Pintscher (WMDE) (talk) 20:03, 23 May 2018 (UTC)
      - Sounds like a 5 minutes (300 seconds) threshold, which to my opinion would be just enough most times. I think we would anyway have to figure out whether this value works in practice. Whatever value is chosen should be written into the policy. Hoo should be aware of that, since I mentioned it in IRC a couple of weeks ago when he was active there as well. —MisterSynergy (talk) 20:12, 23 May 2018 (UTC)
        Could you create a user group without the limit other than admins and bots? I don't see why users need to be admins to do that.
        --- Jura 20:39, 23 May 2018 (UTC)
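The policy quoted above (check the median dispatch lag before editing, but only every 60 seconds, and do not edit when it is at or above a threshold) can be sketched as a small gate. Everything here is an assumption for illustration: `LagGate` and `fetch_median_lag` are hypothetical names, the 300-second default follows the 5-minute value suggested in the thread rather than any adopted policy, and a real bot would supply a function that reads the median from the dispatch statistics.

```python
from typing import Callable, Optional


class LagGate:
    """Decide whether a bot may edit, re-reading the lag at most once per interval."""

    def __init__(
        self,
        fetch_median_lag: Callable[[], float],  # hypothetical: returns median lag in seconds
        threshold_seconds: float = 300.0,       # assumed 5-minute threshold from the thread
        recheck_seconds: float = 60.0,          # "but only every 60s"
    ):
        self.fetch_median_lag = fetch_median_lag
        self.threshold_seconds = threshold_seconds
        self.recheck_seconds = recheck_seconds
        self._last_check: Optional[float] = None
        self._median = 0.0

    def may_edit(self, now: float) -> bool:
        # Refresh the cached lag only when the recheck interval has elapsed.
        if self._last_check is None or now - self._last_check >= self.recheck_seconds:
            self._median = self.fetch_median_lag()
            self._last_check = now
        # Edit only while the median lag is below the threshold.
        return self._median < self.threshold_seconds


# Usage with canned readings instead of a live lookup.
readings = iter([120.0, 400.0])
gate = LagGate(lambda: next(readings))
decisions = [gate.may_edit(0.0), gate.may_edit(30.0), gate.may_edit(60.0)]
print(decisions)
```

The second call reuses the cached reading, so the gate only notices the higher lag once 60 seconds have passed.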

I just ran into the edit rate limit again, even as an admin, while running only two batches of about 300 QuickStatements lines together with a single manual edit. This edit rate limit is really annoying. It is just asking for people to create a series of sock puppets to bypass it. It seems pretty easy to make them appear to edit from different IP addresses if needed.
Wikidata has been largely built by tools and bots. When I read the Phabricator ticket T184948, I mainly see an issue on the tech side and in the design of Wikidata, and now the users bear the consequences. I can imagine that work is needed on a good solution, but not being able to do simple edits is weird to me.

"they show up there in recent changes and watchlist is delayed" -> Wikidata is designed to contain large quantities of data and that each individual change is shown as one edit. Large quantities of data ask for large edit series. This could have been predicted! If recent changes and watchlists are not working that well, the way how the info on those special pages is shown is wrong. If all the municipalities in Brazil or Germany get this year's population numbers added, this should not be shown as thousands of edits. There is the problem.
"job queue on the clients gets overloaded due to page purges and reparsings" -> that seems like a tech/design problem to me, like nobody could have imagined how successful Wikidata would become and much data getting processed.
"editors can't keep up with the amount of changes happening and meaningfully review/maintain them" -> this is not a serious argument. This was already not possible years ago, and limiting the number of edits in a given amount of time will certainly not solve it. It is also the result of how Wikidata has been designed: every detailed change is one edit. Just creating a Wikidata "stub" easily takes at least 10 changes.

Even though Wikidata now has about 48 million items, it is still in its young phase: there is still no mass adoption by institutions and organisations worldwide, something I think will come within 5 years. This will result in many more edits. Hopefully a workable technical solution is found, without requiring users to do less work to improve Wikidata. Romaine (talk) 01:04, 27 May 2018 (UTC)

- Right now I see one bot adding a new item (for scientific articles) every 2 seconds. This is just one bot. Recently I've seen another bot creating items for people with just two properties: P31=Q5 + OrcidID. The OrcidID link gave no other info than that there was a human with that name, so to me that makes it certain that no one can ever link those items to other items, nor to articles, without more context. But we do get enormous loads of new items, and on the other side we are held back from working with them. I try to give a description to every item, especially since we use these Wikidata descriptions in the mobile app and the type-ahead search results, but if a hundred times more items are created per minute than I am allowed to add data to, it will become an endless exercise. I understand from all the above discussion that we either do not know the bottleneck, or the bottleneck can't be fixed, other than by asking users/bots to not edit anymore. To me it feels that we are being killed by our own success. Edoderoo (talk) 05:57, 28 May 2018 (UTC)

First experiment of lexicographical data is out

Hello all,

After several years of discussing it, and one year of development and discussion with the communities, the development team has now released the first version of lexicographical data support on Wikidata.

Since the start of Wikidata in 2012, the multilingual knowledge base was mainly focused on concepts: Q-items are related to a thing or an idea, not to the word describing it. Starting now, Wikidata stores a new type of data: words, phrases and sentences, in many languages, described in many languages. This information will be stored in new types of entities, called Lexemes, Forms and Senses. It will allow editors to describe precisely all words in all languages, and will be reusable, just like the whole content of Wikidata, by multiple tools and queries, everything that the community creates to play with words. Lexicographical data can be reused inside and outside the Wikimedia projects, and can provide support for Wiktionary.

The first release

A new namespace and several new entity types have been created in order to model words and phrases. If you’re new to this project, you can learn more by looking at the documentation, briefly describing the data model and the interface. The technical structure is set, but the editors remain free to model and organize data as they prefer, with the usual open discussions and community processes that we apply on Wikidata. Some discussions about new properties to create have already started: if you want to be involved in the early stage of the project to shape it, please participate!

Please note that the version that is now deployed is a first experiment that will be continuously improved in the future. Some features are missing, and some bugs may occur. Here are the features that are included in the first release:

Add, edit and delete Lexemes, Forms, statements, qualifiers, references
Link between the different entity types (Item to Lexeme, Form to Item, etc.)
Entity suggestion when adding a property or a value

And the following features will not be included in the first version, but are planned for the future:

Find Lexemes and Forms via Special:Search
RDF support (which also means: the ability to query it with query.wikidata.org)
Support for Senses
Merging of Lexemes
Including the data on other Wikimedia projects, such as Wiktionary

How to try it?

The features described above are now deployed on Wikidata.org. Here are some suggestions of what you can do to explore this new territory:

If you’re not familiar with the structure of Lexemes, have a look at the documentation
Look at what already exists. Please note that Special:Search and the search bar in the top right corner of pages do not support Lexemes yet. We’re working on this.
Create a new Lexeme with Special:NewLexeme
If a property that you need is missing, you can suggest it here
Discuss how to model words and ask questions on Wikidata talk:Lexicographical data
Report bugs or issues that you may encounter: either on the talk page or on Phabricator, if you’re comfortable using it (create a task, add the tag Lexicographical data, and add Lea_Lacroix_(WMDE) as a subscriber)

About mass imports and tools

We kindly ask you not to plan any mass import from any source for the moment. There are several reasons behind that. First of all, as mentioned above, the release is a first version and we need to observe how our system reacts to manual edits before we start considering automatic ones. The system may not be ready for massive imports at the beginning. The second reason is legal. Lexicographical data in Wikidata is released under CC0, and it is the responsibility of each editor to make sure that the data they add is compatible with CC0. For more information, you can have a look at the advice of the WMF Legal team. Finally, we strongly encourage you to discuss with the communities before considering any import from the Wiktionaries. Wiktionary editors have put a lot of effort over the years into building definitions, and we should be respectful of this work, and discuss with them to find common solutions to work on lexicographical data and enjoy the use of it together.

We also suggest that you wait a bit before building tools or scripts on top of lexicographical data. The interface and its API are probably going to evolve during the next months, and the system may not be stable enough to support such tools. We will inform you as soon as it is possible.

Next steps

After this first release, improvements will be made on a very regular basis (new deployments every week). Once you have tried playing with the new data, feel free to give us feedback. We especially want to know which features are most important for you to be worked on next.

What did you experience while editing lexicographical data? What went wrong or was unexpected?
What bugs or troubles during the process did you encounter?
What are the features that are, in your opinion, the most important? Which one should we work on next?

If you’re interested in following the discussions and further announcements about lexicographical data, I encourage you to follow this talk page, where we will discuss how to organize and structure data, new features to be added, ideas of tools and queries, and a lot of other things.

Additional note: with this new kind of data enabled on Wikidata, we expect some new editors to take an interest in it, edit Lexemes, suggest properties or ask questions. They may not be familiar with all of our community processes and our ways to organize content. They will need help and support as well as links to useful resources to understand how the Wikidata community works. I hope that we will all be kind and patient, both with other editors and with the software that may not work exactly as we want it to at the beginning :)

Thanks to the people who tested the model and the interface before the release, who showed support and curiosity about lexicographical data on Wikidata!

If you have any questions or ideas, feel free to write on Wikidata talk:Lexicographical data or contact me. Lea Lacroix (WMDE) (talk) 11:33, 23 May 2018 (UTC)

What's the target date for the "Senses" part?
--- Jura 15:23, 27 May 2018 (UTC)

If everything goes as planned, maximum 3 months. Lea Lacroix (WMDE) (talk) 08:28, 28 May 2018 (UTC)

The number of the championships the winning team have after winning that trophy.

Hello. Example,

⟨ 2017–18 La Liga (Q24529775) ⟩ winner (P1346) ⟨ FC Barcelona (Q7156) ⟩

How can I show that this league win was FC Barcelona's 25th league title? See the template of w:en:2017–18 La Liga, it says "25th title". How can that information be added to the Wikidata item of the league season? May we use number of wins (P1355) as a qualifier?

Xaris333 (talk) 15:08, 25 May 2018 (UTC)

Anyone? Xaris333 (talk) 10:49, 27 May 2018 (UTC)

@Xaris333:, Q9916 uses series ordinal (P1545), I would suggest doing the same.--Ymblanter (talk) 19:56, 27 May 2018 (UTC)

No, I see, it is not what you want. I guess number of wins (P1355) is indeed the solution.--Ymblanter (talk) 19:58, 27 May 2018 (UTC)

Not sure. One could misunderstand that qualifier as the number of wins in that season, as this property typically holds such data. There is also ambiguity in the sense that there may be different countings, so that “25th title” could apply to the 25th title in that league, or the 25th overall championship win, which doesn’t necessarily need to be the same (e.g. the German top-tier Bundesliga (Q82595) was founded in 1963, but there were German championships since 1903). No idea what to do here. —MisterSynergy (talk) 20:14, 27 May 2018 (UTC)

Since 2017–18 Bundesliga (Q28937555) has

⟨ 2017–18 Bundesliga (Q28937555) ⟩ sports season of league or competition (P3450) ⟨ Bundesliga (Q82595) ⟩
series ordinal (P1545) ⟨ 55 ⟩

it means that it counts from 1963. So the title is the 27th for FC Bayern Munich (Q15789). Counting from 1903, it is the 28th title. The English article has both pieces of information: w:2017–18 Bundesliga. We must find a way to have that data in Wikidata. Xaris333 (talk) 20:20, 27 May 2018 (UTC)

Should I propose a property "victory number" or "title number"? It may be used to show the number of the cups/wins/titles/championships the winning team/person had after winning that cup/win/title/championship. Xaris333 (talk) 20:42, 27 May 2018 (UTC)

To my knowledge there is no existing property which really fits, so a proposal for a new property could be a viable approach. Do you plan to use it as a qualifier? How would one distinguish between league title and championship title? —MisterSynergy (talk) 21:06, 27 May 2018 (UTC)

Yes, as a qualifier... I don't know about the second one. I am only thinking of the easy case of

⟨ 2017–18 Bundesliga (Q28937555) ⟩ winner (P1346) ⟨ FC Bayern Munich (Q15789) ⟩
title number ⟨ 27 ⟩
of (P642) ⟨ Bundesliga (Q82595) ⟩

Xaris333 (talk) 21:19, 27 May 2018 (UTC)

Yes, I think it is best to propose the property, sometimes there is good feedback coming from the discussion. Just make it clear this is a win of the team/person, not the overall championship title (which is covered by series ordinal (P1545))--Ymblanter (talk) 05:48, 28 May 2018 (UTC)

If the winners are stored on all such items, can't the number simply be gained by a query that counts the number of past wins of that team? Storing the number "manually" too might be seen as redundant. --Kam Solusar (talk) 07:21, 28 May 2018 (UTC)

Yes and no. Technically it is of course possible, but it requires very high data quality, which I’m afraid we are not able to provide at this time. There is also the problem that you cannot query from Lua modules, so retrieving such data is impossible as long as we don’t write it explicitly to some item. —MisterSynergy (talk) 07:43, 28 May 2018 (UTC)

Done. Wikidata:Property proposal/title number Xaris333 (talk) 15:11, 28 May 2018 (UTC)
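Kam Solusar's suggestion of deriving the title number by counting a team's past wins can be sketched as follows. This is a minimal illustration over made-up placeholder data ("Season 1", "Team A" and so on are hypothetical, not real results), and `title_numbers` is a hypothetical helper. It assumes the list of seasons is complete and in chronological order, which is exactly the data-quality requirement MisterSynergy points out.

```python
from collections import Counter


def title_numbers(seasons):
    """Given (season, winner) pairs in chronological order, return
    (season, winner, n) where n is the winner's running count of titles."""
    counts = Counter()
    numbered = []
    for season, winner in seasons:
        counts[winner] += 1
        numbered.append((season, winner, counts[winner]))
    return numbered


# Placeholder data for illustration only.
seasons = [
    ("Season 1", "Team A"),
    ("Season 2", "Team B"),
    ("Season 3", "Team A"),
    ("Season 4", "Team A"),
]
numbered = title_numbers(seasons)
for row in numbered:
    print(row)
```

A single missing or misattributed winner shifts every later number for that team, and the count also depends on which competition history is included (for example, league seasons only versus all championships).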

HPIP ID (P5094) , changing formatter URL

The actual formatter URL "http://www.hpip.org/Default/pt/Homepage/Obra?a=$1" that links to the Portuguese version, should be changed to the English version "http://www.hpip.org/Default/en/Homepage/Entry?a=$1"; and also "http://www.hpip.org/Default/pt/Homepage" -> "http://www.hpip.org/Default/en/Homepage". May I do it?
---JotaCartas (talk) 06:29, 27 May 2018 (UTC)

What's the advantage of doing that?
--- Jura 09:35, 27 May 2018 (UTC)
- @Jura1: Because many more people can read English, and a non-Portuguese speaker may leave the page immediately before seeing that it can be switched to English. Anyway, I am just a rookie here and perhaps you know better. Regards
  ---JotaCartas (talk) 09:50, 27 May 2018 (UTC)
The content of the English and Portuguese pages is not the same - and it seems like there's more in Portuguese than English - so I think the main links should be left as they are. Ideally we should have both, with qualifiers to indicate the language, but I'm not sure if that's supported for formatter URLs? Thanks. Mike Peel (talk) 11:58, 27 May 2018 (UTC)
- Ok, agree, thank you --JotaCartas (talk) 12:12, 27 May 2018 (UTC)
- I've done that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:22, 28 May 2018 (UTC)
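For readers unfamiliar with formatter URLs: the `$1` placeholder is replaced by the identifier value stored on an item to produce the link. A minimal sketch of that substitution, using the two formatter URLs quoted above; `expand_formatter_url` is a hypothetical helper, the identifier "123" is a made-up placeholder rather than a real HPIP ID, and any URL-encoding of the value is left out.

```python
def expand_formatter_url(formatter_url: str, identifier: str) -> str:
    """Replace the $1 placeholder with the identifier value."""
    return formatter_url.replace("$1", identifier)


PT_FORMATTER = "http://www.hpip.org/Default/pt/Homepage/Obra?a=$1"
EN_FORMATTER = "http://www.hpip.org/Default/en/Homepage/Entry?a=$1"

# "123" is a placeholder identifier for illustration only.
pt_link = expand_formatter_url(PT_FORMATTER, "123")
en_link = expand_formatter_url(EN_FORMATTER, "123")
print(pt_link)
print(en_link)
```

Keeping both formatter URLs, as suggested in the thread, means the same stored identifier can yield a link to either language version.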

Municipality item

Hello. I am trying to complete an item about a municipality but I have some problems.

1)

⟨ Ypsonas (Q3560553)    ⟩ instance of (P31) ⟨ municipalities of Cyprus Republic (Q16739079)    ⟩

and

⟨ municipalities of Cyprus Republic (Q16739079)    ⟩ subclass of (P279) ⟨ administrative territorial entity of Cyprus Republic (Q8435708)    ⟩
⟨ municipalities of Cyprus Republic (Q16739079)    ⟩ subclass of (P279) ⟨ second-level administrative division (Q13220204)    ⟩
⟨ municipalities of Cyprus Republic (Q16739079)    ⟩ subclass of (P279) ⟨ municipality (Q15284)    ⟩

Also (was)

⟨ Ypsonas (Q3560553)    ⟩ instance of (P31) ⟨ Municipal Councilor in Huy (Q29385989)    ⟩
⟨ Ypsonas (Q3560553)    ⟩ instance of (P31) ⟨ community of Cyprus Republic (Q29414133)    ⟩

2)

⟨ Ypsonas (Q3560553)    ⟩ office held by head of government (P1313) ⟨ Mayor of Ypsonas Municipality (Q54263502)    ⟩

Should I add all the mayors to Ypsonas (Q3560553) with head of government (P6) or just the active one? And add all of them to Mayor of Ypsonas Municipality (Q54263502) with officeholder (P1308)?

3)

⟨ Ypsonas (Q3560553)    ⟩ executive body (P208) ⟨ Ypsona's municipal council (Q54263570)    ⟩

There are some constraint violations here.

4)

⟨ Ypsonas (Q3560553)    ⟩ member of (P463) ⟨ Union of Cyprus Municipalities (Q28863484)    ⟩

There is a constraint violation here.

Xaris333 (talk) 21:08, 27 May 2018 (UTC)

@Xaris333: on #2 (I'll let others respond on the other questions), the most useful thing is to make sure that each person who has held the role has a suitable position held (P39) statement, e.g.

⟨ Pantelis Eftychiou Georgiou (Q29473379) ⟩ position held (P39) ⟨ Mayor of Ypsonas Municipality (Q54263502) ⟩

There's a bit of a difference of opinion on whether the head of government (P6) and officeholder (P1308) should be filled for historic mayors: it's certainly not wrong to do so, but it can become quite unwieldy quite quickly if there are more than a handful filled in. (It is, however, important that the current one be marked as 'preferred' if there is more than one.) --Oravrattas (talk) 08:35, 28 May 2018 (UTC)

ScienceSource funded/Facto Post Issue 12

The Wikimedia Foundation announced full funding of the ScienceSource grant proposal from ContentMine on May 18. See the ScienceSource Twitter announcement and 60 second video.

For further discussion, see the newsletter at w:User:Charles Matthews/Facto Post/Issue 12 – 28 May 2018. Charles Matthews (talk) 10:27, 28 May 2018 (UTC)

Reviving the "Property creator" list of users

There are many property creators in the list Special:ListUsers/propertycreator that are no longer active in that role (they haven't created any property in the last six months). I propose to import the text on losing adminship after 6 months of inactivity into Wikidata:Property creators, and to clean up the "Property creator" list of users accordingly. I would also like to issue an invitation for new candidacies as property creator, since we have not had new ones for 4 years.

(I'm pinging also the current property creators: @Ayack, Paperoastro, MichaelSchoenitzer, Emw, Izno: @Mattflaschen, JakobVoss, 99of9, ArthurPSmith, Kolja21: @Tobias1984, Viscontino, Jonathan Groß, PinkAmpersand: @Yellowcard, Thierry Caro, Ivan A. Krestinin, Nightwish62, Fralambert: @Danrok, MSGJ, GZWDer, Joshbaumgartner, Srittau: @Almondega, Jura1, Pintoch:)

Can you please assess if this is the right course of action? --Micru (talk) 10:29, 20 May 2018 (UTC)

As one of those inactive property creators, I would suggest a slight addition: Give inactive property creators a month heads-up warning before the removal. This is what Commons does with their administrators and it actually encouraged me (after taking a break for about a year from admin duties) to become active again. This might work with property creators as well. Sebari – aka Srittau (talk) 10:49, 20 May 2018 (UTC)

Sounds like a good addition to me. Thanks for the comment! --Micru (talk) 11:19, 20 May 2018 (UTC)

1) What problem are you trying to solve? 2) What activity would you require? 3) Would we require the same of admins (who are also property creators)? 4) Do we really want property creators to create some random property every six months?
--- Jura 10:52, 20 May 2018 (UTC)

1) As it is now, we have a list of people registered as property creators that doesn't reflect the truth, and as I see it we need an honest view of who cares about the property creator process. Having a realistic view of the process and of the people caring about it can give us guidance on how to improve the process, or on when to invite new property creators when there are not enough people caring about the process.

2) I would require property creation because it is easy to track. Additional requirements like commenting on properties might be nice too, but it is harder to monitor.

3) Admins are already required to perform admin actions to remain admins. I feel that the overlap between adminship/property creators is not healthy because it gives admins the ability to create a property even if they do not care about which properties fit in our ontology, and that requires focus, and following the discussions. TBH, I do not know if most admins care enough about the property creation process to justify having the right, so in my opinion it would be best if adminship and property creators were two separate groups with no overlap (meaning that an admin can apply to be a property creator but they wouldn't be considered property creators by default).

4) No, what we want is people caring about properties and making an effort to keep that part of Wikidata alive. I propose property creations as a general guideline, but of course if we feel that someone is not really into it we can always propose that they are removed from the property creators group. (sorry to add numbers to your comment, but I felt that it would be easier to address that way all your questions.)--Micru (talk) 11:19, 20 May 2018 (UTC)

I am also interested in helping property proposals get reviewed and created more swiftly. My understanding is that there are basically two sorts of roles that we need to encourage:

Reviewing proposals and marking them as ready/not done when ripe. This requires knowledge of our conventions, understanding of the intent of the proposer, having some experience in designing data models of some sort.
Actually creating the property and transferring the information from the proposal to the property. This is a purely technical and fairly tedious process that I have been trying to automate by parsing the proposal template and automating its transfer as statements on the property. (I might release that code if people are interested - but for now it's just an ugly hack I am not very proud of and not very confident distributing to others)

The valuable task where we need intelligent people is the first role, I think. It technically does not require any special permissions, but maybe it is still useful to encourage people to do this by giving them special rights and a fancy {{User property creator}} user box. Anyway, I think we should not unflag people because they have "just" been commenting on proposals without actually creating any property: that could be received as offensive somehow. − Pintoch (talk) 11:28, 20 May 2018 (UTC)

Not sure if people marking proposals as ready necessarily assess all these. Some proposals are also ready for being marked as "not done". There is always the problem with lengthy comments by users who don't read discussions, won't actually use the property and don't necessarily know how to add statements. All things property creators are aware of, but technical creation might ignore.
--- Jura 11:40, 20 May 2018 (UTC)

I'm not sure if it's a good idea to strip administrators of their property creator rights. As the admin activity requirement is fairly light (many deletions are trivial), requiring much more of property creators doesn't seem a good idea. None of this prevents property creators creating properties that don't reflect the proposal and don't bother assessing the arguments in the discussion. Obviously, periods of inactivity bear the risk that some things changed around the site, but most of us are fairly prudent about this.
--- Jura 11:40, 20 May 2018 (UTC)

What is needed is a group of people that cares about reviewing property creation discussions and approving or rejecting them. That requires commitment and willingness to spend time doing that. Just because someone is an admin does not mean that they will do that, the only thing we can offer is the "badge" that recognizes that someone is approved by the community to do so and that they are willing to do it. I'm ok in automatically granting property creation rights to any admin that so wishes without extra steps, but they do have to want it, if not it is a useless gift that the community is giving to them.--Micru (talk) 11:57, 20 May 2018 (UTC)

I don't think removing after 6 months is helpful to anything. People sometimes get busy and return to projects even years later. Cleanup after 5 years of inactivity yes. But 6 months is for many people a time that can easily go by with real life issues. Maybe we should rather concentrate on better notification for users with roles (A separate bell comes to mind). A lot of things are easily drowned out for me in my email inbox and the wiki notification system. --Tobias1984 (talk) 14:13, 20 May 2018 (UTC)

@Micru: When you say importing the text, are you setting a bar of "5 property creations in the last 6 months" (for admins it's at least 5 admin actions in the time period)? That seems a little high. Also as Tobias suggests, 6 months may be too short a period. Maybe if somebody has done no property creations in the last year they should be considered to be inactive, but at least one a year keeps them in active status? Also if we had an easier way to measure activity in property proposal discussions that would be nice - I know I feel in my role I should review most of the proposals, comment on any I feel I have an interest in, and help improve any that seem to be missing pieces; if that counted toward activity as a property creator that would be helpful. A year or two ago we discussed having a special property creators noticeboard, similar to the Administrators noticeboard; I think in effect the "Overview" page now run by PLbot serves most of that purpose - all property creators should check that regularly, maybe have it on their watchlists. ArthurPSmith (talk) 14:30, 21 May 2018 (UTC)

@ArthurPSmith: After reading your comment and Tobias1984's comment, I would set the bar at "1 property creation in the last year and/or some reasonable activity in the property proposal discussions", I think that is the lowest that the bar can go, acknowledging that sometimes people need to be away from the project and can come back later on, but still asserting that we need some commitment. Also take into account that even if somebody loses the Property Creation right, they can apply for it again, it is not such a big deal. Yes, I agree that the "Overview" page is helpful (I follow it myself) and it should be mentioned in the page about property creators.--Micru (talk) 09:26, 22 May 2018 (UTC)
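To make the suggested bar concrete, here is a minimal sketch of how such an activity check could be expressed, assuming the rule is "at least one property creation in the last year, or some activity in proposal discussions". The function names, the 365-day window, the comment counter, and the sample users are all hypothetical; nothing here reflects an existing tool or real user data.

```python
from datetime import date, timedelta

# Assumed threshold taken from the discussion: one creation in the last year.
ACTIVITY_WINDOW = timedelta(days=365)


def is_active(last_creation, today, recent_proposal_comments=0):
    """Return True if a property creator meets the suggested minimum bar.

    last_creation: date of the most recent property creation, or None.
    recent_proposal_comments: hypothetical stand-in for "reasonable activity"
    in property proposal discussions within the same window.
    """
    created_recently = (
        last_creation is not None and today - last_creation <= ACTIVITY_WINDOW
    )
    return created_recently or recent_proposal_comments > 0


def inactive_creators(records, today):
    """List the user names that would receive a heads-up notice, sorted."""
    return sorted(
        name
        for name, (last_creation, comments) in records.items()
        if not is_active(last_creation, today, comments)
    )


# Made-up sample data, not real user activity.
sample = {
    "ExampleUserA": (date(2018, 3, 1), 0),
    "ExampleUserB": (date(2016, 11, 20), 0),
    "ExampleUserC": (None, 4),
    "ExampleUserD": (None, 0),
}
print(inactive_creators(sample, date(2018, 5, 22)))
```

With this sample, only ExampleUserB and ExampleUserD are flagged. The hard part remains measuring discussion activity, which this sketch simply takes as a given number.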

I have not created a property in at least a year. I am judicious about my use of the PC right, and thus that is precisely why I've not used it recently. I would not want to have to conduct some arbitrary 'action' every so many months just to retain active status. I am an active user and try to be responsive to pings and such. It seems to me the list of property creators is not very long, and so I don't see any pressing need to reduce it. If it were hundreds of users long, and many were inactive and unresponsive, I'd see a point, but given there are currently only 28 on the list (plus some 55 sysops who can also create properties), I think it's pretty manageable without needing an arbitrary cut-off or activity requirement. As for inviting new candidates or encouraging increased participation, I'm all for it. Josh Baumgartner (talk) 16:05, 21 May 2018 (UTC)

@Joshbaumgartner: As mentioned above I would add the clause of "reasonable activity". I still think it is going to be hard to track without some tool, but maybe better leave the door open for wide interpretation. The reason why I am suggesting to reduce the list is not because it is long. The proposed changes have the intention to increase the value of the property creator right so users feel more inclined to participate in the process. The property creation right should be an honor that the community concedes to some members to perform a function that is useful to the community albeit sometimes overlooked. If that honor is treated lightly it loses its value, therefore we need a mechanism to make sure that property creators stand up to what the community requires of them which is some minimal activity. I feel that by not having a way for inactive users to lose their right we are not honoring enough the ones who are active and participating in the property creation process. We can also have a Timeline like Wikidata:Administrators/Timeline so past Property Creators also can feel appreciated for their time served in that position.--Micru (talk) 09:26, 22 May 2018 (UTC)

If you're not willing to use the tool more frequently, especially given that there have been several periods when there has been a backlog of properties awaiting creation, why do you need it at all? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:32, 23 May 2018 (UTC)

@Micru: Are there specific cases you have in mind where people with the property creator right who haven't participated in a while create properties that you think shouldn't be created? If there aren't any examples of such behavior I don't see a reason to remove this right. If there are cases, I think we should discuss the issue more directly. ChristianKl ❪✉❫ 09:38, 29 May 2018 (UTC)

download URL (P4945) is a recent property created by a user who was not aware of our current policies. --Pasleim (talk) 09:47, 29 May 2018 (UTC)

@ChristianKl: Even if Pasleim has given you a potential example, in my view it is not about inactive property creators becoming active again and creating wrong properties, as this happens rarely. For me it is about the list of property creators not being a representation of who are the active property creators. In my opinion every list of users should reflect a reality, and since we don't have any page that shows "active property creators" as we have for active users, then I consider that the maintenance of the list of active property creators should be done manually.--Micru (talk) 09:52, 29 May 2018 (UTC)

Some features temporarily disabled

Due to the database issues encountered yesterday, causing important problems on Wikidata and Wikipedias, some features of Wikidata are temporarily disabled: the property suggester on Wikidata, some Lua modules on Wikidata and the sister projects, search for the ArticlePlaceholder.

We apologize for the inconvenience, we're working on it to get the features back as soon as possible.

For those who are interested in the technical details: phab:T195520, Incident documentation/20180524-wikidata. Let us know if you encounter any other issue. Cheers, Lea Lacroix (WMDE) (talk) 07:16, 25 May 2018 (UTC)

Update: the things listed in the incident report are still disabled, or returning nothing. We will work on transferring the property suggester to ElasticSearch, so we can enable it again. In the next days, we will make some changes on the database tables, then reactivate the features one by one to test them and avoid breaking everything again.

Since today is holiday in the US, we don't expect a lot of changes to happen today. If you encounter any issue, feel free to write a comment in the ticket. Lea Lacroix (WMDE) (talk) 09:22, 28 May 2018 (UTC)

Update: the property suggester is back :) The Lua modules on Wikidata should be fine as well. We're working on the rest. Thanks for your patience! Lea Lacroix (WMDE) (talk) 15:58, 29 May 2018 (UTC)

Drag'n'drop gadget rewrite – feedback welcomed

Hi all, during the Wikimedia Hackathon I managed to more or less finish my rewrite of the Drag'n'drop gadget you can find in Preferences. It's a complete code rewrite that applies the OOUI interface, optimizes the SPARQL queries, adds search and highlighting of potential duplicates, adds a reference to Wikipedia as an imported from Wikimedia project (P143) property, and adds the option to add a file from Commons not only as image (P18). There are some things that still need to be added, but I would like to hear your opinions.

You can enable it by pasting the line below in your common.js file.

 importScript('User:Yarl/DragNDrop.js');

You can also have old Drag'n'drop gadget enabled, there is no clash between them. Cheers, Yarl 💭 20:04, 25 May 2018 (UTC)

First impression: Looking absolutely FANTASTIC! Also feels *a lot* faster. Thank you so much, I will test this extensively over the weekend and get back with any findings 👍 Moebeus (talk) 20:48, 25 May 2018 (UTC)

@Yarl: For those of us who use other sister sites outside those twenty(ish), is there a means to add a local statement in our common.js files for those wikis? — billinghurst sDrewth 22:44, 28 May 2018 (UTC)

@billinghurst: My plan is to include all sister projects, but I don't want to push untested code. Will be done soon, though. Yarl 💭 18:30, 29 May 2018 (UTC)

As the author of the original gadget, I am now using yours ;-) --Magnus Manske (talk) 10:08, 29 May 2018 (UTC)

Coordinates on ceb article for Split Rock County Park (Q49564453) don't work for DnD. --Magnus Manske (talk) 10:16, 29 May 2018 (UTC)

@Magnus Manske: Coordinates and external links are not working yet, I will do my best to add it next week. Yarl 💭 18:30, 29 May 2018 (UTC)

Bot amok

At least five different editors have pointed out recent bad edits of Reinheitsgebot but there has been no response. The bot is having trouble with information on VIAF, especially when a VIAF record for a person uses c. (circa) or f. (floruit) and is misinterpreting these as birth or death dates and adding them to the data item.

As I say, the bot owner has not responded at all, and the bot keeps adding incorrect information. --EncycloPetey (talk) 14:17, 26 May 2018 (UTC)

@EncycloPetey: If the bot is applying false data, then you should be asking for it to be blocked until its scripts are repaired; there is more harm in having false data than in blocking the bot, especially as the bot should be able to iterate over its tracks. I would suggest requesting that block at Wikidata:Administrators' noticeboard. — billinghurst sDrewth 22:48, 28 May 2018 (UTC)
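As an illustration of the failure mode described in this thread, here is a minimal sketch of how an import script might screen a date string before treating it as a birth or death date. It is not the bot's actual code. The marker spellings ("c.", "ca.", "circa", "f.", "fl.", "floruit") and the function names are assumptions, and real VIAF records may use other forms.

```python
import re

# Markers for approximate dates and periods of activity.
# The exact spellings found in VIAF are an assumption; extend as needed.
APPROXIMATE = re.compile(r"\b(c|ca|circa)\b", re.IGNORECASE)
FLORUIT = re.compile(r"\b(f|fl|floruit)\b", re.IGNORECASE)
YEAR = re.compile(r"\b(\d{3,4})\b")


def classify_date(raw):
    """Return (kind, years), where kind is one of
    'floruit', 'circa', 'plain' or 'unknown'."""
    text = raw.strip()
    years = [int(y) for y in YEAR.findall(text)]
    if FLORUIT.search(text):
        kind = "floruit"
    elif APPROXIMATE.search(text):
        kind = "circa"
    elif years:
        kind = "plain"
    else:
        kind = "unknown"
    return kind, years


def safe_to_import(raw):
    """Only plain year strings should become birth/death statements."""
    kind, _ = classify_date(raw)
    return kind == "plain"


for example in ["1876-1936", "c. 1450", "fl. 1320-1350"]:
    print(example, classify_date(example), safe_to_import(example))
```

A floruit range such as "fl. 1320-1350" is then kept out of the birth and death statements instead of being read as a lifespan. Whether circa dates should be skipped or imported with lower precision is a separate policy choice that this sketch does not settle.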

Birth/death at sea

Are birth at sea (Q46998262) and death at sea (Q46998267) useful values for place of birth (P19) / place of death (P20)? They don't really work, since they aren't geographical objects and violate a constraint. I suggest using a specific ocean or sea instead, or World Ocean (Q715269) if unknown. Ghouston (talk) 04:11, 29 May 2018 (UTC)

That already seems to be a common way of doing it, e.g., Edward Smith (Q215786) place of death (P20) North Atlantic Ocean (Q350134). Ghouston (talk) 04:22, 29 May 2018 (UTC)

It's discussed at Wikidata:Project_chat/Archive/2017/12#Born at sea and Died at sea, but that discussion doesn't really support the idea that the two items are useful. Perhaps they should be deleted to avoid confusion. Ghouston (talk) 04:50, 29 May 2018 (UTC)

@Ghouston: References often will say "died at sea" and not give a specific place. So I would agree that the labels "(event type) at sea" are not useful, however, that said we cannot rely on being totally specific, so we may need an ability to say "at sea", event irrespective. An example, someone emigrating from the UK to Australia could have died in any of the North or South Atlantic, Indian, or Pacific Oceans, or any of the subplaces within those, so "at sea" needs to be able to have some geographic ability. Here the reference may be a passenger manifest that has that simple note of "died at sea". — billinghurst sDrewth 08:27, 29 May 2018 (UTC)

Yes, but I mentioned World Ocean (Q715269) which would be sufficiently non-specific, if we can assume at least they didn't die on the Caspian Sea (Q5484) or something. I noticed some items using a value of ocean (Q9430), but that's really a type of thing rather than a particular geographic location, even though it does claim to be a geographic location. Ghouston (talk) 09:18, 29 May 2018 (UTC)

Like on Jimmy Stanton (Q6201347). It fails a constraint because ocean (Q9430) is a subclass and not an instance of a class. Ghouston (talk) 09:21, 29 May 2018 (UTC)

World Ocean (Q715269) is good. It is a geographic location, how specific it is doesn't matter. birth at sea (Q46998262), death at sea (Q46998267) and ocean (Q9430) all have the problem that these are not geographic locations and should imo not be used as value of place of birth (P19) / place of death (P20) --Pasleim (talk) 09:28, 29 May 2018 (UTC)
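The distinction made in the last few comments (a value must be an instance of some class of geographic location, not merely a class itself) can be sketched roughly as follows. This is a toy model of the kind of check a value-type constraint performs, not the WikibaseQualityConstraints implementation. The instance-of and subclass-of links in the tables are illustrative assumptions and are not copied from the live items.

```python
# Toy statements; these links are illustrative assumptions only.
INSTANCE_OF = {
    "World Ocean": {"ocean"},
    "North Atlantic Ocean": {"ocean"},
    "ocean": set(),          # a class: it only has subclass-of links
    "death at sea": set(),
}
SUBCLASS_OF = {
    "ocean": {"body of water"},
    "body of water": {"geographic location"},
    "death at sea": {"death"},
}


def superclasses(cls):
    """Return cls plus every class reachable through subclass-of links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def passes_value_type(value, required_class="geographic location"):
    """True if value is an instance of required_class or of a subclass of it."""
    return any(
        required_class in superclasses(cls)
        for cls in INSTANCE_OF.get(value, ())
    )


for candidate in ["World Ocean", "North Atlantic Ocean", "ocean", "death at sea"]:
    print(candidate, passes_value_type(candidate))
```

In this toy data the two named oceans pass, while "ocean" and "death at sea" fail because neither has an instance-of link leading to a geographic location class.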

If you want to add "at sea" as an alias, then go for it. The only thing is that if it is called as a data field in an infobox, it is going to look ridiculous as the output, so there will need to be the ability to have the response "at sea" somehow. — billinghurst sDrewth 09:43, 29 May 2018 (UTC)

We could create a new "at sea" item which includes the "World Ocean" and the internal seas, and is explicitly intended for use as an event location when more specific info isn't available. Ghouston (talk) 10:42, 29 May 2018 (UTC)

No, they should not be deleted. They are both notable concepts discussed in external sources, and which are used outside of the narrow issue of P19/P20. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:47, 29 May 2018 (UTC)

@Pigsonthewing: Please expand on these thoughts explaining the use, and how the proposed alternative does not function. Also please explain when something says "location" in an infobox it is returning "death location: death at sea", instead of "death location: at sea" . — billinghurst sDrewth 09:58, 29 May 2018 (UTC)

What "proposed alternative"? As for "explaining the use", please use the "What links here" tool. I have no idea how your final sentence relates to my comment; perhaps you misread it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:39, 29 May 2018 (UTC)

Suggested new merge tool

Could someone write a script that adds a "merge" button to the items on the sub-pages of User:Pasleim/projectmerge, which performs that task when clicked?

Or otherwise, a tool that puts a preview of the two items in a side-by-side view, with a merge button between them? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:45, 29 May 2018 (UTC)

There is https://tools.wmflabs.org/wikidata-game/distributed/#game=33 --Pasleim (talk) 09:29, 29 May 2018 (UTC)

Can I suggest that is a horrid idea. As someone who has dealt with 0000s of these merges, I can say they need exploring and real review, not a high-level, no-visibility merge. Category to category suggestions have a number of variations that may need to take place. Matches from wikipedias to sisters are not so clear, be it wikinews or wikisources, etc. There is merge ability at the top level of pages, and there are move tools for interwiki level, and they do the job beautifully, so you can open two wikidata pages and do first level review, possibly using the side popout infoboxes, or you can drill to the targets to review. — billinghurst sDrewth 09:51, 29 May 2018 (UTC)

All of which is more complicated - and more burdensome - than doing the same job in the tool I requested. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:37, 29 May 2018 (UTC)

Wikidata weekly summary #314

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- Closed request for adminship: Pintoch, Addshore (both successful)
- New request for comments: Why do we have an item for dogs and another one for Canis lupus familiaris?

Events
- IRC office hour, on May 29th at 18:00 (UTC+2, Berlin time), on the IRC channel #wikimedia-office. Special topic: Lexemes on Wikidata

Press, articles, blog posts
- Paper on Cellosaurus, with mappings to Wikidata, by Amos Bairoch
- Blog post from Galder Gonzalez announcing the Lexemes (in Basque)

Other Noteworthy Stuff
- Ongoing: On 24 May, there was a significant outage affecting Wikidata and sister projects that use Wikidata. As a result, some features are temporarily disabled: Wikidata's property suggester, Lua modules and parser functions calling by label instead of ID, search for the ArticlePlaceholder. We apologize for the inconvenience, we're working to get them back as soon as possible. For technical details, see: phab:T195520 & Incident documentation/20180524-wikidata.
- As announced last week, the “integer” constraint type and the “separators” parameter for the “single value” and “single best value” constraint types are now supported in WikibaseQualityConstraints.
- Lexicographical data is now available on Wikidata! Check the announcement for more details. Feel free to try adding words and give feedback
- Structured Data on Commons has designs for displaying and using multilingual captions on the file page. Feedback is welcome on the talk page.
- You can try the new Drag&Drop gadget developed by Yarl and give feedback
- The European Commission announces a review of the Database Directive.
- OpenRefine 3.0 beta was released. You can get an overview of the new Wikidata-related features with tutorials and videos.
- TextRazor is a web service which analyses text and identifies the entities and concepts discussed, giving the corresponding Wikidata QIDs.

Did you know?
- Newest properties:
  - General datatypes: grammatical gender, conjugation class, word stem, Sandbox-Lexeme, Sandbox-Form, synonym, derived from, Wikidata property example for lexemes, Wikidata property example for forms, officialized by, Wikidata dataset import page, output method, IMDA rating, adapted by, topographic map, date of commercialization, stroke count, has conjugation class
  - External identifiers: CIVICUS Monitor country entry, Relationship Science organization ID, JMA Seismic Intensity Database ID, Eurohockey.com club ID, Daum Encyclopedia ID, Melon song ID, ASC Leiden Thesaurus ID, British Library system number, eBird hotspot ID, BAG public space ID, BAG building ID
- New property proposals to review:
  - General datatypes: Vocalized form, display technology, compound of, Chromosome number, Vietnamese character reading pattern, fanqie, evokes, homograph lexeme, homograph form, prime factor, classifier, Accomplice, Slavic Alphabet, signum
  - External identifiers: BMRB ID, ICSC ID, Baidu Baike ID, Tree of Life Web Project ID, Argentine biography deputy ID, B.R.A.H.M.S. ID, Chromosome numbers of the flora of Germany database ID, Filmow ID, Cité de la musique ID, Giant Bomb ID, OpenCorporates corporate grouping, Artists in Canada record number, RollDaBeats ID, Songfacts ID, ARWU ID
- Properties not really used, created >3 months:
  - Inventories of American Painting and Sculpture control number (P4814), Amtrak station code (P4803), make-up artist (P4805), sets environment variable (P4809), Technical Element Score (P4815), Rugby Australia ID (P4799), TORA ID (P4820), Panoptikum identifier (P4818), Cour des comptes magistrate ID (P4821), La Poste personality ID (P4822)
- Query examples:

Development
- Deployed and activated WikibaseLexeme on wikidata.org so you can now store the first lexicographical data on Wikidata (phab:T191457)
- Fixing the encoding issues on labels of items linked on Lexemes (phab:T195470, phab:T195359)
- Fixed an issue that was preventing adding Forms and Lexemes in statements (phab:T195402)
- Suppressed the browser's autocomplete that covers WikibaseLexeme's suggestion on Special:NewLexeme (phab:T195383, phab:T191526)
- Worked on a bug about representation overwriting other representation with the same language code (phab:T193636)
- Changed title of the field of a lemma language to make it less likely for people to add a translation as a second Lemma (phab:T193603)
- Working on the RDF mapping of WikibaseLexeme (phab:T160260)
- Working on implementing fulltext search for Lexemes (phab:T189739)
- Working on showing Lemmas for linked Lexemes instead of just their ID on special pages like Special:AllPages (phab:T195382)
- Fixing issues that happened after dropping an index from the wb_terms table (phab:T194270, phab:T195642, phab:T195611)
- Made constraint check result appear directly after adding a new statement (phab:T194247)
- Working on looking up entities by external identifiers on Special:Search (phab:T99899)
- Added Docker image to Wikibase website (phab:T189936)
- Added WikibaseImport script to Docker images to make it easier for people to start their own Wikibase install with some data imported from Wikidata (phab:T192080)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 13:20, 29 May 2018 (UTC)

Needed: Easy way to add character role (P453) to films etc.

P453 has data type wikibase-item for now. This is why every single role in a film etc. needs its own item. Adding them for a whole cast is quite intense. Can't we build a tool for that? I use, for example, moveClaim.js by Matěj Suchánek, which is very useful. I would like to have a tool which

creates a new item with given label (per popup) in a default language
adds instance of (P31) fictional character (Q95074) / instance of (P31) fictional human (Q15632617) and given sex or gender (P21) (per popup)
adds present in work (P1441) based on current film item
adds performer (P175) with current actor statement, on which the script is executed
adds character role (P453) with new item as qualifier

Can anybody write such a script? This would be very very useful. Queryzo (talk) 13:33, 29 May 2018 (UTC)
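The requested steps can be sketched as a plain data plan, shown here in Python only to make the order and content of the edits explicit. It does not call any Wikidata API, and the function name, the dictionary layout, and the "<new item>" placeholder are hypothetical. An actual user script would still have to perform the item creation and the qualifier edit through the API.

```python
# Property and item IDs are the ones named in the request above.
P_INSTANCE_OF = "P31"
P_SEX_OR_GENDER = "P21"
P_PRESENT_IN_WORK = "P1441"
P_PERFORMER = "P175"
P_CHARACTER_ROLE = "P453"
Q_FICTIONAL_CHARACTER = "Q95074"
Q_FICTIONAL_HUMAN = "Q15632617"

# Placeholder: the real ID is only known after the item has been created.
NEW_ITEM = "<new item>"


def plan_character_role_edits(film_id, actor_id, label, language,
                              fictional_human=True, gender_id=None):
    """Return (new_item, qualifier) describing the edits, without any API call."""
    new_item = {
        "labels": {language: label},
        "claims": [
            (P_INSTANCE_OF,
             Q_FICTIONAL_HUMAN if fictional_human else Q_FICTIONAL_CHARACTER),
            (P_PRESENT_IN_WORK, film_id),
            (P_PERFORMER, actor_id),
        ],
    }
    if gender_id is not None:
        new_item["claims"].append((P_SEX_OR_GENDER, gender_id))
    # Qualifier to add on the film's existing statement whose value is the actor.
    qualifier = {
        "on_item": film_id,
        "statement_value": actor_id,
        "qualifier_property": P_CHARACTER_ROLE,
        "qualifier_value": NEW_ITEM,
    }
    return new_item, qualifier


# Example using item IDs that appear elsewhere on this page.
item, qual = plan_character_role_edits(
    film_id="Q20856802", actor_id="Q147077", label="Mia Dolan", language="en"
)
print(item)
print(qual)
```

Keeping the plan separate from the API calls would let such a tool show a preview in the popup before anything is saved.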

Wikidata:WikiProject Eurovision

Let me present a new WikiProject at the recent agenda. There are some questions to discuss at talkpage too. --Infovarius (talk) 13:43, 29 May 2018 (UTC)

You may have a look there. A property is waiting to be used. Thierry Caro (talk) 13:55, 29 May 2018 (UTC)

Addition that was taken down

Hello:

My name is Christopher Malcolm. I was a winner on NBC's The Wall Season 2 Episode 14, along with my brother Prince Malcolm. I was told that we both appeared on our high schools' notable alumni lists, Eleanor Roosevelt (Greenbelt, MD) and High Point High (Beltsville, MD). I am not sure why it was taken down, but I will say that it was a very significant accomplishment and deserves to be recognized. We have represented not only ourselves, but also, as products of them, our family, Prince George's County Public Schools, DCPD and the State of Maryland. My email is christophermalcolm5@hotmail.com if you have questions. I have other articles and evidence to show how significant this appearance/win was. – The preceding unsigned comment was added by 64.26.97.60 (talk • contribs) at 29. 5. 2018, 15:07 (UTC).

Does anybody know what item the IP was referring to? Matěj Suchánek (talk) 19:04, 29 May 2018 (UTC)

presumably Eleanor Roosevelt High School (Q14691919) and High Point High School (Q5756201). This is an enwiki thing of course, though it seems doubtful they would qualify for wikidata notability either. ArthurPSmith (talk) 19:55, 29 May 2018 (UTC)

Two different QIDs for Curator?

I have noticed there are two different QIDs for Curator. curator (Q674426) museum curator "content specialist charged with an institution's collections and involved with the interpretation of heritage material" and exhibition curator (Q780596) art curator "person in charge of organising an exhibition". The first (museum curator) has 6 identifiers, and 32 wiki pages, and 1688 wikidata items use it. The second has no identifiers and 7 wiki pages, and 347 wikidata items use it; these wiki pages appear to mostly be stubs. As part of our Art+Feminism effort, we have a number of curators we are adding occupation (P106) data to, and we want to do it correctly. Two questions:

What is the difference between these two? Is art curator an attempt to distinguish non-art from art curators? Or is it Harald Szeemann's idea of an "Ausstellungsmacher"? Does this mean non-institutional curators, like gallerists, or independent curators?
Which should we be using for our purposes?

For me, it feels like there should be some clarification here. Either they merge into one, which seems unlikely given there are different wiki pages linked, implying a meaningful difference. Or there should be significant distinctions made between them, and probably there should be more values added to go along with them, with some clear relationships and/or structure. Pinging Fnielsen as you have discussed this on one of the talk pages. --Theredproject (talk) 16:44, 29 May 2018 (UTC)

I regard curator (Q674426) as a position in an art gallery (the primary Danish label is currently "museumsinspektør" ~ "Museum inspector"), while exhibition curator (Q780596), I regard, as a person that is not necessarily employed in an organization but might be contracted for an exhibition (the German word is "exhibition maker"). For instance, I would say that Kasper Monrad (Q17287428) was employed in some form of curator (Q674426) at Statens Museum for Kunst (Q671384) (actually some kind of "senior inspector") while he acted as an exhibition curator (Q780596) for A beautiful lie - Eckersberg (Q22666748). — Finn Årup Nielsen (fnielsen) (talk) 17:03, 29 May 2018 (UTC)

So someone employed for an exhibition would be called an "independent curator" in the US. Which is kind of different from Szeemann's exhibition maker idea. The exhibition maker idea is predicated on authorship. --Theredproject (talk) 21:27, 29 May 2018 (UTC)

This is one example of a meta-issue where the world is not easily modelled due to operating on a continuum, with the added dimension of national and cultural differences. I raised one here a while ago, regarding crèche/nursery/child-care, which was archived unresolved. There are many more. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:59, 29 May 2018 (UTC)

IRC office hour today

Hello all,

As a reminder, we will have an office hour today, from 18:00 to 19:00 (UTC+2, Berlin time). It will take place on the channel #wikimedia-office.

First, we will present what the Wikidata development team has been working on during the last months, and collect your questions and feedback. The second part will be dedicated to lexicographical data on Wikidata, and the first release that happened last week. We will give some news but again, keep time for your questions and suggestions.

See you there! Lea Lacroix (WMDE) (talk) 10:10, 29 May 2018 (UTC)

Update: the log. Lea Lacroix (WMDE) (talk) 06:29, 30 May 2018 (UTC)

@TomT0m: thanks for participating on behalf of the non-devs ;)
--- Jura 11:38, 30 May 2018 (UTC)

Instance of

What is the proper "instance of" for mayors of Essex County, New Jersey (Q50054133)? Also related, where does the Wikipedia category belong to? at Mayor of Newark, New Jersey (Q20899061), the position; or list of mayors of Newark, New Jersey (Q16147012), the list? --RAN (talk) 15:33, 29 May 2018 (UTC)

Hard to answer the first question until we know what the definition of the term is. Will it be a list, like list of mayors of Newark, New Jersey (Q16147012), or is it perhaps a shop - Mayors of Essex County, New Jersey, for all your haberdashery requirements? But at a guess, it might follow the pattern of P31s in the two Newark items. As to the category; probably both items will benefit from it. --Tagishsimon (talk) 16:03, 29 May 2018 (UTC)

If you mean the "job" that someone has, then the item label should be singular, "mayor of....", not plural "mayors of...". If you refer to a "List of mayors", then it's an instance of a list. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:03, 29 May 2018 (UTC)

The instance shouldn't be mayor of a place in New Jersey (Q42688505), in any case, since a group of mayors isn't an instance of a mayor. It's not a list either, based on the items linking to it. It has a subclass statement, so doesn't need to be an instance of anything, or it can be an instance of third-order class (Q24017465) if you like. Ghouston (talk) 01:53, 30 May 2018 (UTC)

Yes, second-order metaclass (Q24017465) is what I am looking for. Thanks! --RAN (talk) 04:39, 30 May 2018 (UTC)

What's the benefit of having "mayors of [place]" in addition to "list of mayors of [place]" and "Mayor of [place]"?
--- Jura 08:13, 30 May 2018 (UTC)

Emma Stone

Emma Stone (Q147077) has won an Academy Award for Best Actress (Q103618) for her role in La La Land (Q20856802). I've just created the item about this role, which is Mia Dolan (Q54467082). Where should it go in the artist's specific award received (P166) statement? Should it replace the movie as the value of for work (P1686) or is it OK to store it as the value of a character role (P453) qualifier? Even better, should it not be stored there at all? Thierry Caro (talk) 07:55, 30 May 2018 (UTC)

character role (P453) sounds fine to me, even though there is a constraint that doesn't allow award received (P166). I think keeping the movie in is nice. Husky (talk) 08:20, 30 May 2018 (UTC)

@Husky: OK. Would you also add character role (P453) → Mia Dolan (Q54467082) as a qualifier under the winner (P1346) → Emma Stone (Q147077) statement on Academy Award for Best Actress (Q103618)? Thierry Caro (talk) 08:37, 30 May 2018 (UTC)

Suppose she returns to play the same role in a sequel. She wouldn't have the Academy Award for the role, but for the first film. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:37, 30 May 2018 (UTC)
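
As a rough illustration of the modelling discussed in this thread, the award statement with both qualifiers could be pictured as below. This is only a sketch: the dictionary layout and the helper `qualifier_values` are hypothetical and are not the actual Wikibase JSON format. The item and property IDs are the ones quoted above.

```python
# Simplified, hypothetical picture of one statement with qualifiers.
# This is NOT the Wikibase JSON format; it only mirrors the discussion above.
award_statement = {
    "subject": "Q147077",        # Emma Stone
    "property": "P166",          # award received
    "value": "Q103618",          # Academy Award for Best Actress
    "qualifiers": {
        "P1686": ["Q20856802"],  # for work: La La Land (the film stays in)
        "P453": ["Q54467082"],   # character role: Mia Dolan
    },
}


def qualifier_values(statement, prop):
    """Return the qualifier values stored under prop, or an empty list."""
    return statement["qualifiers"].get(prop, [])


print(qualifier_values(award_statement, "P1686"))
print(qualifier_values(award_statement, "P453"))
```

Keeping the film under for work (P1686) and the role under character role (P453) keeps the award tied to the first film even if the character later appears in a sequel.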

Website for districts of a city

Hi all,

I just wanna ask you if there is a possibility to add an official webpage to a district of a city? Up until now I have only seen this property on items for whole cities, not for parts of them.

Thank you and regards! – The preceding unsigned comment was added by 2001:67c:2184:410:d95b:25f5:66be:18ba (talk • contribs) at 12:42, May 30, 2018‎ (UTC).

Yes, districts can have their own wikidata items and wikipedia pages. For example Lintong District (Q194635) is part of the city of Xi’an in China. ArthurPSmith (talk) 12:57, 30 May 2018 (UTC)

Thank you for the answer. I didn't mean a Wikipedia page but an external link to a website. For instance: Berlin is linked to berlin.de. I would like to link an official domain to a district. This is not possible for districts, I guess.

official website (P856) can be added to any Wikidata item, there is no constraint for this property regarding the type of the item. So if that district has an official website, don't hesitate to add it. Ahoerstemeier (talk) 14:08, 30 May 2018 (UTC)

It's probably not showing up in the generated suggestions on districts.... Sjoerd de Bruin (talk) 14:21, 30 May 2018 (UTC)
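
For anyone who prefers to add the statement through the API rather than the item page, here is a minimal sketch that only builds the request parameters for the `wbcreateclaim` action and does not send anything. The helper name, the placeholder URL and the placeholder token are assumptions made for illustration. The item ID is the district mentioned above, and a real edit would need a valid CSRF token and the district's actual official website.

```python
import json


def build_official_website_claim(entity_id, url, csrf_token):
    """Build parameters for a wbcreateclaim request adding official website (P856).

    Hypothetical helper: it only assembles the parameters and sends nothing.
    """
    return {
        "action": "wbcreateclaim",
        "format": "json",
        "entity": entity_id,
        "property": "P856",          # official website
        "snaktype": "value",
        "value": json.dumps(url),    # the value is passed as a JSON-encoded string
        "token": csrf_token,
    }


# Placeholder URL and token, for illustration only.
params = build_official_website_claim(
    "Q194635", "https://example.org/", "PLACEHOLDER_TOKEN"
)
print(params["value"])
```

The same parameters work for any item, which matches the point above that P856 is not restricted by item type.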

Importing datasets under incompatible licenses

I learned on a Phabricator ticket about at least one dataset where the uploader (User:Pintoch) states that the imported dataset, RNSR, is not under a CC-0 license. The ticket claims that this is a frequent occurrence (and lists one other example, PubMedCentral, which I haven't dug into in detail yet).

I would like to ask the community to discuss this issue and to decide on remedial steps. Here are a few possible suggestions - feel free to add some yourself. I was hoping that we already have policies for all of these in place - I didn't check, I just saw the contributions on Phabricator and shook my head. If all of these policies are already in place, then I would hope that we actually enforce them.

Proposal 1: when the proposal for importing a dataset is made on Wikidata:Dataset Imports, we need to have a field that explicitly discusses the license of the dataset to be imported, or explains why this is not needed. No dataset import may be approved without checking the license of the dataset for compatibility.

Proposal 2: contributors who knowingly import data from datasets with an incompatible license should be warned, and if warnings don't help, blocked.

Proposal 3: data imported from datasets with incompatible licenses should obviously be removed.

The whole thing has a hook that is massively problematic: importing a dataset with an incompatible license is not allowed - but referencing a dataset with an incompatible license on a statement, is. So if we get data from dataset A, which has a CC-0 or PD license, and then add references to dataset B, which has a CC-BY-SA or proprietary license, that's totally OK. But we can't import from dataset B directly.

What do people think? --Denny (talk) 01:09, 8 May 2018 (UTC)

P.S.: there are a number of side discussions that I would like to avoid, in particular about whether certain datasets are licensable at all. E.g. one could argue that the RNSR cannot be licensed by the French government, because it is incidental data that they need to have anyway (similar to timetable data for a public transport organization). These are important discussions, and I would be happy if we as a community decided to test the laws in such cases, as we did for the monkey selfie, but let's keep it simple for now and just assume that if an organization publishes a dataset under a specific license that this license does actually apply. --Denny (talk) 01:09, 8 May 2018 (UTC)

From an Italian point of view, we decided NOT to bulk import ANY dataset released under CC BY or above, because it is still not clear what the consequences of this might be. This doesn't mean that those datasets cannot be cited as sources, though, since single data cannot be copyrighted.

There is also the more general topic of "Italian institutions are really afraid of CC0, and most of them consider CC0 to be too American-like (sic!), so they prefer CC BY, also because it offers them more legal protection in case of citation", but I don't know if it's the right place, so I'll just leave that there. --Sannita - not just another it.wiki sysop 07:57, 8 May 2018 (UTC)

« if an organization publishes a dataset under a specific license that this license does actually apply. » then you can *never* import even a single datum from a French database, as the BY is *always* mandatory in France if you strictly follow the French law. Compatibility is never an easy thing, which is why I fear that your proposals 1 and 3 should be less strong and more pragmatic (I don't want all the names and codes of French cities removed just because they are under copyright of the Code officiel géographique (Q2981593)). Cdlt, VIGNERON (talk) 08:16, 8 May 2018 (UTC)

I have the feeling that the problem with all three proposals and data imports in general is that there is nobody around who really knows what is legally allowed. Most datasets don't have a compatible license but one can argue that factual data contained in a database is not protected even if the database itself is protected. [16]. To implement any of the above proposals we first need people who can give legal advice. --Pasleim (talk) 08:57, 8 May 2018 (UTC)

As far as reusing existing data banks is concerned, only those whose terms legally allow them to be (re)licensed under CC-0 can be used, and CC-BY certainly does not, so it is more prudent not to attempt such an import. Are there online resources with official statements from Italian institutions such as the one you mention? Single data probably can't be copyrighted in general, but that's not the topic. This discussion pertains to "substantial parts" of data banks, which can be subject to restricted use on various legal grounds. Concerning the patrimonial right in France that @VIGNERON: points to, I think that CC-0 does indicate that people waive their rights as extensively as the law allows, but not more (obviously). So, as far as a work is covered by droit d’auteur (let's say copyright as implemented in France), attribution is a legal requirement for derivative works (there are also limits and exceptions to that, but they don't apply to a wide public project such as Wikidata). That doesn't mean that you can't use CC-0 to release a work, but that people who reuse your work are legally required to respect proper attribution. Thus I guess that, to make Wikidata really useful for French people when dealing with copyrighted material, Wikidata should always provide a way to retrieve proper attribution, to put these cheese eaters (🤣) in a position to respect their local law. --Psychoslave (talk) 14:01, 8 May 2018 (UTC)

I'd suggest Proposal 4: switch Wikidata to a CC BY-SA license. This would solve license incompatibilities in imports and between Wikidata and other Wikimedia projects, fix Wiktionary integration (ie, not having to do it all again from scratch) and ensure that the data we work so hard for does not end in a proprietary silo. Three birds with one stone. NMaia (talk) 10:39, 8 May 2018 (UTC)

This is not really an option. CC0 is a "one way street". Once you have released your work under a CC0 license you cannot go back and claim some copyright, for example with a CC-BY license. See Who can use CC0 section on the CC0 Faq from Creative Commons. The only realistic option would be to create a Wikidata 2 and build it again from the sources (Wikipedia etc, but not the current Wikidata)... Robevans123 (talk) 12:02, 8 May 2018 (UTC)

While it's true CC0 data can't go back to being copyrighted, new content can be released under new, more protective terms. NMaia (talk) 12:37, 8 May 2018 (UTC)

Well, assuming a dataset was illegally imported into Wikidata, it was never legally made available under CC0. So it's not about going back, it's just about regularizing what is provided in Wikidata, for example by moving this dataset out of Wikidata or changing the license policy (other ideas are welcome, of course). --Psychoslave (talk) 15:16, 8 May 2018 (UTC)

Is anything imported from Wikimedia projects actually copyrightable? Simple facts are not. Ghouston (talk) 11:15, 8 May 2018 (UTC)

Wiktionary definitions would be copyrightable and importable if we used a compatible license, for instance. NMaia (talk) 12:37, 8 May 2018 (UTC)

At the very least it is a really doubtful situation, which does not help when one wants to be able to honestly promote Wikidata as a database fully under CC-0 license that is reusable by anyone under the terms of this license without fear of any legal issue. --Psychoslave (talk) 14:11, 8 May 2018 (UTC)

Ghouston yes, the vast majority of Wikidata data is clearly non-copyrighted facts, but maybe not all (I'm thinking of certain string and text properties, where in some very limited cases one can raise the question of legality). Psychoslave you do know that zero-risk doesn't exist (or just as a bad decision-making bias), and for most of the Wikidata data the risk of issue is so limited that it can easily be ignored (and AFAIK, there was no issue in 5 years); if not ignored, it's easy to just take a look at the sources for reassurance. Cdlt, VIGNERON (talk) 08:15, 9 May 2018 (UTC)

I agree with @VIGNERON: that zero-risk doesn't exist. But on the other hand, being under an acceptable threshold of uncertainty is possible and should be targeted. What are the hypotheses and metrics on which you are grounding the conclusion that "the risk of issue is so limited, it can easily be ignored"? I don't share this conclusion, because there are laws which raise serious doubts about what can be transferred from one data bank to another. That we haven't faced any court judgment in five years is not proof that we don't break the law. I think we don't need to wait for issues to reach such a critical state to tackle them. For example @Denny:, as an official WMDE team member, admitted as early as 2012 that large extraction from Wikipedia could not be allowed. But today there are around 50M statements in Wikidata which were directly imported from Wikipedia. So either some convincing arguments were raised that changed this official legal point of view, or we are clearly and consciously making imports which are considered disallowed. Or there may be another possibility that I didn't envision. Anyway, it would be important to make an official statement about that; is there any official statement @Lydia_Pintscher_(WMDE):? And if this does apply to Wikipedia, how would this not apply to other data banks whose significant parts were extracted and imported into Wikidata (except when they are clearly under CC-0 compatible terms, of course)? --Psychoslave (talk) 07:25, 10 May 2018 (UTC)

I think Denny got it wrong back then, or his words weren’t precise enough. Just because some data is included in a CC-BY-SA-licensed work (as Wikipedia) doesn’t mean that this data is copyright-able at all. The license protects creativity and originality, but it does not protect each and every word/data/aspect in the work. This is a thing with other seemingly incompatible works as well: just because someone claims to have copyright for a work doesn’t necessarily mean that the work actually enjoys any copyright protection. So: I don’t understand how you can worry about Wikipedia imports, and I don’t consider them disallowed. —MisterSynergy (talk) 07:48, 10 May 2018 (UTC)

The problem is not whether every single datum included in Wikipedia is copyrightable as a single element; of course they are not. But articles as a whole, and Wikipedia as a whole, are copyrighted and licensed. It's substantial extraction of this copyrighted material, licensed under CC-by-sa, which is a legal issue. It would not be an issue to provide identical data obtained from miscellaneous other sources, provided that no similar irregular massive extraction was involved. It would not be an issue to provide this massively extracted and imported data under a CC-by-sa license. Wikipedia instances constitute an original collection of data. If Wikidata doesn't like the conditions under which Wikipedia provides data, the option of getting data by other means compatible with CC-0 is open. At the end of the day, the claim that Wikidata is under CC-0 is simply not reliable enough to convince people who are serious about license issues, such as the OSM community. What is the point of claiming a CC-0 license if downstream users can't rely on that statement? --Psychoslave (talk) 20:31, 10 May 2018 (UTC)

Do you have a definition what a substantial extraction is? To my knowledge this is not clearly defined (or even completely undefined), just as many other details about database rights have not yet been well contested in court. —MisterSynergy (talk) 21:30, 10 May 2018 (UTC)

I agree that substantial part is not clearly defined, at least in the European directive. But who would believe that 50M statements, for the sole case of Wikipedia imports into Wikidata, is not a "substantial part"? Debating what would constitute a trigger of concern for Wikidata could be an interesting idea. Indeed it would be good to give contributors a clear decision on the threshold they should not exceed. --Psychoslave (talk) 07:04, 11 May 2018 (UTC)

@Psychoslave: A couple of comment pieces that might interest you on what may constitute a 'substantial taking' in (EU/English) copyright law: [17][18]. A key point is that it is the quality rather than quantity of what is taken that is most important -- reproduction of a large quantity of material may nevertheless not be a substantial taking, if the material taken contains little of creative originality. This is surely the case with most of what has been taken from Wikipedia -- what has been taken is not original expression, but established facts that had themselves been taken from other pre-existing sources. Was there a particular creative originality or judgement or intelligence in the selection of those facts for Wikipedia, that Wikidata is piggy-backing on? In most cases I would suggest not -- they are simply what people happen to have shoveled into Wikipedia infoboxes, from established sources. In a few cases, there might have been some judgement shown -- eg to prefer one date over another -- but I would suggest that Wikidata's limited number of such takings, especially if accompanied by referencing to Wikipedia, would fall squarely within the allowance for 'quotation' in EU law, and well within the scope of fair use in United States law.

A second, practical, point (which one might not seek to rely on in trying to attain the high moral ground, but which probably pertains all the same) is that, as per en:Wikipedia:About#Trademarks_and_copyrights, 'contributions remain the property of their creators', they are not signed over to WMF or anyone else, so it is only individual creators that have the standing to assert copyright infringement. Per the above, it would seem extremely unlikely that an individual creator could assert that what had been taken in Wikidata was a 'substantial' taking from their work; such that even if that did happen to be the case, and the Wikipedia user could establish EU jurisdiction (rather than US), it's such an unlikely event that the most practical approach would be to wait for a concrete takedown request. Jheald (talk) 17:45, 11 May 2018 (UTC)

Thank you for the links @Jheald:. It might also be put in perspective with comments on the phabricator ticket which discuss whether Wikipedia should be considered an original information collection with copyright backed by the selection theory, and who could claim copyright infringement, and How Does Feist Protect Electronic Data Bases of Facts? which concludes the selection theory grants copyright protection in a factual compilation based on originality in the compiler's selection. In framing this standard, the Court reiterated the low threshold of originality required to invoke copyright protection.

As for the second point, it really applies to the specific case of Wikipedia, and we might open a new subsection below as was done for OSM. It raises two other points: 1. should we change the terms of use of Wikimedia services to grant the WMF rights to defend Wikimedia community copyrights in court (which could be delegated to recognized affiliates), 2. other data banks might have a well identified single moral person holding the copyright, so a clear statement should be decided anyway on what should or should not be imported from other data banks depending on their licenses, not on our sole guess that in practice probably no one will bother with the hurdle of going to court. --Psychoslave (talk) 04:06, 12 May 2018 (UTC)

I am not a lawyer and can therefore not make an official statement on legal matters, sorry. However there is m:Wikilegal/Database Rights from the WMF legal team which I rely on. --Lydia Pintscher (WMDE) (talk) 10:31, 10 May 2018 (UTC)

Thank you @Lydia Pintscher (WMDE):, it's always nice to recall existence of this document for those who didn't read it yet, but it doesn't give an answer to whether Wikidata will continue to officially ignore concerns about legal issues or not. So basically, what is the Wikidata official interpretation of this text of legal "preliminary perspective", and how it indicates what should apply for extraction of substantial part of data banks and their importations in Wikidata? If the product manager for Wikidata is not the person who can answer this, who should we contact, or which defined process should we follow to expose a clear explicit decision on this point? Thank you in advance for any suggestion on this. --Psychoslave (talk) 21:27, 10 May 2018 (UTC)

On this point I would rather make Proposal 4.1: let contributors record the license that applies to each piece of data depending on its source (or a set of licenses chosen by the contributor if this is original data). Of course only free licenses should be accepted, so this wouldn't remove the need for massive imports to respect the license policy, just shift it, but it would already make Wikidata far more flexible and useful than it currently is with its CC-0 only license. Alternatively we might have Proposal 4.2: let the community launch its own Wikidata instances to host whatever data they want in a relational database under whichever free license fits their needs. But it appears to me that at least this last proposal is really outside the focus of this section.--Psychoslave (talk) 15:16, 8 May 2018 (UTC)

For the sake of precision on Italy: CC-BY-3.0-IT (the non generic version) is fine because it waives sui generis database rights. For other imports, it matters whether there is any copyright (on the individual pieces of data) or database right. On the topic, see m:Wikilegal/Database Rights. --Nemo 15:14, 8 May 2018 (UTC)

That makes me wonder, especially given what was additionally said in this section, "does Wikidata provide sufficient information about its data sources to enable reusers to ensure they are respecting the law in their specific jurisdiction". It's one thing to be sure that Wikidata is not doing anything illegal through what it publishes; it's another to give its users all the information (or even some helping tools) to make sure they can legally use the data they are interested in, in their own jurisdiction. --Psychoslave (talk) 15:23, 8 May 2018 (UTC)

I need to respond to a claim earlier in the thread by Robevans123: it's backwards to say that CC0 data can't be relicensed as CC-By-SA. Yes, it's a "one way street", and this is the correct way to go down it! You can use CC0 content for any purpose, including to add a CC-By-SA license to it; the problem is that a large amount of content of Wikidata is not legitimately CC0. If Wikidata doesn't plan to delete all content imported from Wikipedia, Wiktionary, and so on, it needs to relicense as CC-By-SA. Rspeer (talk) 18:10, 11 May 2018 (UTC)

@Rspeer: More accurately: if Wikidata doesn't plan to delete all copyrightable content imported from Wikipedia, Wiktionary, and so on (beyond the amount allowable as quotation (EU) or fair use (USA)), it needs to relicense as CC-By-SA.

But it's not obvious, or at least not yet made out, that Wikidata includes any such content. Jheald (talk) 18:15, 11 May 2018 (UTC)

The claim that Wikipedia infoboxes are uncopyrightable is absurd. At best you're looking for jurisdiction-specific loopholes on databases that allow you to disregard the copyright and copy the data anyway. But there is no universal loophole, and it is not a good idea for Wikidata to go looking for one. On this Phabricator bug report, Aschmidt observes specifically that Wikidata violates the copyright law of Germany. Rspeer (talk) 19:32, 11 May 2018 (UTC)

Furthermore, Wikidata's own claim that its data is available under CC0 is an assertion that the data has a copyright. If the content of Wikidata were uncopyrightable, there would be nothing to put the CC0 license on. Rspeer (talk) 19:36, 11 May 2018 (UTC)

@Rspeer: All I see from Aschmidt is a blank assertion with no argumentation or specifics to back it up. It's not exactly a reasoned legal opinion. I fail to see what it is that you see of value in his statement.

Understand that copyright protects expression, not ideas. It does not protect effort, it protects creativity. There is creative expression in Wikipedia articles, even to an extent in Wikipedia infoboxes. But overwhelmingly, that is not what Wikidata is taking. Instead, the vast majority of what Wikidata is importing from Wikipedias is uncontroversial fact, involving no creative choices from the Wikipedia editors, and therefore no copyrightable expression. There may be some limited examples where that is not the case, but equally some limited taking is allowed, especially with citation, under the exceptions for quotation. Fundamentally, Wikipedia is supposed to reflect reliable sources and contain no original research. Content which is factual and does not contain original expression is not copyrightable.

If you believe that there are specific areas of content that do reflect editor choices and original expression, those are what it would be most useful for you to present, for the purposes of this discussion. Jheald (talk) 21:52, 11 May 2018 (UTC)

@Jheald: Here's one example: translations. Each Wikidata item is translated into a number of languages, using information collected from Wikipedia and Wiktionary. Translations are not mere facts, because languages aren't just one-to-one ciphers of each other. Translations reflect a choice of how to represent the overlapping concepts of different languages. Is your view that translation dictionaries are uncopyrightable? Authors of translation dictionaries would disagree strongly. Rspeer (talk) 22:21, 11 May 2018 (UTC)

@Rspeer: If you look at the recent discussions on the licensing of the namespace for lexicographical data on Wikidata, you will find that I fully agree that much/most of the content on Wiktionary is copyrightable, and I argued that I thought it was a mistake to require the lexicographic data to be CC0 (or at least, not without the active support of Wiktionary), because that would essentially make it impossible for the project to support Wiktionary with its own data. My views did not prevail.

But as regards Wikipedia, I don't see it. Wikipedias are simply writing an article about X. Infoboxes are not making a statement that the translation of A into language Y is B. When somebody in the past added an interwiki link between the two, to say that the articles covered roughly matching concepts, is it that interwiki link you think is being copyright-infringed? Jheald (talk) 08:41, 12 May 2018 (UTC)

Also, your claim about creativity is not universally true. The EU Database Directive does not require creativity. Creative Commons licenses assert w:Sui generis database rights, not just US copyright, and the reasoned legal opinion you're looking for about why the US's loophole doesn't exist in the EU has been linked several times in this thread: it's at meta:Wikilegal/Database Rights. They say "Whenever possible, the best course is to use only content that is made available by the author under an open license" -- and they clearly meant that you should follow the license. Rspeer (talk) 23:42, 11 May 2018 (UTC)

Above I was specifically writing about copyright. Since the ECJ's Infopaq decision, it is generally accepted that copyrights in the EU do require creativity. The EU database right is not copyright, it's something else (a "sui generis" right: sui generis = "of its own kind"). In relation to database rights, the CC-SA 4.0 licence disclaims them. The CC-SA 3.0 licence, as I understand it, doesn't clearly address the point one way or another. In relation to database rights and Wikipedia, one question is: who would that database right belong to? While individual contributions are protected by copyright, a contribution is not a database, especially when it is merely an addition to structures already established. WMF has always disclaimed ownership of contributions, saying that they belong to the contributors, so it is far from clear that it would somehow have acquired the database right. Furthermore, WMF's policy position in Brussels has been to support calls for the database right to be abolished (as on balance working against a society where everyone can share equally in the sum of all knowledge etc), so it is probably more likely that WMF would take a policy view to disclaim any database right, even if such a right were vested in WMF. (Continuance of the EU database right is a subject of occasional controversy, because there is no evidence of any kind that it has added to economic activity, which was its original stated objective.) One other possibility is that a database right in Wikipedia exists, and is jointly owned by the contributors; so that the contributors acting by joint decision could decide to assert it. I have no idea what legal requirements would need to be met for such a decision to be considered to rest on a sound basis. 
Nor would I like to predict which way such a hypothetical decision would go, particularly if there was a strong campaign that an environment without database rights is better conducive to the aims of free culture (such as appears to have convinced Creative Commons in the preparatory process for the CC-SA 4.0 licence). Arguably, the decisions of the various Wikipedias to support the transfer of the structures for interwiki linking to Wikidata already represent a community agreement to grant a community release of information to Wikidata, at least as regards the data in interwiki links. Jheald (talk) 11:24, 12 May 2018 (UTC)

In my opinion, I have seen a few red herrings here: this discussion is not about which license Wikidata should have - we had these discussions previously - but whether, whatever license Wikidata has, how do we deal with datasets that have incompatible licenses. So no matter whether Wikidata is CC-0 or CC-BY-SA, we would still not allow the import of, say, CC-BY-NC data, and we still wouldn't allow the import of proprietary data, etc. I was really hoping to get rather unanimous agreement on that question - and therefore support for my proposals.

We should respect the license of datasets to be imported; we need to state and record the license of datasets to be imported explicitly; and we should not allow the upload of datasets with incompatible licenses.

Do I really need to make a call for votes on these basic assumptions? Or can I assume that we all agree on these basic points? --Denny (talk) 22:50, 13 May 2018 (UTC)

@Denny: we need to state and record the license of datasets to be imported explicitly - how do you propose we do this? That is - I'm assuming the idea is to have an item for the dataset (and there are new pages that John Cummings has been working on for this etc) - but I don't know that we have a way right now to indicate that a particular statement in Wikidata - or an entire item - originated with a certain dataset. Other than for the Wikipedias, for which we use imported from Wikimedia project (P143). I think what you are actually suggesting is the need for a new property (or new use of P143) to link every statement to the dataset from which it came. As you note, stated in (P248) in a reference confirms that this statement agrees with what that dataset says, but it does not specifically indicate the data was imported from that dataset (though I think in the majority of cases right now that is the case anyway). Having a property specifically to indicate import from a particular dataset would then allow monitoring and handling of the policy issues related to copyright and database rights. If we only import a few statements from an incompatibly-licensed database, that doesn't seem to be a problem under any of these rules, correct? With a property linked to the dataset we can count and perhaps as a community set what we think are reasonable rules on this. Right now I think we just don't know what the impact would be of what you propose relative to what is happening now. ArthurPSmith (talk) 13:55, 14 May 2018 (UTC)
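
To make the counting idea concrete, here is a small sketch, assuming a dedicated "imported from dataset" marker exists on references. P143 stands in for that marker purely as an assumption. The statement layout, the dataset names and the helper `count_imports` are all hypothetical, and no real data is used.

```python
from collections import Counter

# Hypothetical, simplified statements. Each reference maps a property ID to a
# source. "DATASET_A" and "DATASET_B" are placeholder names, not real items.
statements = [
    {"id": "s1", "references": [{"P143": "DATASET_A"}]},
    {"id": "s2", "references": [{"P248": "DATASET_B"}]},  # only "stated in"
    {"id": "s3", "references": [{"P143": "DATASET_A"}, {"P143": "DATASET_A"}]},
    {"id": "s4", "references": [{"P143": "DATASET_B"}, {"P248": "DATASET_A"}]},
    {"id": "s5"},  # no references at all
]


def count_imports(statements, import_property="P143"):
    """Count statements per dataset, using only the import marker property."""
    counts = Counter()
    for statement in statements:
        # A set ensures each statement counts at most once per dataset.
        sources = {
            ref[import_property]
            for ref in statement.get("references", [])
            if import_property in ref
        }
        counts.update(sources)
    return counts


print(count_imports(statements))
```

References that only use stated in (P248) are deliberately ignored here, since they confirm agreement with a source rather than recording where the data was imported from.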

What I meant - and there are several possible ways to do this - is to merely record the license of a dataset on John's page. That's all. This should help us to weed out datasets that are incompatible while discussing them. Sure, we could also represent this data in Wikidata with items and properties, but I really meant literally just an additional field on the dataset import request page (if it is not there yet, I haven't checked). That's what I meant with 'record the license of the datasets to be imported explicitly'. Make sense? --Denny (talk) 14:58, 14 May 2018 (UTC)
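
A minimal sketch of what such a license field and check could look like, in the spirit of Proposal 1. Everything here is hypothetical: the class, the field names, the review outcomes and the short list of licenses treated as compatible are assumptions for illustration, not an existing Wikidata process.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for this sketch: only these are treated as compatible with CC0.
COMPATIBLE_LICENSES = {"CC0", "Public domain"}


@dataclass
class DatasetImportRequest:
    """Hypothetical import request carrying an explicit license field."""
    name: str
    license: Optional[str] = None
    license_not_needed_reason: Optional[str] = None


def review(request):
    """Return a review outcome based only on the recorded license."""
    if request.license is None:
        if request.license_not_needed_reason:
            # An explanation was given, so people still have to assess it.
            return "needs discussion"
        return "rejected: license not recorded"
    if request.license in COMPATIBLE_LICENSES:
        return "license compatible"
    return "rejected: incompatible license"


print(review(DatasetImportRequest("Example dataset", license="CC0")))
print(review(DatasetImportRequest("Other dataset", license="CC BY-NC")))
print(review(DatasetImportRequest("Unlabelled dataset")))
```

The point of the sketch is only that a request without a recorded license, or without an explanation of why none is needed, cannot pass unnoticed.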

OpenStreetMap

User:Mateusz Konieczny has just added a section to Wikidata:OpenStreetMap, saying "Copying data from OSM to Wikidata is not allowed". In the light of the above, ongoing, discussion, should such a bold statement be made? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:00, 8 May 2018 (UTC)

What do you mean? What in the ongoing discussion inclines you to conclude that this is a bold statement? OSM is licensed under ODbL, which is clearly incompatible with CC-0. If data from OSM are imported into Wikidata, then Wikidata must be published under ODbL, according to OSM legal fact. --Psychoslave (talk) 21:36, 8 May 2018 (UTC)

Well: licenses are not compatible so this is true. It's a bit harsh and maybe it should be rephrased but in fine it's ok (and the same has been written on the Wiki OSM page for ages). Plus, we already have a lot of data similar to OSM from other sources, so why would you want to copy OSM? I use OSM a lot, but only to check that data are consistent on both sides; I check in the sources and correct on both sides if there are discrepancies; no need to import nor even copy anything. Cdlt, VIGNERON (talk) 07:59, 9 May 2018 (UTC)

I think that if contributors are forced to duplicate their efforts, then it's an example of a case where Wikidata fails to meet some of its main fundamental stated goals: avoiding duplication of efforts, and effectively enabling anyone to reuse its data. When a project fails to meet its fundamental goals, there is generally something that must be fixed. --Psychoslave (talk) 08:12, 11 May 2018 (UTC)

@Psychoslave: what are you talking about? Sorry to be blunt but: do you even know Wikidata? First, no-one is « forced to duplicate their efforts » and, more importantly, « avoiding duplication » is not one of the « main fundamental stated goals » (where did you read that?); per se and de facto Wikidata has to be made only of « duplication » of data from somewhere else (we can't and don't invent things, we have to rely on external sources). Yes, failure calls for a fix, but there has to really be a failure first. The only failure I see is that you fail to understand some core principles of Wikidata (you could read Wikidata:Introduction/fr for a first introduction). Cdlt, VIGNERON (talk) 09:05, 11 May 2018 (UTC)

For example in Wikidata: the next step for Wikipedia and more, authored by Lydia Pintscher, it is stated Wikidata's [phase one] goal is to reduce [language links] duplication [in Wikipedia]. Thus the interpretation that reducing duplication has been a highlighted advantage of having a central relational database, right from the start of Wikidata. And subsequently, with further inferences, that next phases like infoboxes are coming with a similar goal of factorizing data where it can be factorized, so we avoid duplication of effort and improve fluidity of information dissemination. This interpretation can of course be supposed to be a misinterpretation. So isn't Wikidata aiming at 1. reducing duplication of factorable information, 2. reducing duplication of continuous effort to input, curate and disseminate underlying data, 3. maximizing ease of reuse in downstream projects? It would surely be useful to have a project with this kind of aims if the Wikimedia movement is to "become the essential infrastructure of the ecosystem of free knowledge" by 2030. Using concepts, properties and entities as a way to represent information in a comprehensive manner (aka representation in intention) through basic predicates that can be retrieved and combined through some query service(s) should be helpful for these aims. If Wikidata has nothing to do with these aims and/or means, here is a good opportunity to state it clearly. If Wikidata does embrace these aims, then the ongoing discussion is reporting clearly identified cases where it fails to meet them. If Wikidata is not compatible with these aims, then maybe we should launch a Wikimedia project that embraces them. --Psychoslave (talk) 02:17, 12 May 2018 (UTC)

@Psychoslave: so you have one five year old blog post, stating that one goal of a finished phase of Wikidata was to reduce some duplication limited to the links (which has been 99% done) and you conclude now that « main fundamental stated goals: avoiding duplication of efforts ». This is far beyond misinterpretation, please read yourself carefully before writing. I won't even comment on the last part, this is even more illogical and has nothing to do with the point at hand here. Cdlt, VIGNERON (talk) 06:42, 12 May 2018 (UTC)

The example is more there to illustrate that it was something highlighted right from the start. The highlighting of advantages procured by more coordination of efforts through a centralized relational database has not been difficult to find ever since. If anyone is interested, more recent examples could be provided. But if no one cares, then there is surely no point in doing this work of providing additional links: this was just some hints on the inferences that led to such a personal point of view, not an attempt to defend or promote it. On the other hand, I would be very interested in a clear confirmation of whether 1, 2 and 3 are aims of Wikidata or not. @Lydia_Pintscher_(WMDE):, would you be kind enough to provide some feedback on those points? --Psychoslave (talk) 16:31, 12 May 2018 (UTC)

Reducing duplication and the wasted effort/lost efficiency that comes with it is a goal of Wikidata of course. Though I'm not sure how it moves us forward in this particular discussion ;-) --Lydia Pintscher (WMDE) (talk) 17:08, 14 May 2018 (UTC)

You assert that what is currently being debated is true. It is clear that there is not agreement on this. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:50, 10 May 2018 (UTC)

So far you provided no argument beyond stating "basic facts that are not copyrightable". It is true, but you jump directly to "and therefore OSM license may be ignored and OSM data is importable to Wikidata". Note that just because basic facts are not copyrightable it does not mean that something that consists (among other things) of basic facts is also not copyrightable. Mateusz Konieczny (talk) 15:28, 10 May 2018 (UTC)

I made no such jump. Please do not put words in quote marks and attribute them to me, when I never said them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:21, 10 May 2018 (UTC)

Sorry, it was not my intention to suggest that "and therefore..." was a quote. Putting it in quote marks was misleading and I am sorry for that. But I am not sure what your position is if you disagree with that description - you claim that OSM data can be imported at large scale, which would require ignoring the OSM license (which may be legal in the USA, but it would still be ignoring the OSM license). Mateusz Konieczny (talk) 08:04, 11 May 2018 (UTC)

"you claim that OSM data can be imported at large scale" Where, please, do I claim that? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:13, 11 May 2018 (UTC)

Wikidata:Bot requests - by creating "Fetch coordinates from OSM" section ("We have many items with an OSM relation ID (P402), and no coordinates:") and continuing to insist that this report can be done for example by making this edit Mateusz Konieczny (talk) 08:22, 11 May 2018 (UTC)

Nowhere in that request do I make the claim that you suggest I make. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:05, 11 May 2018 (UTC)

Why do you think that "Fetch coordinates from OSM" for "many items with an OSM relation ID (P402), and no coordinates:" is not a large scale import of OSM data? Mateusz Konieczny (talk) 09:45, 11 May 2018 (UTC)

Do you have any evidence at all that ODBL can be completely ignored in the USA (Wikidata jurisdiction)? I am not a lawyer but "as long as you credit OpenStreetMap and its contributors" and "If you alter or build upon our data, you may distribute the result only under the same licence" from https://www.openstreetmap.org/copyright seems kind of incompatible with CC0. Mateusz Konieczny (talk) 08:44, 10 May 2018 (UTC)

Where have I suggested that ODBL "can be completely ignored in the USA"? Please avoid straw-man arguments. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:50, 10 May 2018 (UTC)

Either the ODBL license of OSM data can be ignored in the USA (maybe via "we will import everything in parts and claim that each part is a fact not protected by copyright or similar restriction") or importing OSM data into Wikidata would break copyright and copyright-like restrictions. Mateusz Konieczny (talk) 14:06, 10 May 2018 (UTC)

False dichotomy. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:20, 10 May 2018 (UTC)

Why? How do you propose to import into CC0 database data on license that requires attribution and share-alike without ignoring that license? Mateusz Konieczny (talk) 08:06, 11 May 2018 (UTC)

I do not propose to "import into CC0 database data on license that requires attribution and share-alike". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:13, 11 May 2018 (UTC)

OSM data requires attribution and share-alike in ODBL license Mateusz Konieczny (talk) 08:24, 11 May 2018 (UTC)

@Pigsonthewing: can you check whether there is any evidence that importing from OSM to Wikidata is allowed? I encountered some mentions of automated imports of OSM data (under ODBL) into Wikidata and I plan on tracking down these copyright violations and requesting cleanup. In case the OSM license can be ignored, it would be a waste of time. Mateusz Konieczny (talk) 08:46, 10 May 2018 (UTC)

Wrong question. Do you have any evidence that it is prohibited by Wikidata/ the WMF? By law? The OSM community may assert that it is not allowed, but they are - it is argued above - not in a position to apply such rules to basic facts that are not copyrightable. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:50, 10 May 2018 (UTC)

As far as I know, the burden of proof is on people wishing to import something (at least it works this way on Wikimedia Commons, enwiki, plwiki) - maybe Wikidata has a different approach to copyright and everything is accepted as out of copyright and related restrictions until proved otherwise? Mateusz Konieczny (talk) 14:10, 10 May 2018 (UTC)

You asserted "Copying data from OSM to Wikidata is not allowed" - I'm asking you to substantiate that assertion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:22, 10 May 2018 (UTC)

See https://www.openstreetmap.org/copyright (note: IANAL, as I mentioned there may be a way to argue that ODBL does not apply in USA) Mateusz Konieczny (talk) 15:18, 10 May 2018 (UTC)

I've already seen it. It does not substantiate your assertion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:20, 10 May 2018 (UTC)

See this conclusion of the Wikimedia Foundation’s preliminary perspective on this legal issue: In the absence of a license, copying all or a substantial part of a protected database should be avoided. That is, for the case of Wikidata, the absence of a free license compatible with CC-0 should lead us not to import data from a data bank. --Psychoslave (talk) 19:54, 10 May 2018 (UTC)

The heading of that page states that it is a "preliminary perspective... not final", is "Not Legal Advice!" (emboldening in original) and that "This page may not be accurate, and may fall out of date over time". If that is your basis for claiming that "Copying data from OSM to Wikidata is not allowed", it would seem to be a very weak one. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:11, 11 May 2018 (UTC)

Do you have any better source with analysis of legal situation for that topic? Because while it has multiple disclaimers I still would rank it above amateur interpretations like mine or yours Mateusz Konieczny (talk) 09:48, 11 May 2018 (UTC)

Actually the link that was pointing to this page was already highlighting this fact, plus Lydia Pintscher also pointed to this very same document indicating that she was relying on what it says. So what is the point of this answer? --Psychoslave (talk) 16:37, 12 May 2018 (UTC)

Thank you. The statement "Copying data from OSM to Wikidata is not allowed" does not limit itself to "all or a substantial part" of the OSM database. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 04:55, 11 May 2018 (UTC)

Is "substantial part" defined anywhere? I would expect that any intentional imports cross that barrier (while individual people using OSM maps to fact-check some basic info would not), but legal terms may have surprising definitions. Mateusz Konieczny (talk) 08:10, 11 May 2018 (UTC)

I just noticed "For EU databases, bots or other automated ways of extracting data should also be avoided because of the Directive’s prohibition on “repeated and systematic extraction” of even insubstantial amounts of data." in https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights#Conclusion Do you have any more recent documents overturning this recommendation or different opinion from some other lawyers? Mateusz Konieczny (talk) 08:13, 11 May 2018 (UTC)

See above for my comments on the disclaimers on that page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:21, 12 May 2018 (UTC)

Do you have any better source with analysis of legal situation for that topic? Because while it has multiple disclaimers I still would rank it above amateur interpretations like mine or yours? Mateusz Konieczny (talk) 09:23, 12 May 2018 (UTC)

Thank you, I added it to the https://www.wikidata.org/wiki/Wikidata:OpenStreetMap#Importing_data_from_OSM Mateusz Konieczny (talk) 08:10, 11 May 2018 (UTC)

"by Wikidata" AFAIK Wikidata has no page documenting restrictions on what may be imported from a copyright side Mateusz Konieczny (talk) 14:10, 10 May 2018 (UTC)

So not prohibited by Wikidata, then. And my other question? Who does make the prohibition which you claim exists? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:22, 10 May 2018 (UTC)

"So not prohibited by Wikidata, then" - rather not documented. At least I assume that at least some data is not importable due to copyright concerns (not documenting it is not changing anything) Mateusz Konieczny (talk) 15:24, 10 May 2018 (UTC)

"Some data" != "all data". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:20, 10 May 2018 (UTC)

So you agree that just because Wikidata failed to document what may be imported does not change that some databases may not be imported? Mateusz Konieczny (talk) 08:07, 11 May 2018 (UTC)

Once again you attempt to put words into my mouth. Please desist. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:17, 12 May 2018 (UTC)

I also marked Wikidata:Bot_requests#Fetch_coordinates_from_OSM as resolved as such import would require changing OSM license Mateusz Konieczny (talk) 09:38, 10 May 2018 (UTC)

Again, this is subject to ongoing debate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:50, 10 May 2018 (UTC)

And a similar section has just been added at Wikidata:OpenStreetMap#Importing data from OSM. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:22, 10 May 2018 (UTC)

It is about Wikidata:OpenStreetMap#Importing_data_from_Wikidata_into_OSM, right? In that case it should be rather discussed on OSM (it is not something decidable by Wikidata community) Mateusz Konieczny (talk) 15:21, 10 May 2018 (UTC)

No it is not. What you wrote there begins "Copying data from OSM to Wikidata is not allowed ...". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:17, 12 May 2018 (UTC)

Then why did you start the same discussion twice? The OSM->Wikidata section is already discussed above: "User:Mateusz Konieczny has just added a section to Wikidata:OpenStreetMap, saying "Copying data from OSM to Wikidata is not allowed"" Mateusz Konieczny (talk) 11:06, 19 May 2018 (UTC)

@Pigsonthewing, Mateusz Konieczny: can we stop ping-ponging and zigzagging and come up with a sentence that everyone can agree on? Maybe something like "Copying copyrighted data from OSM to Wikidata is not allowed" or "Massively copying data from OSM to Wikidata is not allowed"? Or maybe something completely different like "Data on OSM are under a licence incompatible with Wikidata"? Andy, since you are the one who has a problem with the current wording, could you pitch in please. Cdlt, VIGNERON (talk) 08:58, 11 May 2018 (UTC)

Given what was reported above, I would suggest "Data on OSM are under a licence incompatible with Wikidata, transferring data from OSM to Wikidata is not allowed. As a general rule, transferring data from Wikidata to OSM is also not allowed, see their own policy statement on the subject." --Psychoslave (talk) 04:51, 12 May 2018 (UTC)

Depending on the data concerned, that is like saying "Images may not be used in Wikimedia projects, because Bridgeman asserts copyright over them". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:17, 12 May 2018 (UTC)

In case of OSM data we have no evidence (unlike Corel case) that ODBL claims are invalid, and the best source ( https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights#Conclusion ) of instructions for us instructs us to not ignore restrictions of use. And yes, https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights#Conclusion has disclaimer spam (like a typical lawyer advice posted in public) but still has greater weight than amateur speculation Mateusz Konieczny (talk) 09:21, 12 May 2018 (UTC)

@VIGNERON: I would be happy to improve that section, thanks to this discussion I already made some changes. But currently I have no further ideas how it may be improved. And yes, it is sad that due to license incompatibilities neither project may import data from the other - but it is preferable to document it rather than use the "handle copyright (and copyright-like restrictions) by pretending that they do not exist" method (I admit that at times I am tempted to do this but it would merely add work for reverting that and may cause loss of some related data added later) 09:28, 12 May 2018 (UTC)

I am rather stunned by this discussion. First, we should not make any statement about what OSM may or may not import from Wikidata besides that they may import as they like since Wikidata is under CC0. If OSM decides to adopt a different policy, that is their right and that's fine, but we should very clearly state that we consider the data in Wikidata under CC0.

Second, OSM is released under the ODBL. This makes an import from OSM to Wikidata incompatible. No such imports should be permitted or done, whether through bots or individual editors. Using OSM as a reference for individual statements is a different matter, but the data itself must not be imported or taken from OSM.

In my opinion, Wikidata should not be in the business of testing the legal boundaries of the license of fellow open content projects. If that's indeed the business we want to be in, we should do so together with the Wikimedia Foundation and their legal team. The same is true for data published by government organisations. Personally, I have doubts that many datasets published by governments may be published under anything else but the public domain (especially in light of the last few decisions by European courts regarding database rights), but unless we actively aim to prove that - which, as said, we should be doing together with the WMF and their legal team - we should respect their licenses and act accordingly.

Do we really need to have a vote on whether we are planning to respect the license of other datasets, in particular fellow open content projects? --Denny (talk) 23:03, 13 May 2018 (UTC)

Wikipedia and other Wikimedia projects

This is certainly a point that is worth a subsection, at least as much as OSM. So the main topic of this subsection is to determine if we should allow massive data transfer from sister projects which use licenses incompatible with CC-0. More specifically:

should we ban any import which obviously comes directly from a sister project covered by a license incompatible with CC-0 ?
if we allow it, do we set some limits to what can be imported?
1. do we consider the set of all infoboxes as a work with legit copyright, as they are the result of an intellectual selection, and accordingly, do we allow massive extraction and import from Wikipedia infoboxes?
  1. If not, what should we do with all data already imported from Wikipedia?
    1. Simply delete them?
    2. Transfer them to another Wikibase instance under CC-by-sa-3.0?
    3. Require proper license attributes on each Wikidata item which came from such an import?
2. do we consider extraction of category graphs from Wikipedia is allowable?
3. do we consider that massively extracting statements from the Wikipedia prose through Natural Language Processing (NLP) is allowable?

Of course this is just some suggestions to launch the conversation, feel free to make other propositions. --Psychoslave (talk) 06:09, 13 May 2018 (UTC)

For the last point on NLP, this is essentially the same idea as for the infoboxes, but there are people who seem to think that the prose and the table of data, or already semi-structured data like categories, are fundamentally different. Whatever we consider for factual data gathered in infoboxes (and categories) should equally apply to automated reduction of prose to a set of basic predicative statements.

Keep in mind that taking this path means that in the end anyone should be able to generate new prose under CC-0 or basically under any condition, since that is what CC-0 permits. So basically, anyone will be able to pretend they can generate new encyclopedic articles under CC-0. To my mind, it would sound like license laundering of Wikipedia work. To make things clear, it would not be a problem if Wikidata came to gather all these statements by any clearly legal means. Extracting predicative statements with NLP on PD works would be fine for example. This section really is about Wikimedia sister projects.

Also, beyond the legality concern, it would be interesting to know if there is any consensus among Wikipedia contributors regarding this derivative work which waives the license terms under which they consented to contribute. The legal aspect is of course important, but the feeling that Wikimedia matches its displayed social values to be welcoming hosts, caring neighbors, and equitable allies should not be neglected. --Psychoslave (talk) 06:24, 13 May 2018 (UTC)

Already answered several times but ok let's play ball again...

Obviously no (wrong predicates, license of a project is not the license of the content, nor of the data)
Obviously yes (nothing is limitless, and Wikidata only accept data so it filters a lot already)
1. no? NA? ("infoboxes" data protection can vary a lot; plus, almost all data from Wikipedia infoboxes that were compatible with Wikidata have already been imported for 5 years... and we didn't just import data, we did a lot of transformation and verification before and/or after import)
  1. NA
    1. NA
    2. NA
2. yes (already done for the most and compatible part)
3. probably yes, but probably not doable either so it's a rhetorical question.

Cdlt, VIGNERON (talk) 10:29, 13 May 2018 (UTC)

Other members of the Tremendous Wiktionary User Group might be interested to provide feedback on this point: @Noé, Benoît Prieur, Delarouvraie, Lyokoï, Jberkel, Thiemo Kreuz (WMDE):, @Daniel Kinzler (WMDE), Epantaleo, Ariel1024, Otourly, Shavtay, TaronjaSatsuma:, @Rodelar, Marcmiquel, Xenophôn, Jitrixis, Xabier Cañas, Nattes à chat: @LaMèreVeille, GastelEtzwane, Rich Farmbrough, Ernest-Mtl, tpt, M0tty: @Nemo_bis, Pamputt, Thibaut120094, JackPotte, Trizek (WMF), Sebleouf:, @Kimdime, S The Singer, Amqui, LA2, Satdeep Gill, Micru:, @Vive la Rosière, Tofeiku, Stalinjeet, Aryamanarora, TAKASUGI Shinji, Kvardek du: --Psychoslave Psychoslave (talk) 13:01, 13 May 2018 (UTC)

@Psychoslave:, I think I am a bit slow, and I am only now starting to understand your claim. Are you saying that extracting and republishing a statement under a different license - let us take a concrete example - such as the birth place of Marie Curie, from the Wikipedia article on Marie Curie, either by an algorithm using KE techniques or by a human who reads the article, is or should be disallowed due to either copyright or due to database rights? --Denny (talk) 23:21, 13 May 2018 (UTC)

Hi @Denny:. A single statement is not a problem, as far as I understand, as it is not eligible for either copyright[note 1] or database rights restrictions. So extracting a single statement is not a problem. Extracting a few statements is surely not a problem either, as exceptions such as fair use in copyright and non-substantial extraction under database rights are certainly invokable in such a use case. On the other hand, from what I currently understand, a collection of factual data resulting from a selection, such as Wikipedia, is granted a copyright (at least in the USA), and probably also triggers database rights where available. Republishing significant data sets from such a collection under terms of use incompatible with the source database does raise concerns of infringement of both copyright and database rights, whatever means is used to do so. Maybe I didn't insist enough on the importance of the fact that we republish them, we don't simply transfer them into a non-public database. So we no longer match the claim that WMF is republishing [contributors] copyrighted content under [contributors] license stated within Things you need to know in Wikipedia:Mirrors and forks. --Psychoslave (talk) 03:47, 14 May 2018 (UTC)

@Psychoslave: Let's talk about copyright and database right separately, in order to reduce potential confusion. We agree that copyright applies to expressions, not ideas? If we extract from Wikipedia the idea that Douglas Adams is a human, and store here in a wild JSON format that Q42 P31 Q5, I'm having trouble seeing how the copyright that pertains to the Wikipedia article on Douglas Adams could preclude us from making that statement here. Again, let's stick to copyright first - could you explain how you think that would work? --Denny (talk) 05:11, 14 May 2018 (UTC)

Yes, @Denny: we agree that copyright applies to expressions, not ideas.

As I understand it, you seem to be stuck on single statement extraction, which I agree would not be a problem. The issue is that USA legislation recognizes copyright on purely factual collections, provided they result from a selection which exceeds a low threshold of originality. Wikipedia as a whole, and Wikipedia infoboxes as a whole, certainly pass this original selection criterion. To stay with the example of birth place, only people who pass the notability criteria of Wikipedia will have such a statement exposed in Wikipedia. Other data banks including birth place data would not have the same set of statements because they would use a different original selection process.

My understanding is that if many single statements are each extracted from different sources and aggregated into a single database, that is not an issue, whatever the copyright status of each source. But if a significant part of a single copyrighted material is transferred, including a purely factual collection, then this induces copyright infringement unless the transfer is granted by some license that accompanies the collection. Of course that doesn't prevent obtaining the very same data collection, or a significant subset, or a superset, by means of an independent data gathering from many sources.

Now, of course it's not purely black and white, law is subject to interpretation. But as it is, Wikidata is swimming in sufficiently troubled waters to raise concerns among potential downstream users, leading them to enact policies not to use Wikidata without proper analysis of the upstream sources, and not to rely on the claim that Wikidata can be used as CC-0 material. --Psychoslave (talk) 08:17, 14 May 2018 (UTC)

So, let me see if I can rephrase your concern: you are citing Feist vs RTS as support for the idea that pure collections of data are copyrightable if they exceed a certain level of originality. It is a surprising choice of support, because that case actually established that pure collections of data such as telephone books are *not* copyrightable. But OK.

If I understand correctly, you agree that the single facts in Wikipedia are not copyrightable. OK, that's out of the way.

Now if I understand it correctly, you are relying on Wikipedia's selection of people and their birthplaces in order to establish that Wikipedia has a copyright on these.

Now, even if I follow this argument, and even if I were to hypothetically agree with you that Wikipedia's selection of people and their birthplaces are protected under copyright - which I do not, but as said, let's remain hypothetical - Wikidata is not actually copying that selection. First, there are many people in Wikipedia that have a birthplace stated in Wikipedia that have no birthplace stated in Wikidata. Second, there are many people in Wikidata that have a birthplace stated in Wikidata where there is no birthplace stated in Wikipedia. Third, there are many people in Wikidata that are not represented in Wikipedia at all.

It follows that the selection criteria applied by Wikipedia, even if they were the thing that is protected, are significantly different in Wikidata than they are on Wikipedia. What is extracted are single statements. And they are extracted from many different places. And they expand beyond Wikipedia and they do not fully cover Wikipedia.

This is why I think that the US copyright law regarding collections of factual data does not apply for the case of statements on Wikidata sourced to Wikipedia.

I am waiting for your response on this, or we can move to database rights, in case you acknowledge the argument. --Denny (talk) 14:55, 14 May 2018 (UTC)
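
The three observations above amount to a comparison of two sets. A minimal sketch, assuming entirely hypothetical toy data (the person labels and places below are invented for illustration and are not drawn from either project), shows how two collections can share individual facts while neither selection is a copy of the other:

```python
# Hypothetical toy data: person -> birthplace as stated in each project.
# None means the person has an entry but no birthplace is stated.
wikipedia = {"p1": "Warsaw", "p2": "Cambridge", "p3": None}
wikidata = {"p1": "Warsaw", "p2": None, "p3": "Riga", "p4": "Tallinn"}


def with_birthplace(collection):
    """Return the set of people for whom a birthplace is stated."""
    return {person for person, place in collection.items() if place is not None}


# First: birthplace stated in Wikipedia but not in Wikidata.
only_in_wikipedia = with_birthplace(wikipedia) - with_birthplace(wikidata)

# Second: birthplace stated in Wikidata but not in Wikipedia.
only_in_wikidata = with_birthplace(wikidata) - with_birthplace(wikipedia)

# Third: people in Wikidata that are not represented in Wikipedia at all.
not_in_wikipedia = set(wikidata) - set(wikipedia)

print(sorted(only_in_wikipedia), sorted(only_in_wikidata), sorted(not_in_wikipedia))
```

Whether differences of this kind matter legally is exactly what is being debated here; the sketch only makes the three set differences concrete.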

It is a surprising choice of support, because that case actually established that pure collections of data such as telephone books are *not* copyrightable. But OK.

Hi @Denny:. To my mind, it established far more than that. Yes, it established that some reuse of purely factual data collections doesn't raise copyright infringement. But if you look at the document I pointed to earlier, it also defined more clearly what criteria a reuse of a purely factual data collection must follow to claim no copyright infringement likely occurred. So how is it surprising to reference it here? The important thing is not to gather documents that support an irrevocably established point of view. The goal is to determine if Wikidata import practices match our values, like:

strive for excellence, including in the legal field,
being welcoming hosts, caring neighbors, and equitable allies.

When projects like OSM, which I consider a fully legitimate part of our allies, state that import of information from Wikidata into OSM is generally not permitted because Wikidata has lower standards in terms of copyright than OSM, to my mind it might mean that

they are completely wrong and we have to prove it with undeniable arguments, so we can help them out of their erroneous assumptions;
they are at least partially right and there are things we have to improve to behave in a more value-driven way;
some other option that didn't spontaneously come to mind.

And when it concerns a Wikimedia sister project, we should all the more not rely solely on "they can't legally stop us, we have the law on our side", should we find some tricky loophole. Even if it were proven not illegal, to my mind that doesn't make us caring neighbors and equitable allies.

--Psychoslave (talk) 08:59, 16 May 2018 (UTC)

I am happy to discuss these additional points - about us striving for excellence and being welcoming hosts, caring neighbours, and equitable allies - once the legality of our approach is established. If what we are doing is illegal there is not much point in discussing the other points. --Denny (talk) 13:57, 16 May 2018 (UTC)

First, there are many people in Wikipedia that have a birthplace stated in Wikidata that have no birthplace stated in Wikidata. Second, there are many people in Wikidata that have a birthplace stated in Wikidata where there is no birthplace stated in Wikipedia. Third, there are many people in Wikidata that are not represented in Wikipedia at all.

Seems like the first point contain an error, I guess you meant there are many people in Wikipedia that have a birthplace stated in Wikipedia that have no birthplace stated in Wikidata.

That is the point that is the more important for the current section I think, as far as the "original selection" criteria which trigger copyright on factual data collections goes. So the questions here are:

Why were they not all imported? Is it because the bots programmed to extract this set of data didn't catch them all, or is it because there was a deliberate choice for each of them based on original criteria?
All the more, if the latter holds, could we have a more quantitative idea of the data subset which was successfully extracted from Wikipedia but not imported into Wikidata?

--Psychoslave (talk) 08:59, 16 May 2018 (UTC)

Thanks for pointing out the error, I fixed it to make it easier to follow.

My assumption is that it is mostly due to bots not catching all birthdates. --Denny (talk) 13:57, 16 May 2018 (UTC)

"It follows that the selection criteria applied by Wikipedia, even if they were the thing that is protected, are significantly different in Wikidata than they are on Wikipedia. What is extracted are single statements. And they are extracted from many different places. And they expand beyond Wikipedia and they do not fully cover Wikipedia." Once again, it would be good to have actual metrics to support these claims. If it can be shown that Wikidata aggregates data from almost as many sources as it has statements, then, at least to my mind, there is no reason to have concerns about any copyright infringement. If Wikidata relies on a single source for large subsets of its whole, then each time this happens the corresponding imported subset might raise legal concerns if the source of this large dataset is covered by terms of use incompatible with CC-0. --Psychoslave (talk) 08:59, 16 May 2018 (UTC)

@Psychoslave:: Well, look, I am very happy you asked about metrics. In fact I was hoping you would.

Because you see, in my opinion, getting these metrics out of Wikipedia is not easy at all. Because unlike in Wikidata, the data in Wikipedia is not explicit at all - it is expressed in a mixed number of ways, often in natural language. Which makes the data in Wikipedia expressed very differently than the data in Wikidata.

I claim that this establishes a substantial and fundamental difference between these two projects in how they express facts - with the obvious implications for the claims of a copyright infringement.

If you think this is not the case, it should be easy for you - given you're a computer scientist - to actually provide these metrics you are asking for.

I haven't learned SPARQL so far, although I do plan to do that at some point. So right now I can't make the suggested query on Wikidata. But I'm not sure we are talking about the same metrics, as you are talking about metrics of Wikipedia. What do you have in mind? My request was about the variety of sources for Wikidata statements. --Psychoslave (talk) 14:41, 26 May 2018 (UTC)

Because in case it is not trivial, then the claim that Wikidata imported from Wikipedia as a database, seems not to hold. --Denny (talk) 13:57, 16 May 2018 (UTC)

Note that just because importing from source A into specific database is not trivial it does not mean that this source is not a database (especially from a legal viewpoint). Mateusz Konieczny (talk) 15:17, 17 May 2018 (UTC)

True. And Wikipedia certainly can be regarded as a database of articles. But given the legal definition of databases in the EU directive, as "a collection of independent works, data or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means", I don't see how one can argue that the places of birth are individually accessible in such means (unless you regard every natural text as a database). --Denny (talk) 15:28, 18 May 2018 (UTC)

For now I stick to the question of the USA copyright problem, as you suggested. Personally I regard everything that exists as natural, and every collection of statements as a database. Using the term data bank can help to make it clear that we don't take into account whether data are stored in a relational database or in plain prose. --Psychoslave (talk) 14:41, 26 May 2018 (UTC)

I am waiting for your response on this, or we can move to database rights, in case you acknowledge the argument.

I am not disagreeing with the global form of the argument, but until it's backed with solid metric evidence, I won't agree that it makes the legal uncertainties disappear, or more accurately makes them fall below a reasonable uncertainty threshold.

Moreover, the point is not to convince me personally. We should consider this topic completely resolved when we will be able to convince actors such as OSM community that they can use Wikidata data without any legal concern, or when Wikidata provides sufficiently integrated facilities to let such an actor filter data and keep only those which match its own legal standards.

--Psychoslave (talk) 08:59, 16 May 2018 (UTC)

"when we will be able to convince actors such as OSM community that they can use Wikidata data without any legal concern": to do this it would be necessary to provide real sources of data. "Imported from Wikipedia" is not sufficient; it would require distinguishing between "location copied from out-of-copyright map" and "location copied from Google Maps". I am not expecting this to be achievable (the alternative is to cease and revert imports from Wikipedia, which is even less likely to happen). Mateusz Konieczny (talk) 09:14, 16 May 2018 (UTC)

This suggestion seems in line with the idea of a data bank whose aims include reliability, which implies traceability of data as a requirement. Systematically giving a chain of import and the legal status of each transfer along each item would fulfill this requirement, wouldn't it @Mateusz Konieczny: ? --Psychoslave (talk) 03:49, 27 May 2018 (UTC)

Yes, it would solve the problem. @Psychoslave: Mateusz Konieczny (talk) 07:21, 27 May 2018 (UTC)

I do not agree with your exit criteria for this topic. The OSM community is a complex ally with many internal discussions and a very different mindset than the Wikidata community or the larger Wikimedia community. Just to name the most obvious difference: OSM explicitly asks for original research, Wikipedia explicitly forbids these. This obviously leads to a difference in the way sources for import are treated. That means that I don't consider it a goal to convince OSM that they can freely import coordinates from Wikidata - that would run counter to their basic principles. On the other hand if they, say, start importing names from Wikidata or other data, and re-release it under ODBL - well that would be quite a proof that they are rather comfortable with the claim that Wikidata is CC-0. Agreed? --Denny (talk) 13:57, 16 May 2018 (UTC)

"if they, say, start importing names from Wikidata or other data, and re-release it under ODBL" - it can also mean that whoever runs the import should be stopped and all added data reverted. I considered making an import of Wikidata/Wikipedia names and discovered legal issues - though at this point I have not encountered/reverted/reported any imports like that Mateusz Konieczny (talk) 15:14, 17 May 2018 (UTC)

@Mateusz Konieczny:: Thanks for raising potential legal issues in names of Wikidata and Wikipedia, these would be good to know of. Can you point us to a bit more background? --Denny (talk) 18:57, 17 May 2018 (UTC)

The clearest blocker for OSM usage is that coordinates in Wikipedia may be, for example, from Google maps (that is OK for Wikipedia). But Wikidata allows imports from Wikipedia and other similar sources and there is no real source given ("source=Wikipedia" is not enough to check whether coordinates are based on Google maps). As a result Wikidata is not usable for OSM imports - even adding wikidata links based on coordinates in Wikidata entries is suspect. I also suspect that other data in Wikidata is not really CC0, including names, but I did no real research here. Note that for example Wikidata:Bot_requests#Fetch_coordinates_from_OSM is still open and there is no page documenting what kind of data may not be imported due to legal issues, so I am certain that CC0 in footnote is a trap for people caring about copyright. Mateusz Konieczny (talk) 11:00, 19 May 2018 (UTC)

Actually @Mateusz Konieczny: if there are legal concerns of importing Google map data into OSM, I don't see how they don't raise any concern of transfer into Wikipedia. If Google consider they have a valid copyright on this data, then either they should grant the appropriate permission to the libre culture community to use them under a free license and only libre projects with a compatible license should use them, or no libre project should use them. Do we agree on that? --Psychoslave (talk) 03:49, 27 May 2018 (UTC)

It is a bit complicated but there are multiple things that result in different policies. For a start, OSM respects database rights limitations. In addition, in the case of OSM there would be the problem of setting a limit at which copying becomes wrong - cloning the entire Google Maps dataset is certainly wrong, so how much may be copied? Wikipedia avoids this problem as only a limited amount of geographical info can be added to an article; OSM solved it by a 100% ban on importing anything from Google Maps and purging with fire anything that was added by the unaware or malicious. Finally, Wikimedia and OSM have different approaches to copyright, with enwiki allowing fair use images and aggressive approaches like the monkey selfie case on one side and the "OSM is not a place to experiment with the grey areas of international copyright law" sentiment on the other side. On top of that OSM strongly prefers original work over imports. In addition - on Commons deleting an image that turned out to be copyrighted is trivial, in OSM reverting an import may require deleting any further contributions in a given area. So overall it is a combination of different products, different jurisdictions, different approaches. @Psychoslave: Mateusz Konieczny (talk) 07:37, 27 May 2018 (UTC)

"either they should grant the appropriate permission to the libre culture community to use them under a free license and only libre projects with a compatible license should use them, or no libre project should use them" - It would be nice for Google to publish under an open license what they bought for billions, but I would not expect this; this data will probably stay as "all rights reserved" (and this is a partial reason why OSM is a good idea). @Psychoslave: Mateusz Konieczny (talk) 07:37, 27 May 2018 (UTC)

@Psychoslave: It would be great if you could comment on T193728, the bug you have opened on Phabricator, in order to get this moving. Do the questions posed there look good to you? --Denny (talk) 17:48, 22 May 2018 (UTC)

Hi @Denny:, I gave some feedback, both on the ticket and below. Sorry for the delay. --Psychoslave (talk) 13:47, 26 May 2018 (UTC)

"The OSM community is a complex ally with many internal discussions and a very different mindset than the Wikidata community or the larger Wikimedia community." Surely, any community is a complex sociological topic. We can insist on the differences, or try to find ways to act better together based on our common values. Otherwise, in terms of mindset, there is probably a larger gap between a project using CC-0 and a project using a copyleft license than between two projects using copyleft licenses. Copyleft licenses are in the spirit of carrying the sustainability of freedom; non-copyleft licenses are in the spirit of carrying the broadest immediate freedom. So in other words, these are two ways to address a common concern of balancing social welfare and individual freedom. --Psychoslave (talk) 13:47, 26 May 2018 (UTC)

"Just to name the most obvious difference: OSM explicitly asks for original research, Wikipedia explicitly forbids these." And Commons and Wikiversity do encourage them too, if we want to talk about the larger Wikimedia community. All the more, if we take into account the strategic plan that by 2030 Wikimedia should become the essential infrastructure of the ecosystem of free knowledge, then we definitely have to come up with an infrastructure that allows integration of other free culture projects. We should aim at integrating as much as we can in a comprehensive, dynamic, operative and sustainable way. We should welcome and cherish our differences and take advantage of the diversity they foster. In this perspective, we should either come up with convincing arguments to assure our fellow free culture friends that they can use Wikidata data safely, or improve our infrastructure in any manner that will make our community more integrative, with facilities that at the very minimum create two-way bridges between our projects. --Psychoslave (talk) 13:47, 26 May 2018 (UTC)

"This obviously leads to a difference in the way sources for import are treated." If this means the way sources are integrated into Wikipedia, yes indeed, preventing original research does have an influence on what will be accepted. Wikipedia articles aim at not integrating original research, but they do so by creating original articles. Actually this makes it a very strong argument toward the originality of its gathered data as a work of selection. Copyright for an original collection of factual data is about originality resulting from intellectual selection, not about novelty of the resulting work. --Psychoslave (talk) 13:47, 26 May 2018 (UTC)

"That means that I don't consider it a goal to convince OSM that they can freely import coordinates from Wikidata - that would run counter to their basic principles. On the other hand if they, say, start importing names from Wikidata or other data, and re-release it under ODBL - well that would be quite a proof that they are rather comfortable with the claim that Wikidata is CC-0. Agreed? --Denny" To my mind, if OSM started integrating data from Wikidata, this would be a strong argument to advance when advocating Wikidata to other ODbL community projects. I would expect such a move to result itself from very strong arguments, which should themselves be used for this kind of advocacy. --Psychoslave (talk) 13:47, 26 May 2018 (UTC)

Serious doubts on this fetish called copyright

I blogged about how copyright for Wikidata is highly irrelevant and at best a defensive manoeuvre. There is one additional way in which copyright of a database may be irrelevant. It is when the rights holder gives express permission to use its data. Thanks, GerardM (talk) 20:31, 27 May 2018 (UTC)

"it is done by bot and consequently there is not even some "sweat on the brow"." - copying of images also can be done by easy to implement bot and it is not changing that images may be copyrighted or covered by copyright-like laws. I am not sure why you think that just because copying something is easy it means that it is unencumbered by copyright or copyright-like laws Mateusz Konieczny (talk) 19:25, 28 May 2018 (UTC)

Your comparison fails on one important aspect. With images, it is the image itself that is singular. When the exact same image is available in multiple sources, arguably you can make your pick from the licenses available to you. A data item, a fact is in itself not open to copyright, a copyright is for the totality of the database. However, the point that I am making is that when multiple sources agree on the same fact (=data item) it is not possible to say who of those sources is the original source. It may even be that the source is a book, a newspaper a Wikipedia but not the source that is being queried.

Consequently it is not because it is easy, it is because the fact in question has no copyright of its own. There is no single source that can be seen as authoritative; authority lies only in the agreement on the same fact among multiple sources. Thanks, GerardM (talk) 16:02, 29 May 2018 (UTC)

Doesn't that fail again to catch the problem exposed here, which is not the copyrightability of individual facts, but the use of significant subsets of copyrighted data banks created through original selection? It also seems to draw no distinction between fact, data and metadata, which along with data traceability should be a mandatory requirement for any project serious about reliability. Also, CC-0 doesn't make law disappear, so whether it's liked or not, people in a jurisdiction like France can't waive their patrimonial rights which include attribution, so basically in France CC-0 applies more or less like a CC-by. It doesn't mean it applies to individual facts, but collective works such as Wikipedia are a different matter. Psychoslave (talk) 06:30, 31 May 2018 (UTC)

Notes

↑ Maybe some works such as a Haiku might raise a grey zone here, but this is really not an important consideration for the topic of this discussion.

Problem with described by source (P1343)

described by source (P1343) can have reference URL (P854) as qualifier, but reference URL (P854) can't be used as qualifier. What to do?--Malore (talk) 00:15, 27 May 2018 (UTC)

Yup. That's a bit broken. P1343 seems to need a URL as a PQ, analogous to title, chapter, page(s), etc. Choices are reference URL (P854) or URL (P2699), and in truth the latter seems more sensible if it is a pointer to the resource, rather than a reference about the resource. So. Two options are, 1. change the P854 definition to broaden its scope, or 2. change the P1343 allowed qualifiers, substituting P2699 for P854; and then play in WQS and quickstatements to move claims from the one property to the other. I lean to option 2. --Tagishsimon (talk) 14:38, 27 May 2018 (UTC)

@Tagishsimon: I agree with you. I think the second option is the better one.--Malore (talk) 20:16, 27 May 2018 (UTC)

Why would you want to use reference URL (P854) as qualifier? Do you have an example? — Finn Årup Nielsen (fnielsen) (talk) 16:17, 28 May 2018 (UTC)

I don't want to use it as qualifier. I'm just saying that described by source (P1343) property page mentions reference URL (P854) as an allowed qualifier.--Malore (talk) 19:57, 28 May 2018 (UTC)

And iirc 50+k items use P854 as a PQ. The drift of the conversation so far has been towards use of URL (P2699) instead; see above. --Tagishsimon (talk) 20:25, 28 May 2018 (UTC)

@Tagishsimon: Currently there are about 9000 uses of P854 as qualifier, 8326 of which are qualifiers of described by source (P1343)

If you agree, I can substitute all these reference URL (P854) with URL (P2699)--Malore (talk) 22:16, 28 May 2018 (UTC)

I'm happy that be done, but I think notes should be left on the talk pages of both properties advising readers of the problem and proposed solution; and the constraints for P1343 amended to remove P854 and add P2699; all before porting data to 2699. --Tagishsimon (talk) 22:19, 28 May 2018 (UTC)

Ok, I added a topic about this on the P854 talk page, and I had added a similar topic on the P1343 talk page but no one replied. If no one opposes in the next 2-3 days, I'm going to edit the constraints for P1343 and port the data to P2699.--Malore (talk) 14:40, 29 May 2018 (UTC)

I use <reference url> as a qualifier for <stated in> and <described by source> when referencing Berg Encyclopedia of World Dress and Fashion (Q4891400), since every article has a unique URL. - PKM (talk) 20:55, 28 May 2018 (UTC)

Do you see any issue with using URL (P2699) instead, PKM? --Tagishsimon (talk) 22:21, 28 May 2018 (UTC)

No issue (especially if a bot can convert all the ones I have already done!) - PKM (talk) 20:15, 30 May 2018 (UTC)
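Before porting, the affected statements can be listed with a query against the Wikidata Query Service. The following is a minimal sketch in Python that only builds the SPARQL text; the function name and the `limit` parameter are hypothetical, and only the property IDs (P1343, P854, P2699) come from the discussion above. It assumes the standard WDQS prefixes `p:` (statement node) and `pq:` (qualifier).

```python
# Property IDs taken from the discussion; everything else is illustrative.
MAIN_PROPERTY = "P1343"   # described by source
OLD_QUALIFIER = "P854"    # reference URL (currently used as qualifier)
NEW_QUALIFIER = "P2699"   # URL (proposed replacement)


def qualifier_usage_query(main_property: str, qualifier: str, limit: int = 100) -> str:
    """Build a SPARQL query listing statements of `main_property`
    that carry `qualifier` as a qualifier, with the qualifier value."""
    return (
        "SELECT ?item ?statement ?value WHERE {\n"
        f"  ?item p:{main_property} ?statement .\n"
        f"  ?statement pq:{qualifier} ?value .\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Statements still to be ported (P854 used as a qualifier on P1343).
    print(qualifier_usage_query(MAIN_PROPERTY, OLD_QUALIFIER))
    # After the port, the same query with P2699 can be used to check the result.
    print(qualifier_usage_query(MAIN_PROPERTY, NEW_QUALIFIER, limit=10))
```

The printed text can be pasted into the query service; the actual move of values between qualifiers would still need a bot or a batch tool, which this sketch does not attempt.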

Cornish College of the Arts

On Commons, I see that commons:Category:William Volker Building now has a Wikidata Infobox that is somehow downloading the information for Cornish College of the Arts. That seems quite wrong to me. While the building is currently part (in fact, headquarters) of the college, it was not so for most of the history of either the building or the college. We have a separate category for the college, whose most famous building is certainly Kerry Hall on Seattle's Capitol Hill, not the William Volker Building. I think this happened because of this recent bot action on Wikidata, which I think is mistaken.

I'd like to revert that edit, but (1) how do we assure that the bot doesn't just make the same edit again and (2) it seems likely that the bot is making other mistaken edits combining an item for a building with items for an institution that happens to be located in that building. This would be even more wrong for an office building that might contain the headquarters of several independent companies. - Jmabel (talk) 17:14, 29 May 2018 (UTC)

It's because of the NRHP reference number (P649) entry on Cornish College of the Arts (Q1134192). If there is a wikidata item for the building, the NRHP entry should be moved to that item. ArthurPSmith (talk) 17:32, 29 May 2018 (UTC)

There is not currently a wikidata item for the building, so what do you recommend? - Jmabel (talk) 23:50, 29 May 2018 (UTC)

First, create an item for the building (done) and move the NRHP ref to it: William Volker Building (Q54449361). Then fix the poor interwiki mappings, presumably. --Tagishsimon (talk) 23:59, 29 May 2018 (UTC)

Similar & even stranger case where Commons category for a neighborhood (commons:Category:International District, Seattle, Washington) has ended up linked to the Wikidata item for a park at the edge of the neighborhood (Kobe Terrace) and gets the resulting Wikidata item, even though Chinatown-International District (Q3153248) exists. - Jmabel (talk) 23:56, 29 May 2018 (UTC)

Both fixed. Removing / amending the commonswiki link did the trick. --Tagishsimon (talk) 00:13, 30 May 2018 (UTC)

To clear up a misconception, above, the NRHP number was blameless, and the issue was with duff data in two wikidata records. A commons template, Wikidata_Infobox, is doing the heavy lifting. From the Commons template documentation, "The category page needs to be linked to through a sitelink on Wikidata (under 'other sites', add the correct sitelink to 'commons')", so where there is a wrong mapping, the fix is to remove the commons sitelink from the 'wrong' wikidata item and, ideally, add it to the 'right' item, even if that item needs to be created from scratch. Thereafter a purge of the commons category will probably be the push it needs, if push is needed, to fix the error. --Tagishsimon (talk) 00:28, 30 May 2018 (UTC)

Thanks! - Jmabel (talk) 01:58, 30 May 2018 (UTC)

It looks like the first one was because of the NRHP number being misplaced here (possibly imported from nlwp, [19]), and my bot added the commons sitelink on the basis of the NRHP ID matching; the second was because of a bad commons category link on enwp that got imported here (later fixed on enwp, but after the import to here [20]). Now that the underlying data is fixed (thanks Tagishsimon and ArthurPSmith), the bot won't repeat those edits. Thanks. Mike Peel (talk) 10:52, 30 May 2018 (UTC)

Possibly related matter

Tracked in Phabricator
Task T177698

It looks like Commons category (P373) property values have stopped linking to the relevant Commons category. This appears to be a recent breakage. - Jmabel (talk) 00:24, 30 May 2018 (UTC)

Example, please. And unrelated, given the findings above. Seems to be fine on Ebey's Landing National Historical Reserve (Q5331845), for instance. --Tagishsimon (talk) 00:31, 30 May 2018 (UTC)

commons:Category:Coupeville, Washington, using a Wikidata-based Infobox, shows this symptom, and is failing to get data from Coupeville (Q1513976), even though ⟨ Coupeville (Q1513976) ⟩ Commons category (P373) ⟨ Coupeville, Washington ⟩. I'm not certain it's "unrelated", since I did do a related fix on this item. - Jmabel (talk) 00:34, 30 May 2018 (UTC)

@Jmabel: Fixed now? --Tagishsimon (talk) 00:37, 30 May 2018 (UTC)

Hint: there are two ways to link from a wikidata item to the commons. The commons template uses the second of these methods i.e. not P373, but a sitelink [21]. --Tagishsimon (talk) 00:38, 30 May 2018 (UTC)

Fixed after a fashion, but the extra sitelink should not be needed in order for us to know what Commons category is associated with a Wikidata item. - Jmabel (talk) 00:41, 30 May 2018 (UTC)

Links to categories are a bit of a clusterfuck right now; P373 works in a different way from topic's main category (P910), and the use of P373 overlaps with the use of the commons sitelink; and yes, arguably the commons template should be using P373 and not the sitelink. So whereas all of that needs to be fixed, the fact remains that the current problems on the commons cats are all fixed, and we now know how to fix them. --Tagishsimon (talk) 00:46, 30 May 2018 (UTC)

So whenever I add P373 (which is often), should I also be adding that sitelink? And, if so, is there anywhere I should have learned that? - Jmabel (talk) 00:53, 30 May 2018 (UTC)

I would just add the sitelink. The P373 value is less important, and gets added by a bot. It's natural for the Commons template to use the sitelink, since as I understand it, when the category is opened on Commons, the sitelinked Wikidata item is automatically loaded, but that doesn't happen if it's just in P373. Ghouston (talk) 01:44, 30 May 2018 (UTC)

The Commons template can't use P373 -- there's no easy efficient way in Lua to do look-ups as to whether there is an item which has a particular string as the value for a property. (One way this information can be got is eg through a SPARQL query called from Javascript, which is what the wdcat.js script does, but that's not appropriate for an infobox). The infobox template therefore relies on the sitelink to know which is the relevant Wikidata item for a Commons category.

User:Mike Peel recently did a bot run (bot request here) that added a lot of sitelinks for items that previously only had a P373 to Commons. I think he's now re-running that every 24 hours, to pick up new P373s and add sitelinks for them. However there are some cases the bot won't do automatically and leaves to humans -- eg if there is more than one item with a P373 to the same Commons cat.

There may also be a bot that is adding a P373 where a sitelink exists, as Ghouston suggests, but I don't know for sure whether that is the case.

P373 is rather easier for a WDQS query to use; on the other hand the sitelink is easier for a Commons infobox, or for a Commons SQL query. At the moment both are needed. Jheald (talk) 10:01, 30 May 2018 (UTC)

It's done by User:DeltaBot, e.g., recently on Zoja Golubeva (Q2503750). I didn't know a bot was doing it the other way around too. The P373 value is also useful when the Commons sitelink is on a related category item. Ghouston (talk) 10:07, 30 May 2018 (UTC)

It's nearly always better to add the commons sitelink, as that's a one-to-one link between the category and the Wikidata entry, and those always want to then be copied to P373 by DeltaBot. The other way around is more difficult, and may be missed by Pi bot (particularly if another P373 value to the same category exists in a different Wikidata entry). JHeald's explanation is spot-on for why the infobox can't use P373. Thanks. Mike Peel (talk) 10:52, 30 May 2018 (UTC)
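To see which items are in the state described above (a P373 value but no Commons sitelink), a query along the following lines could be run on the query service. This is a hedged sketch: the Python helper names are hypothetical, and it assumes the usual WDQS modelling of sitelinks through `schema:about` and `schema:isPartOf`. The second helper shows the sitelink title one would expect for a given P373 string, assuming the sitelink should point to the category page.

```python
COMMONS_SITE = "https://commons.wikimedia.org/"  # site IRI as used in WDQS sitelinks


def missing_commons_sitelink_query(limit: int = 100) -> str:
    """SPARQL text: items with Commons category (P373) but no Commons sitelink."""
    return (
        "SELECT ?item ?category WHERE {\n"
        "  ?item wdt:P373 ?category .\n"
        "  FILTER NOT EXISTS {\n"
        "    ?link schema:about ?item ;\n"
        f"          schema:isPartOf <{COMMONS_SITE}> .\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


def expected_sitelink_title(p373_value: str) -> str:
    """P373 stores the bare category name; the sitelink carries the namespace."""
    return "Category:" + p373_value.strip()


if __name__ == "__main__":
    print(missing_commons_sitelink_query(limit=20))
    print(expected_sitelink_title("Coupeville, Washington"))
```

Cases where several items share the same P373 value are exactly the ones the bot leaves to humans, so the output of such a query is a work list rather than something to apply blindly.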

Commons sitelink and Commons category (P373)

Would it ever be actively wrong to add a Commons sitelink equal to a correct P373? If so, when?
Would it ever be actively wrong to add a P373 equal to a correct Commons sitelink? If so, when? - Jmabel (talk) 00:11, 31 May 2018 (UTC)

Software blocks fixing - Human item - Latvian name in label copied to several other language-specific labels

I tried to fix, but software blocks me:

Could not save due to an error. The save has failed.
As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes.

Fixed some on each of the following items, but not all:

– The preceding unsigned comment was added by 85.182.28.185 (talk • contribs) at 23:29, 30 May 2018‎ (UTC).

Mapping the history of Political parties and members of it

Are there any good guidelines on how to model a political party and its members in Wikidata? I can see we have

parties doing name changes
parties merging and splitting
people being part of a group that later was formalised into a party
.....

Do we

create objects for all name changes?
use "same as"?
How do we model merges?
How do we model that a person is a member of a party that
1. changes name
2. merges
3. splits
4. ....

Another aspect, shown by this WD query, is grouping by ideology (see tweet). - Salgo60 (talk) 07:56, 30 May 2018 (UTC)

Good questions, especially for adding any political data on Thailand - the Thai parties were loose groupings of political faction (Q1393724) around regional strongmen, who then chose the party which offered the best chances or even the best payment. And of course the parties merged and split, dissolved and were re-created with the same name, or changed name to hide an unpleasant past quite regularly there. Ahoerstemeier (talk) 09:50, 30 May 2018 (UTC)

The starting point would be the items that have been created for Wikipedia articles. Mergers etc., can be marked with significant event (P793). Ghouston (talk) 09:56, 30 May 2018 (UTC)

Interesting session at WD in Berlin last year "Well structured political data for the whole world: impossible utopia, or Wikidata at its best?"

Wikidata:WikiProject_every_politician

- Salgo60 (talk) 11:18, 30 May 2018 (UTC)

I think simple name changes can be reflected by adding start time (P580)/end time (P582) qualifiers to name (P2561) (or a more suitable equivalent, such as official name (P1448)) statements: that way anything that needs to display what the name of the entity was at a given time can use that field instead of the label. If the name change reflects something much more significant, such that it's actually a new entity being formed (whereby many of the other statements about the party would also change at this point), then it's probably best to create a new item with replaces (P1365)/replaced by (P1366) connections. Most mergers or splits seem like they'll fall into that category. In terms of reflecting someone's membership of the party, that should be a link to whatever item is most accurate at the time in question (most links will be on politicians holding office, which are usually expressed with parliamentary group (P4100) qualifiers on position held (P39) statements reflecting a very specific period). If this is a party that has changed name in the meantime, wikidata.org itself will display the current label, which might be a little anachronistic, but other users of the data can decide which is more relevant in their own context. --Oravrattas (talk) 09:47, 31 May 2018 (UTC)
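The name-change approach can be illustrated with a small sketch of how a data consumer might pick the name valid at a given time from dated name statements. This is plain Python over hand-written data, not a call to any Wikidata API; the class, the function and the party names are hypothetical, and the end date is treated as exclusive purely as an assumption for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NameStatement:
    """One official name (P1448) statement with optional time qualifiers."""
    name: str
    start: Optional[date] = None  # start time (P580); None means open-ended
    end: Optional[date] = None    # end time (P582); None means still current


def name_at(statements: List[NameStatement], when: date) -> Optional[str]:
    """Return the first name whose [start, end) interval contains `when`."""
    for s in statements:
        started = s.start is None or s.start <= when
        not_ended = s.end is None or when < s.end
        if started and not_ended:
            return s.name
    return None


if __name__ == "__main__":
    # Invented example data for a party that was renamed once.
    party_names = [
        NameStatement("Old Party Name", date(1990, 1, 1), date(2005, 6, 1)),
        NameStatement("New Party Name", date(2005, 6, 1), None),
    ]
    print(name_at(party_names, date(2000, 3, 15)))
    print(name_at(party_names, date(2010, 3, 15)))
```

A display layer could use the result of `name_at` instead of the item label when showing historical memberships; mergers and splits modelled as separate items linked by replaces (P1365)/replaced by (P1366) would not need this lookup.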

Gaps in ATC code (P267) assignments

During some checks I noticed significant gaps in the assignment of ATC codes to pharmaceutical substances. Some gaps were surprising, since the substance name matches exactly the substance name in the WHO ATC database. It should be possible to close these gaps by simply importing the ATC dataset. In other cases, the Drugbank had a complete list of the links for substances, but the information had not been added to the substance.

What could be done to close this gap? I would invest time into a solution, since I would be interested in performing a lookup by ATC code into Wikidata from an application. – The preceding unsigned comment was added by Ncarste (talk • contribs) at 21:16, 30 May 2018‎ (UTC).

@Ncarste: It looks like nobody ever uploaded the dataset to Mix n Match, that's probably a good first step at least. You could also discuss further (make sure any rights are checked etc.) at Wikidata:Dataset Imports. ArthurPSmith (talk) 14:39, 31 May 2018 (UTC)
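For the lookup side of the question, an application could ask the query service for the item carrying a given ATC code. The sketch below only builds the SPARQL text and the request URL and performs no network call; the helper names are hypothetical, the endpoint is the public query service, and "N02BE01" is used purely as an example input string.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def atc_lookup_query(atc_code: str) -> str:
    """SPARQL text finding items whose ATC code (P267) equals `atc_code`."""
    # Reject anything that is not plain letters/digits, so that the value
    # cannot break out of the quoted string literal in the query.
    if not atc_code.isalnum():
        raise ValueError(f"unexpected characters in ATC code: {atc_code!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?item wdt:P267 "{atc_code}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


def atc_lookup_url(atc_code: str) -> str:
    """URL an application could GET to run the lookup and receive JSON."""
    params = {"query": atc_lookup_query(atc_code), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(atc_lookup_query("N02BE01"))
    print(atc_lookup_url("N02BE01"))
```

Such a lookup only returns something where the code has already been assigned, so it is also a quick way to spot-check the gaps described above.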

Charity Navigator items: work in progress

Just a heads up that I'm working my way through Mix'n'Match catalogue 1057 for Charity Navigator ID (P4861), over the next week or two. I'm going to be creating a lot of items like Springfield Library and Museums Association (Q54556029) for US non-profit organisations.

At the end of that process, I will request a bot to move the misplaced URLs from the descriptions to official website (P856) and to add a more meaningful description. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:14, 31 May 2018 (UTC)

P571 vs P580

Hi everyone, I'm having some trouble deciding when to use inception (P571) and when to use start time (P580). I'm currently doing some bot imports on streets with the creation date and I'm using start time (P580) for that. However, some existing items use inception (P571). That caused me to do a little research, so I looked at the top 50 classes used by both properties (P571, P580), and frankly things seem to be a bit of a mess.

P571 has 5,655 streets, but P580 has 3,911 roads. P571 has almost 20,000 football teams, but P580 has around 1,600 cycling teams. P571 has 10,000 magazines, P580 has 2,000. And the list goes on and on. There doesn't seem to be a consensus on which property to use for what, save for paintings, which consistently seem to use P571. Given that these two properties seem to be used for basically the same things, I would suggest either merging the two or making the distinction clearer, so that WD users know which property to use instead of just guessing or using whatever the convention is for existing classes. Husky (talk) 13:38, 29 May 2018 (UTC)

For organizations I think we consistently use P571. Look at the "see also" list for both properties, we have rather a large collection of properties with very close meanings there... I don't think merging would hurt; however P580 is often used as a qualifier (to indicate when a statement started to be true) while P571 is not, so that's one distinction that might get lost... ArthurPSmith (talk) 14:34, 29 May 2018 (UTC)

I don't suppose it would hurt to make P580 qualifier-only then... Husky (talk) 08:12, 30 May 2018 (UTC)

I'd support either merging these, or making it much clearer what the distinction is. When working with political data I often have to write quite awkward queries to handle the lack of consistency in which gets used (e.g. on instances of legislative term (Q15238777)). Having P580/P582 as qualifier-only would be enough for all the cases I can think of. --Oravrattas (talk) 09:54, 31 May 2018 (UTC)

I support making P580 qualifier-only, and moving the other uses to P571. ArthurPSmith (talk) 19:30, 31 May 2018 (UTC)

I think the idea behind not making P571 qualifier only was that it is better suited for events (not recurring ones, but single events, or instances of recurring events) and event-like items than P580. Like the broadcast of a TV series or a war are hardly something that fit P580. Of course, we can always rethink how to model these. – Máté (talk) 21:41, 31 May 2018 (UTC)
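To illustrate the kind of awkward query the inconsistency forces, here is a hedged sketch of a query builder that accepts a start date from either inception (P571) or start time (P580) when used as a main statement. The `start_date_query` helper is hypothetical, and the query assumes the Wikidata Query Service's predefined `wd:` and `wdt:` prefixes.

```python
def start_date_query(class_qid: str) -> str:
    """Build a SPARQL query listing instances of a class with a start date
    taken from inception (P571) or, failing that, start time (P580)."""
    # Only accept identifiers of the form Q<digits>.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return f"""SELECT ?item ?start WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  OPTIONAL {{ ?item wdt:P571 ?inception . }}
  OPTIONAL {{ ?item wdt:P580 ?startTime . }}
  BIND(COALESCE(?inception, ?startTime) AS ?start)
  FILTER(BOUND(?start))
}}"""
```

For example, `start_date_query("Q15238777")` would cover legislative terms whichever property was used. If P580 became qualifier-only, the second `OPTIONAL` and the `COALESCE` could be dropped.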

Cremation

Do we have a way of denoting that a person was cremated? And is there a way to turn off the error message for Find a Grave for cremations? Find a Grave demands that a place of burial be given. --RAN (talk) 13:31, 31 May 2018 (UTC)

@Richard Arthur Norton (1958- ): We have date of burial or cremation (P4602) if that helps to note that a person has been cremated. You may wish to propose extending the scope of place of burial (P119) to align with that of the date property. Mahir256 (talk) 21:05, 31 May 2018 (UTC)

Google isn't using us (sigh of relief)

Just ran across this. My first thought was - oh no, somebody was messing around on wikidata. But no, it wasn't us this time! :) ArthurPSmith (talk) 20:41, 31 May 2018 (UTC)

Looks like it was pulling from enwiki though: see this edit from a week ago which survived most of the past week. Interesting it was added by an IP address and also fixed by one. What happened to enwiki patrolling? ArthurPSmith (talk) 20:48, 31 May 2018 (UTC)

(ec) It came from enwiki, which needed almost one week to fix it. As far as I know, Google streams Wikipedia’s recent changes directly into their search index; use of Wikidata by them is not as obvious. —MisterSynergy (talk) 20:50, 31 May 2018 (UTC)

Făt-Frumos and Prince Charming

Do we have a property that would properly express the relation between Făt-Frumos (Q560382) and prince charming (Q1090843)? The best I can think of are said to be the same as (P460) and different from (P1889), but I'm not sure either is spot-on. - Jmabel (talk) 00:08, 31 May 2018 (UTC)

I wish we had a property "analogous to" or "similar to". If we can find a source that makes the comparison to Prince Charming, we could use said to be the same as (P460) with a reference including the exact quote. - PKM (talk) 22:08, 4 June 2018 (UTC)

@PKM: Does en:Făt-Frumos count? This from the Centre for Romanian Studies? - Jmabel (talk) 00:21, 5 June 2018 (UTC)

I'd use the link to Center for Romanian Studies as reference. :-) - PKM (talk) 00:26, 5 June 2018 (UTC)

This section was archived on a request by: Jmabel (talk) 22:00, 11 June 2018 (UTC)

[1] Maybe some works such as a Haiku might raise a grey zone here, but this is really not an important consideration for the topic of this discussion.
