Shortcut: WD:PC

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Also see status updates to keep up-to-date on important things around Wikidata.
Requests for deletions can be made here.
Merging instructions can be found here.

IRC channel: #wikidata connect
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2016/08.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day.

density vs. population density

I've noticed that density (P2054) has been used for "population density" - e.g. Rotterdam (Q34370), Rome (Q220), Lisbon (Q597), etc. Moreover, even worse, it is being used without any unit, so even if one accepted the wrong idea that density (P2054) can mean that, these numbers are still useless as it's not clear what they actually measure. I suspect most of those are produced by User:Titanopedia, but didn't check it.

So, should we have a population density property? Should we keep those items around until the property is created and then migrate them? --Laboramus (talk) 08:16, 12 August 2016 (UTC)

A claim without unit and without source like that in Lisbon#P2054 is not worth migrating! -- Innocent bystander (talk) 14:06, 12 August 2016 (UTC)
Another thing: "population density" may be related to such a thing as "density" in English, but the two are not at all related in other languages. If somebody only familiar with Swedish reads "density":1234 for UK they will finally understand why they measure weight in "Stones". -- Innocent bystander (talk) 06:11, 13 August 2016 (UTC)
@Laboramus, Innocent bystander: I don't think we need a property for population density (Q22856). I mean, we already have population (P1082) and area (P2046); do we really need a third property that is just the division of the first two? The only case I can see is if the source only gives the density and not the population and the area, but that seems very unlikely (or only if the source is bad, and then it wouldn't be a good idea to use it).
Indeed, beware of languages: I was puzzled at first, as in French « densité » *can't* have a unit (in French, « densité » is a false friend for relative density (Q11027905)).
Cdlt, VIGNERON (talk) 16:49, 17 August 2016 (UTC)
@VIGNERON: I am not so sure that it is as simple as "is just the division of the first two". Population density can be calculated on the total area and sometimes on the land area. I am currently adding data about Swedish urban areas. The area of these was first reported in 1980, and both land and water area were reported then. Since 1990 only land area is measured, and the water area is no longer regarded as part of the entity at all. For Swedish municipalities there are four different areas reported: "land", "sea water", "water in the four great lakes" and "other lakes and watercourses waters". -- Innocent bystander (talk) 16:13, 19 August 2016 (UTC)
@Innocent bystander: sure, you need to do the right division with the right numbers (obviously, you don't divide the 2010 population by the current area if there was a different area in 2010), but still, it's just a division, isn't it? Cdlt, VIGNERON (talk) 18:19, 22 August 2016 (UTC)
Yes, in our templates at svwiki, we normally let the template do the division. The tricky part is when the latest population update isn't of the same date as the area. -- Innocent bystander (talk) 06:47, 23 August 2016 (UTC)
Wikidata:Property proposal/population density is currently open, you may wish to comment there. Robevans123 (talk) 10:33, 23 August 2016 (UTC)
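
To illustrate the "it's just a division, with the right numbers" point from the thread, here is a minimal Python sketch. The `Measurement` type, the function name, and all figures are hypothetical and only for illustration; the sketch simply refuses to divide a population by an area from a different year, and shows that total area and land area give different densities.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """A value together with the year it refers to (hypothetical helper type)."""
    value: float
    year: int


def population_density(population: Measurement, area_km2: Measurement) -> Optional[float]:
    """Return inhabitants per square kilometre, or None if the inputs do not match."""
    # Dividing a population by an area from another year may mix two
    # different delimitations of the same place.
    if population.year != area_km2.year:
        return None
    if area_km2.value <= 0:
        return None
    return population.value / area_km2.value


# Made-up figures, for illustration only.
people = Measurement(value=10000, year=2010)
total_area = Measurement(value=50.0, year=2010)
land_area = Measurement(value=40.0, year=2010)

print(population_density(people, total_area))               # 200.0
print(population_density(people, land_area))                # 250.0
print(population_density(people, Measurement(40.0, 1990)))  # None
```

Which area to divide by (total or land) still has to be decided per source; the code only makes that choice explicit.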

Contribute to set a data quality framework for Wikidata

Dear Wikidata members, We are working on setting a data quality framework for Wikidata, as part of a research project carried out by members of the Web and Internet Science group of the University of Southampton.

Determining the quality of Wikidata is crucial for its future development. We believe that its community should have a primary role in defining what data quality means in Wikidata. Therefore, we would like to ask community members to contribute to our data quality framework draft by adding comments, suggestions, and concrete examples of quality issues on Wikidata.

The draft has been published as a Request for Comment and can be found at this address: Data quality framework for Wikidata
Many thanks,
--Alessandro Piscopo (talk) 08:44, 12 August 2016 (UTC)

Hey :) Just for everyone's info: Alessandro has been working with us in the office for the past 2 weeks and it'd be great if you could support him in his work. I believe it will be valuable for Wikidata. --Lydia Pintscher (WMDE) (talk) 16:02, 12 August 2016 (UTC)
@Alessandro Piscopo: What about bringing quality statements (like the 1.0 classification) to Wikidata? Because they are language-specific they could be done as badges.--Kopiersperre (talk) 19:00, 12 August 2016 (UTC)
That would definitely be interesting. We should first agree on what we mean by "quality", though. --Alessandro Piscopo (talk) 08:12, 16 August 2016 (UTC)

quality is in linking wiki links with wikidata statements

Hoi, attention to quality is good, but I think the basis of perceived quality is the occurrence of statements that describe links to other articles in Wikipedia. This allows for article-level activity, and work done in any language maps to work in all other languages. When we focus on what Wikidata is supposed to do in this way, most other quality considerations have a framework: the use it brings as the data storage for Wikimedia projects. PS I blogged about this and welcome any arguments. Thanks, GerardM (talk) 12:21, 14 August 2016 (UTC)

@GerardM: interesting point of view. My doubt is: if Wikidata's quality should be considered in relation to what it can contribute to Wikipedia, don't you think that it may be limiting for the project? I think Wikidata might have much broader application than the mere support of Wikipedia. --Alessandro Piscopo (talk) 08:12, 16 August 2016 (UTC)
I see no reason why Wikidata should limit itself to being data storage for Wikipedia. If we have a recently discussed data set like >300.000 National Heritage buildings in the UK, most of that data isn't interesting for Wikipedia. On the other hand, Wikidata works on integrating itself with OpenStreetMap, and from that point it can be quite useful.
There are also instances when Wikidata can be directly valuable. I use it for example as a multilingual dictionary for anatomy when I create my Anki cards.
I don't think that it's useful to think in terms of articles when looking at Wikidata. If an item has three links it might not be complex enough to make an article but those three links can still be very valuable to understand the structure of the underlying subject. ChristianKl (talk) 12:00, 23 August 2016 (UTC)
You forget the point of Wikidata. Yes, it can be more, but the basic point is that it supports Wikipedia and other projects. When it can bring substantial improvement in quality, both Wikidata and all the Wikipedias will benefit. This brings a practical and easily implementable difference that can be measured. It will bring us more contributors, and this is imho more relevant than including external stuff. Thanks, GerardM (talk) 13:42, 23 August 2016 (UTC)
Wikipedia is certainly an important stakeholder, but it's not the only stakeholder that matters. Getting GLAMs (Galleries, Libraries, Archives, and Museums) to release their data is, for example, also a very important project.
There are also various consumers of structured data besides Wikipedia. ChristianKl (talk) 15:48, 28 August 2016 (UTC)

Stuff about railways

I've several questions regarding the relationships between railway stations. I'm specifically working with the Pearl River Delta at the moment, but I'm certain these can apply generally.

  1. What would be the appropriate connections to make between Shenzhen Railway Station (Q837327), Luohu Station (Q843947), Luohu Port (Q877115), Lo Wu Control Point (Q23498332), and Lo Wu Station (Q15169)?
  2. Is it necessary to have connecting line (P81) as a property of a railway station and as a qualifier to an adjacent station (P197)? (This question also applies to connecting service (P1192).)
  3. In a similar vein, does having one of the aforementioned connection properties require the inclusion of the other?
  4. Should two metro 'lines' be considered 'services' if they share the same trackage at any point? (Here I'm thinking of The Loop (Q2225459) and much of the Washington Metro (Q171221), but also of the concurrency of Line 3, Shanghai Metro (Q1326495) and Line 4, Shanghai Metro (Q1326504).)
  5. What's the hierarchy between metro/rail systems, their component lines, and their stations? Are stations part of (P361) lines part of (P361) systems? (Or are stations and lines both part of (P361) systems?)
  6. How would direction (P560) work for rail lines that are in loops? (Should we just pick two or three stations and use them for orientation?)

The property discussion pages are not terribly active, so I'm hoping there's some sort of existing consensus on these matters. Mahir256 (talk) 04:02, 15 August 2016 (UTC)

Try Wikidata:WikiProject Railways. Its mainpage is quite empty (feel free to fill it), but the talk is alive.--Jklamo (talk) 07:30, 15 August 2016 (UTC)
The simple questions to answer are 3 and 6. For 3, yes, they should be considered separate services if they are presented as separate services in reliable sources (e.g. system maps clearly treat the District line (Q211265) and Circle line (Q210321) as separate services at e.g. Cannon Street station (Q800615) even though they share the same tracks).
For 6, I'd use clockwise direction (Q16726164)/anticlockwise direction (Q6692036) or whichever cardinal direction is travelled in to reach the next station (e.g. the Circle line (Q210321) from Cannon Street station (Q800615) to Mansion House tube station (Q1477336) is west (Q679)), depending what reliable sources describe it as. Thryduulf (talk) 08:45, 15 August 2016 (UTC)
connecting line (P81) is defined as a qualifier, rather than a "main property". Danrok (talk) 03:09, 22 August 2016 (UTC)

New gadget to sort the statements on items

Hello everybody,

As the sorting of the statements on an item page has had issues for a while, I'm glad to announce that there's now a gadget for it! Gadget-statementSort.js sorts all the statements of an item, based on an ordered list of properties.

This gadget was created by Ladsgroup, using a previous script written by Soulkeeper. Thanks a lot for your work!

You can now enable this gadget in your preferences. If you have any question about the gadget or if you want to suggest some modifications on the properties list, don't hesitate to ask Ladsgroup or leave a message below.

Bests, Lea Lacroix (WMDE) (talk) 09:39, 16 August 2016 (UTC)

Great gadget! @Ladsgroup: It would be nice if I could completely overwrite the default property list by a self-maintained list in a custom .js page in my userspace. Could you implement something like that? Thanks and regards MisterSynergy (talk) 12:37, 16 August 2016 (UTC)
I made phab:T143383 to keep track of it :) Best Amir (talk) 04:53, 19 August 2016 (UTC)
@Lea Lacroix (WMDE): Is this gadget restricted to special browsers, or what am I looking for? -- Innocent bystander (talk) 09:19, 20 August 2016 (UTC)
@Innocent bystander: No it should work in any browser and statements should be ordered the same way across all items when you enable it. --Lydia Pintscher (WMDE) (talk) 17:55, 25 August 2016 (UTC)

Large upcoming data import

Just a heads-up: In 2014, I created ~40K items for Grade I and Grade II* listed buildings in the UK. This list has grown to ~44K buildings by now. As discussed here, we are preparing to import the remaining Grade II buildings, using a current list from National Heritage. That would be ~342K new items. You can see an example of what these items will look like at Morgan Hall, The Lawns (Q26263429). Unless there are serious objections, I will commence item creation this evening or tomorrow. The import will be single-thread, so as to not overload Wikidata, and will be bot-flagged (because RC). --Magnus Manske (talk) 14:33, 16 August 2016 (UTC)

No objections, thanks for these huge donations. Sjoerd de Bruin (talk) 15:07, 16 August 2016 (UTC)
Splendid news. I look forward to working on this important data. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:08, 16 August 2016 (UTC)

Update: Import has commenced after positive feedback :-) View progress here (may be mixed with other unrelated edits). --Magnus Manske (talk) 15:33, 16 August 2016 (UTC)

Thanks for updating these! Hopefully it will inspire some people to use the geograph image import on Commons to illustrate them. Jane023 (talk) 15:35, 16 August 2016 (UTC)
WD-FIST recently gained the ability to match image-less items and Commons images via coordinates (100m radius). Pure coincidence, surely. --Magnus Manske (talk) 15:46, 16 August 2016 (UTC)
@Magnus Manske: happy to see that other countries are working on historic buildings too. National Heritage List for England number (P1216) has some constraint violations on Wikidata:Database reports/Constraint violations/P1216. Do you plan to work on that too? Some are quite easy to fix, like Listed buildings in Christleton (Q15979145) where some bot added a bunch of identifiers. Multichill (talk) 20:27, 16 August 2016 (UTC)
I tried to fix some of these in 2014, but someone said since (in this example) I can't make a query of Listed buildings in Christleton (because it is quite hard to automatically find that level of location data), it should stay in there. I do disagree with that; maybe we should try to get village-level information through some combination of the location name in the raw data (which is ambiguous), the larger region (which is not), and the coordinates. --Magnus Manske (talk) 20:35, 16 August 2016 (UTC)
Wouldn't it be better to include the lowest level of admin territory (civil parish) rather than the higher ones of district or county? BTW Christleton is a civil parish... Robevans123 (talk) 07:02, 17 August 2016 (UTC)
Of course it would be better. But on the "lowest level", there are many places that share the same name. And even if a Wikidata search only turns up a single one, how do I know it's the only one, and not just missing items for the others? National Heritage data doesn't come annotated with Wikidata item numbers, you know... --Magnus Manske (talk) 07:56, 17 August 2016 (UTC)

Update: Now trying to use more fine-grained located in the administrative territorial entity (P131). I was careful to get it right, but with these numbers, there is always a chance of some of them being wrong. Nothing that can't be fixed, but be aware just in case. --Magnus Manske (talk) 19:14, 17 August 2016 (UTC)

Cool - great to have the extra detail. Sorry for the extra work -:) Robevans123 (talk) 12:31, 18 August 2016 (UTC)
  • The example item doesn't contain "instance of" building. I think it would be great if you assign an instance of to the items you created.
A heritage status can change. It would be good if you add a "retrieved" source qualifier that tells the reader when the statement got created. It would also be helpful if reference url is filled as "" for the example item.ChristianKl (talk) 15:12, 19 August 2016 (UTC)

UPDATE This has now been completed. --Magnus Manske (talk) 23:12, 23 August 2016 (UTC)

Policy on Interface Stability: final feedback wanted

Hello all,

repeated discussions about what constitutes a breaking change have prompted us, the Wikidata development team, to draft a policy on interface stability. The policy is intended to clearly define what kind of change will be announced when and where.

A draft of the policy can be found at Wikidata:Stable Interface Policy. Please comment on the talk page.

Note that this policy is not about the content of the Wikidata site; it's a commitment by the development team regarding the behavior of the software running on the site. It is intended as a reference for bot authors, data consumers, and other users of our APIs.

We plan to announce this as the development team's official policy on Monday, August 22.

-- Daniel Kinzler (WMDE) (talk) 14:50, 16 August 2016 (UTC)

It might make sense to announce breaking changes with [Breaking Change] or a similar tag, so that it's not required to read all mails that go through the mailing list. ChristianKl (talk) 09:59, 23 August 2016 (UTC)
Yes, that's sensible. Perhaps we'll add it to the policy. -- Daniel Kinzler (WMDE) (talk) 14:39, 23 August 2016 (UTC)

The policy is now official, see #Announcing the Wikidata Stable Interface Policy. -- Daniel Kinzler (WMDE) (talk) 14:39, 23 August 2016 (UTC)

Multiple items with identical sitelinks

Examples: Tasley (Q24668011) and Tasley (Q24668012). Not a good sign. --Magnus Manske (talk) 15:53, 17 August 2016 (UTC)

see Wikidata:True duplicates --Pasleim (talk) 15:57, 17 August 2016 (UTC)
Yes, that's a known problem, sadly it's not trivial to fix. The cause is that bots often double post entity creation requests to the API, so that both entities are created at nearly the same time, leading to our uniqueness constraints not working. Cheers, Hoo man (talk) 16:06, 17 August 2016 (UTC)
Why can't a bot merge items like that automatically? ChristianKl (talk) 15:28, 23 August 2016 (UTC)
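
As a rough sketch of the first step such a bot would need, the Python snippet below groups items by sitelink and reports groups with more than one item. It does not merge anything and does not settle whether automatic merging is safe. The input format, the function name, and the site id "examplewiki" are assumptions made for illustration; the thread does not say which wiki the Tasley sitelinks point to.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Sitelink = Tuple[str, str]  # (site id, page title)


def find_sitelink_duplicates(
    items: Iterable[Tuple[str, Sitelink]]
) -> Dict[Sitelink, List[str]]:
    """Group item ids by sitelink and keep only sitelinks used by several items."""
    groups: Dict[Sitelink, List[str]] = defaultdict(list)
    for item_id, sitelink in items:
        groups[sitelink].append(item_id)
    return {link: sorted(ids) for link, ids in groups.items() if len(ids) > 1}


# The two item ids come from the thread; the site id is a placeholder.
sample = [
    ("Q24668011", ("examplewiki", "Tasley")),
    ("Q24668012", ("examplewiki", "Tasley")),
    ("Q26263429", ("examplewiki", "Morgan Hall, The Lawns")),
]

for link, ids in find_sitelink_duplicates(sample).items():
    print(link, ids)
```

A report like this could feed a human-reviewed merge queue rather than an unattended merge.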

ItemDisambiguation limit at 100

The display limit for Special:ItemDisambiguation is set at 100. In cases of some ordinary names, e.g. John Campbell, this limit may be attained, or nearly so, for what is a reasonable request. In other words using such a page for normal disambiguation may be close to failing, and will fail as more items and aliases are added.

Could the number of hits be displayed? Could there be some fallback to a second page? It is highly desirable that this Special page should function as the global disambiguation equivalent for en:w:John Campbell, for example. Charles Matthews (talk) 06:12, 18 August 2016 (UTC)

As a work-around you can use this SPARQL query. --Edgars2007 (talk) 10:06, 18 August 2016 (UTC)

Thanks. Charles Matthews (talk) 07:06, 19 August 2016 (UTC)

Using Reasonator for disambiguation is much easier and more informative. Try John Campbell. Thanks, GerardM (talk) 05:56, 26 August 2016 (UTC)
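
The SPARQL query linked above is not reproduced on this page. As a hedged sketch of what such a work-around might look like, the Python helper below builds a query that finds every item whose label or alias equals a given name, without the 100-result display limit. The helper name is hypothetical, and the query is an assumption about the approach, not a copy of the linked query.

```python
def label_lookup_query(name: str, language: str = "en") -> str:
    """Build a SPARQL query matching items by exact label or alias."""
    # Escape backslashes and double quotes so the name stays a valid
    # SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?item ?itemDescription WHERE {\n"
        f'  ?item rdfs:label|skos:altLabel "{escaped}"@{language} .\n'
        "  SERVICE wikibase:label {\n"
        f'    bd:serviceParam wikibase:language "{language}" .\n'
        "  }\n"
        "}"
    )


print(label_lookup_query("John Campbell"))
```

The printed text can be pasted into the Wikidata Query Service; matching is exact, so "John Campbell" will not find "John A. Campbell" unless that item lists the shorter form as an alias.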

Unsourced and Wikipedia sourced P91 statements

What to do with unsourced and Wikipedia sourced sexual orientation (P91) statements? This really troubles me. The property has been used 3613 times; 2824 uses don't contain sources. I don't know how many have Wikipedia as source. Sjoerd de Bruin (talk) 21:28, 18 August 2016 (UTC)

613 have imported from (P143). Count. --Edgars2007 (talk) 21:39, 18 August 2016 (UTC)
Maybe there is the same problem for religion (P140) --ValterVB (talk) 06:44, 19 August 2016 (UTC)
If no source, delete. Snipre (talk) 09:39, 19 August 2016 (UTC)
No: If no source, find one and add it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 19 August 2016 (UTC)
We can't add sensitive data on a person without a source. Users who add this data without a source are making an error. I think the data should be deleted. --ValterVB (talk) 11:08, 19 August 2016 (UTC)
If there is no source and the claim is plausible then look for a source. If you find a source, add it. If you don't find a source after looking and the claim is contentious or potentially so and/or your search was extensive and thorough then remove it. If the claim is not plausible, remove it. We really need a way of flagging the remaining cases where someone has done no or only a cursory search. Thryduulf (talk) 11:50, 19 August 2016 (UTC)
Nobody can force someone to look for sources, but the Wikimedia Foundation asks you to add sources on biographies of living people, so if there are no sources for these data, they should be deleted. We can't keep sensitive data indefinitely without a source. --ValterVB (talk) 12:14, 19 August 2016 (UTC)
+1. WP requires sources and WD aims to provide sourced data, so if people don't want to play the game, their contributions have to be deleted as useless and potentially subject to conflict. Snipre (talk) 16:15, 19 August 2016 (UTC)
So by this logic all statements on items about living people without a reference should be deleted (maybe except external IDs). I see no point in singling out this specific property. If it has to go, all does, or we accept that statements need to be decided about one by one (like Thryduulf said). And—still by this logic—the possible automatic statement deletion would not concern items about people who have passed away (still, talking about all properties, not only P91). – Máté (talk) 16:31, 19 August 2016 (UTC)
Not all the data are problematic, but certainly the data regarding religion or sexuality are more delicate and must be sourced. --ValterVB (talk) 16:46, 19 August 2016 (UTC)
Well, I'd add at least age, gender, residence, ethnic group, birth name etc. to the list of potentially just as sensitive data as sexual orientation and religion are. – Máté (talk) 16:51, 19 August 2016 (UTC)
(edit conflict) I think it is possible to define different priorities for properties in terms of how required sources generally are. Something like religion is pretty high up in most cases (I don't think it vital that a Catholic cardinal has a source for religion; for a US presidential candidate, on the other hand, it probably is), whereas a handedness (P552) statement is almost never going to be controversial, so I don't think we should remove those without having looked for sources. External identifiers are almost always going to be self-sourcing, so we can treat them as completely uncontroversial. I'd suggest levels:
  1. always required, will be deleted if a source is not provided within a short time of the statement being added (should only be used for a very few properties and almost never when used for deceased people);
  2. almost always required, will normally be deleted when applied to living people or recently deceased people if a source is not provided but exceptions are possible based on common sense, especially for deceased people. (more than level 1, but not too many)
  3. Should be provided, statements should be accompanied by a source but they will not be routinely deleted without consideration of the circumstances (this should be default for non-external id properties)
  4. low priority, statements should be accompanied by a source but they will not normally be deleted unless verification has failed or the statement is both implausible and applied to a living person (only things that will rarely be controversial should be at this level)
  5. self-sourcing, no independent source is required (probably only applies to external identifiers). Thryduulf (talk) 16:58, 19 August 2016 (UTC)
we need a team to source statements. the people who periodically drive by and suggest statement deletion, are not collaborating and improving the data. "so if people don't want to play the game", they can take their deletion game elsewhere. there is no consensus for required references. Slowking4 (talk) 12:17, 21 August 2016 (UTC)
other self-sourcing statements are bibliographical data of Books or Texts, when the item is linked to the source (on wikisource for ex.). It seems very irrelevant to say that "Title" is so and so, and source on... the Book itself, which would be linked in the wikisource link... :) --Hsarrazin (talk) 18:45, 23 August 2016 (UTC)
From the last dump, claim without source:

I think it is necessary to delete them --ValterVB (talk) 18:30, 26 August 2016 (UTC)

Iff they cannot be sourced then they should be deleted. But how about creating a list of items that need sources finding - in many cases I suspect that there will be a source in one or more of the attached Wikipedia articles that just hasn't been copied to here. Thryduulf (talk) 20:17, 26 August 2016 (UTC)
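
A list like the one suggested above could be produced from a dump. The Python sketch below assumes entities in the Wikibase JSON layout, where each statement may carry a "references" list whose entries hold "snaks" keyed by property id. It labels each P91 statement as unsourced, imported only (every reference contains imported from (P143)), or sourced, and returns the items that still need a source. The function names and the item ids in the sample are made up for illustration.

```python
from typing import Any, Dict, Iterable, List

IMPORTED_FROM = "P143"  # imported from (P143), as discussed in the thread


def classify_statement(statement: Dict[str, Any]) -> str:
    """Return 'unsourced', 'imported only' or 'sourced' for one statement."""
    references = statement.get("references", [])
    if not references:
        return "unsourced"
    # A statement counts as 'imported only' when every reference it has
    # includes an "imported from" snak.
    if all(IMPORTED_FROM in ref.get("snaks", {}) for ref in references):
        return "imported only"
    return "sourced"


def items_needing_sources(
    entities: Iterable[Dict[str, Any]], property_id: str = "P91"
) -> List[str]:
    """List ids of entities with at least one not properly sourced statement."""
    needing = []
    for entity in entities:
        statements = entity.get("claims", {}).get(property_id, [])
        if any(classify_statement(s) != "sourced" for s in statements):
            needing.append(entity["id"])
    return needing


# Made-up entities; "QX1" etc. are placeholders, not real item ids.
sample_entities = [
    {"id": "QX1", "claims": {"P91": [{"references": [{"snaks": {"P248": []}}]}]}},
    {"id": "QX2", "claims": {"P91": [{}]}},
    {"id": "QX3", "claims": {"P91": [{"references": [{"snaks": {"P143": []}}]}]}},
    {"id": "QX4", "claims": {}},
]

print(items_needing_sources(sample_entities))  # ['QX2', 'QX3']
```

The same function works for religion (P140) by passing a different `property_id`.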

local Wikibase

I'm trying to use Wikibase in my local wiki and I am half way done. But I have problems still (I don't know if it's better to ask at developers forum?):

  • After some change I cannot edit or create claims. Specifically: for creation I click "edit", then I choose right property and get eternal "loading sign" instead of input field; for editing - just nothing happened after clicking "edit". Editing through API with my bot is possible though.
  • How to edit label and description in other languages? I've installed Babel extension, imported LabelLister gadget but nevertheless I see only 1 language.
  • Even in that language I cannot edit label+desc+aliases simultaneously: only one of them is saved at a time, then a reload of the page is needed to save another.

Can anyone help? --Infovarius (talk) 09:54, 19 August 2016 (UTC)

@Hoo man: probably better for a dev to look at this one. --Izno (talk) 11:28, 19 August 2016 (UTC)
@Infovarius: I'm quickly replying point by point:
  1. Make sure you are running Wikibase master and MediaWiki master. Do you get any JS errors in this case? If so, please report them.
  2. Installing Babel and editing your user page should be enough. Please note that you will need to run jobs (run maintenance/runJobs.php in MediaWiki's root directory) in order for your user page's categories to be written to the database, so that Babel can pick them up.
  3. See #1
Cheers, Hoo man (talk) 13:03, 19 August 2016 (UTC)
Thanks for answer. I've installed master version of Wikibase and get "Fatal error: Class 'Wikibase\DataModel\Entity\ItemId' not found in C:\xampp\apps\mediawiki\htdocs\extensions\Wikibase\lib\WikibaseLib.entitytypes.php on line 37". Then I revert Wikibase upgrade.
I've copied MediaWiki 1.27 over my 1.26. Now I see all required labels/desc/aliases(!), but editing leads to error "SyntaxError: Unexpected token < in JSON at position 0".
After rolling master version of Wikibase again I get "Fatal error: Class 'Wikibase\Lib\DataTypeDefinitions' not found in C:\xampp\apps\mediawiki\htdocs\extensions\Wikibase\repo\includes\WikibaseRepo.php on line 300". I revert Wikibase again.
In console I have: 1) Unknown dependency:; 2) ReferenceError: $ is not defined(anonymous function) @ Item:Q2:935 --Infovarius (talk) 15:42, 22 August 2016 (UTC)
Did you run composer update for both MediaWiki and Wikibase after updating them? For MediaWiki you will also need to run maintenance/update.php after applying the update. Cheers, Hoo man (talk) 15:50, 22 August 2016 (UTC)
Oh, that helps! All described problems are gone, thank you! During maintenance/update.php there was an error "Error: 1071 Specified key was too long; max key length is 767 bytes", but I see no problems in functioning yet. --Infovarius (talk) 15:46, 23 August 2016 (UTC)
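
For reference, the steps that resolved this thread can be collected in one place. The Python sketch below only prints the commands (a dry run) and executes nothing. The ordering, the function name, and the root path are assumptions; the commands themselves (composer update for MediaWiki and Wikibase, maintenance/update.php, maintenance/runJobs.php) and the extensions/Wikibase location are taken from the messages above.

```python
from pathlib import PurePosixPath
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (working directory, command)


def upgrade_steps(mediawiki_root: str) -> List[Step]:
    """Return the post-upgrade commands mentioned in the thread, in a plausible order."""
    root = PurePosixPath(mediawiki_root)
    wikibase = root / "extensions" / "Wikibase"
    return [
        (str(root), ["composer", "update"]),            # MediaWiki dependencies
        (str(wikibase), ["composer", "update"]),        # Wikibase dependencies
        (str(root), ["php", "maintenance/update.php"]),   # database schema update
        (str(root), ["php", "maintenance/runJobs.php"]),  # pending jobs (needed for Babel)
    ]


# "/srv/mediawiki" is a placeholder path.
for directory, command in upgrade_steps("/srv/mediawiki"):
    print(f"cd {directory} && {' '.join(command)}")
```

Skipping the composer step appears to be what produced the "Class ... not found" fatal errors reported above.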

Hierarchical data examples

I am trying to find out how hierarchical information is stored in Wikidata. Most of the Olympics Sports pages have a Tournament Draw or Bracket like this or this. I would like to add this "who played who, at which stage" info, but I am a bit confused whether such info gets stored. The docs say lists and infoboxes are the main focus, so is this something that will be handled later? If it's already being done, could someone please point me to a tutorial or examples for a beginner? Thanks! Quil1 (talk) 03:24, 20 August 2016 (UTC)

You could create an item:
Label: "W-j Kim vs. R Ega Agatha in round of 32 of the Archer at the 2016 Summer Olympics"
instance_of: "archery contest round at the Olympics"
participant: "W-j Kim"
participant: "R Ega Agatha"
is_part_of : "round 32 of the Archer at the 2016 Summer Olympics"
succeeds : "R Ega Agatha vs M Nespoli in round of 16 of the Archer at the 2016 Summer Olympics"
follows : "W-j Kim vs. G Sutherland in round of 64 of the Archer at the 2016 Summer Olympics"
follows : "R Ega Agatha vs. Y Xing in round of 64 of the Archer at the 2016 Summer Olympics"
This follows/succeeds? pattern stores the data in a Wikidata friendly way. However at the moment there's no easy way to integrate such data into Wikipedia. It's also possible that there are more specific properties than the one I listed here. ChristianKl (talk) 09:48, 26 August 2016 (UTC)
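
To show how the linked-match pattern above could answer a question like "what was a player's path through the draw", here is a small Python sketch. It is a local model only, not Wikidata's data model or API: the `Match` class, the dictionary keys, and `path_of` are hypothetical, and only the "follows" direction (earlier matches feeding into a later one) is stored. Player and round names are the ones from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Match:
    label: str
    round_name: str
    participants: List[str]
    # Keys of the earlier matches whose winners meet in this match.
    follows: List[str] = field(default_factory=list)


matches: Dict[str, Match] = {
    "r64-kim": Match("W-j Kim vs. G Sutherland", "round of 64",
                     ["W-j Kim", "G Sutherland"]),
    "r64-ega": Match("R Ega Agatha vs. Y Xing", "round of 64",
                     ["R Ega Agatha", "Y Xing"]),
    "r32": Match("W-j Kim vs. R Ega Agatha", "round of 32",
                 ["W-j Kim", "R Ega Agatha"], follows=["r64-kim", "r64-ega"]),
    "r16": Match("R Ega Agatha vs. M Nespoli", "round of 16",
                 ["R Ega Agatha", "M Nespoli"], follows=["r32"]),
}


def path_of(player: str, last_match: str, all_matches: Dict[str, Match]) -> List[str]:
    """Walk backwards along 'follows' links and return the player's matches in order."""
    path: List[str] = []
    current: Optional[str] = last_match
    while current is not None:
        match = all_matches[current]
        if player not in match.participants:
            break
        path.append(f"{match.round_name}: {match.label}")
        # Continue with the earlier match in which this player took part, if any.
        current = next(
            (key for key in match.follows if player in all_matches[key].participants),
            None,
        )
    return list(reversed(path))


for line in path_of("R Ega Agatha", "r16", matches):
    print(line)
```

Storing only one direction of the link keeps the data smaller; the reverse direction can be derived by scanning the `follows` lists.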
Hey, thanks for the reply. I have been wondering how to do this and was thinking along the lines of your suggestion. The whole thing does seem hard, and being new here, maybe I should just pick something simpler to start with. Ideally there needs to be an item (or a Wikipedia page) for each match containing round/tournament/score/winner etc., with the round being a substitute for your follows/succeeds model. I am not sure, but I think it would automatically create the linkages. So I was looking around and found Match Box Templates for many sports, e.g. [1] [2]. But not many sports have individual match pages (probably just too much data). That said, a lot of the data is found in these brackets and draws. So I am thinking I will pick a small tournament and start creating the items, slowly going through all the docs to learn how to do that. I think it would be great if through SPARQL one could run queries like "who played the semis of the 2015 Wimbledon" or "what was Federer's path through the US Open", as a lot of the info already exists in these pages. Quil1 (talk) 07:23, 27 August 2016 (UTC)
More a general comment: There is a Wikidata:WikiProject Sport results, but it is not very well developed. At this point we do not have an established concept of how to add sports results to Wikidata, so you either need to develop something yourself, which might be superseded at some point in the future by a different approach, or you have to wait until things are further developed by others. In my own experience there is still enough work to do on a much more basic level: make sure that sports persons, tournaments, organisations, venues, equipment, general items, etc. are properly defined, labelled, and linked to each other. Without such a robust data base it'd probably be difficult to model sports results anyway. Regards, MisterSynergy (talk) 07:33, 27 August 2016 (UTC)
I think that's what I was looking for, and it looks like someone has proposed matches. Will follow up with them to see what's going on. Thanks! Quil1 (talk) 08:41, 27 August 2016 (UTC)

Aquarium life

aquarium fish (Q1448518) = "aquarium life" aka "aquarium animals" = "ornamental fish": is it a Wikimedia category, aka Wikimedia list article? -- 00:36, 21 August 2016 (UTC)

No, it's not. It's a subclass of (P279) of fish (Q152), whose topic's main category (P910) is Category:Aquarium fish (Q8084627) and its Commons category (P373) -> Category:Aquarium fish. Strakhov (talk) 01:00, 22 August 2016 (UTC)

Property for listing teams that an individual has coached/managed

I was unable to find a property that could be used to specify the sports teams that an individual has coached/managed. I do not consider it ideal to use P:54 in such a context, as it relates more to the teams an individual is associated with as a player, not as a coach. Does the property I am looking for need to be created, or is there one that I simply have not found yet? Thanks, Lepricavark (talk) 00:12, 23 August 2016 (UTC)

Wikidata:Property proposal/head coach of is currently open, you may wish to comment there. Thryduulf (talk) 00:29, 23 August 2016 (UTC)
Thank you. I will do so. Lepricavark (talk) 11:07, 23 August 2016 (UTC)

How do I properly use Population property?

I try to use it, but it always shows a plus-minus sign with "1". How do I fix this? MechQuester (talk) 06:07, 23 August 2016 (UTC)

If the number is exact, and there is not a range of valid values, write "+-0" after the digit entered. --β16 - (talk) 07:26, 23 August 2016 (UTC)
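A minimal sketch of what the "+-0" advice amounts to, assuming the usual Wikibase JSON shape for quantity values (an `amount` plus `upperBound` and `lowerBound`, with `"1"` standing for "no unit"). The helper names are illustrative only and are not part of any Wikidata tool.

```python
def exact_quantity(amount, unit="1"):
    """Quantity with zero uncertainty: both bounds equal the amount ("+-0")."""
    signed = f"{amount:+d}"
    return {"amount": signed, "unit": unit,
            "upperBound": signed, "lowerBound": signed}


def quantity_with_uncertainty(amount, plus_minus, unit="1"):
    """Quantity shown as amount ± plus_minus, e.g. the default "±1"."""
    return {"amount": f"{amount:+d}", "unit": unit,
            "upperBound": f"{amount + plus_minus:+d}",
            "lowerBound": f"{amount - plus_minus:+d}"}


# A population figure entered as "1234+-0" versus the default "1234±1".
exact = exact_quantity(1234)
fuzzy = quantity_with_uncertainty(1234, 1)
```

With "+-0" the two bounds collapse onto the amount, which is why the interface no longer shows a "±1".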

New gadget: currentDate

Hello everyone,

A new gadget has been added to our collection: currentDate automatically fills in today's date when you use retrieved (P813). It seems a perfect candidate to be enabled by default; we only need consensus for it. Thanks to TMg for creating this gadget!

Greetings, Sjoerd de Bruin (talk) 08:18, 23 August 2016 (UTC)

Seems a very useful gadget. Support making it default. Thryduulf (talk) 09:31, 23 August 2016 (UTC)
Agree. Robevans123 (talk) 10:35, 23 August 2016 (UTC)
I often add a "past date" to this property, but I still approve this! -- Innocent bystander (talk) 11:56, 23 August 2016 (UTC)
JFDI applies. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:17, 23 August 2016 (UTC)
At last! I've been waiting for this soooo long!
Support making it default too! --Hsarrazin (talk) 19:08, 23 August 2016 (UTC)
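For script authors who want the same behaviour outside the web interface, here is a rough sketch of a day-precision value for retrieved (P813). It assumes the usual Wikibase JSON layout for time values (precision 11 for "day", Q1985727 for the proleptic Gregorian calendar). It is not the gadget's own code, and `retrieved_value` is a made-up name.

```python
from datetime import date

GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def retrieved_value(day=None):
    """Build a day-precision time value; defaults to today's date."""
    day = day or date.today()
    return {
        "time": f"+{day.isoformat()}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": 11,  # 11 = day
        "calendarmodel": GREGORIAN,
    }


# A fixed date is used here so the result is reproducible.
example = retrieved_value(date(2016, 8, 23))
```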

Announcing the Wikidata Stable Interface Policy

After a brief period for final comments (thanks everyone for your input!), the Wikidata:Stable Interface Policy is now official.

This policy is intended to give authors of software that accesses Wikidata a guide to what interfaces and formats they can rely on, and which things can change without warning.

The policy is a statement of intent given by us, the Wikidata development team, regarding the software running on the site. It does not apply to any content maintained by the Wikidata community. -- Daniel Kinzler (WMDE) (talk) 14:37, 23 August 2016 (UTC)

Wikidata weekly summary #223

UBERON ID (P1554) should be datatype external id

There's a formatter URL for UBERON ID (P1554), but currently the datatype is string, so it doesn't get used automatically. I think it would be beneficial to change the datatype to external id. ChristianKl (talk) 18:54, 23 August 2016 (UTC)

I don't think it will work with multiple formatter IDs. --Izno (talk) 19:07, 23 August 2016 (UTC)
@Izno: That hasn't stopped properties like P:P2182 from being converted. -- Innocent bystander (talk) 05:23, 24 August 2016 (UTC)
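To illustrate what a formatter URL does for an external-id property, here is a small sketch that substitutes an identifier for the `$1` placeholder. The formatter URLs and the identifier below are placeholders on example.org, not the real values stored on UBERON ID (P1554), and the function name is hypothetical. The list form shows one way a consumer could cope with a property that has several formatter URLs.

```python
def build_links(formatter_urls, identifier):
    """Expand every formatter URL by replacing the $1 placeholder."""
    return [url.replace("$1", identifier) for url in formatter_urls]


# Placeholder formatter URLs; a property may carry more than one.
formatters = [
    "https://example.org/uberon/$1",
    "https://mirror.example.org/browse?id=$1",
]
links = build_links(formatters, "0001507")
```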

Property to model that the biceps brachii "flexes elbow"

Currently enWiki displays in its infobox that the biceps brachii has the "action" "flex elbow". Do we have an existing property to model this relationship? ChristianKl (talk) 19:11, 23 August 2016 (UTC)

I suppose you could say
< biceps brachii > use (P366) < flexing >
of (P642) < elbow >
but that seems very cumbersome and may not easily generalise. Thryduulf (talk) 21:51, 23 August 2016 (UTC)
That seems cumbersome so I produced a new property: ChristianKl (talk) 08:32, 24 August 2016 (UTC)
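For readers less familiar with qualifiers, the statement Thryduulf describes can be pictured as a main value plus a qualifier. This is only a sketch with plain labels instead of item IDs (the thread gives none), and the `Statement` class is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    subject: str
    prop: str
    value: str
    qualifiers: list = field(default_factory=list)  # (property, value) pairs


def render(statement):
    """Flatten a statement and its qualifiers into one readable line."""
    text = f"{statement.subject} | {statement.prop} | {statement.value}"
    for prop, value in statement.qualifiers:
        text += f" [{prop}: {value}]"
    return text


# use (P366) "flexing", qualified with of (P642) "elbow".
flexes_elbow = Statement("biceps brachii", "P366", "flexing",
                         qualifiers=[("P642", "elbow")])
line = render(flexes_elbow)
```

The meaning is split across two values, which is the cumbersome part: a consumer has to read both the main value and the qualifier to recover "flexes elbow".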

How to create a GUIDELINE for specific task or project here?

I was drafting here a Guidelines for external relationships page, to consolidate discussions and create a help page for new task-force people...

Are there any examples of that kind of guideline (for an external task force linking ontologies)? Where is the best place for it: here? Wikiversity?

PS: we are linking SchemaOrg with Wikidata... It is a startup project and needs some consolidation and collective consensus to achieve the quality levels that we need.

Start a WikiProject. ChristianKl (talk) 08:15, 24 August 2016 (UTC)

"missing" items for series of ...

Within the realm of serial items, some items cannot currently be given correct claims for instance of (P31), simply because no class exists for them.

For example, we do have book series (Q277759) which is fine for books, but we do not have an item for say, a "series of events" which could have sub-classes such as "series of military campaigns".

Currently, we have a lot of incorrect claims along the lines of Battles of Khalkhin Gol (Q188925), which is incorrectly claimed as an instance of a battle, despite being a series of battles.

So, I plan to create some suitable items for this. If anyone has any input or suggestions then please do comment. Danrok (talk) 00:32, 24 August 2016 (UTC)

For Battles of Khalkhin Gol (Q188925) in is a series, but in other wikis it is a single battle. Before changing P31 it is necessary to check and split the item. --ValterVB (talk) 06:49, 24 August 2016 (UTC)
Mmm, tough problem at first sight. My first thought is that "battles" and "series" do not always mix very well. Composition seems to be a better fit, if only because battles of some war can overlap and sometimes have no obvious order of precedence. The order might in some way be related to a causal sequence: a battle may have been fought by one army because another one was lost by the same army. A war is composed of several battles, and maybe several "sub-wars"? A case study could be Hundred Years' War (Q12551), I guess. Is a war a case of conflict that begins with a declaration of war (Q334516) and ends with a peace treaty or in another way?
Maybe, as a consequence, a war cannot be composed of smaller wars, and the composition holds at some higher level, like an armed conflict that can be composed of other armed conflicts? Can a battle be composed of smaller battles?
A lot of questions and no answer, sorry /o\ author  TomT0m / talk page 07:35, 24 August 2016 (UTC)
The definition of battle is "part of a war which is well defined in duration, area and force commitment". I don't see why a series of battles can't be a battle according to that definition. It's also worth noting that different cultures speak differently about the same event and there's no reason why the English version should have preference. ChristianKl (talk) 10:23, 24 August 2016 (UTC)
Try significant event (P793) --Succu (talk) 21:18, 24 August 2016 (UTC)
part of (P361) is your best tool here, I think.
"Battle" is a pretty broad term, and a "battle" can include other "battles" - consider Battle of the Somme (Q132568) (1 July - 17 November 1916), which began with Battle of Albert (Q1992231) (1 July - 13 July), then Battle of Bazentin Ridge (Q2634717) (14 July - 17 July), and so on. These can all be "instance of: battle" and part of (P361) of the larger battle. They may also contain smaller events, which might also be labelled battles, or P31:engagement (Q6680005) (though I don't think many pages use these yet).
In some cases, battles might also be P361 of a military campaign (Q831663), a connected series of battles - so Battle of Kvam (Q20112888) is P361:Operation Weserübung (Q150939) which is P361:Norwegian Campaign (Q5084679)... and in the grand scheme of things, P361:World War II (Q362). So you can use P361 to go all the way up and down the chain.
In the specific case of Battles of Khalkhin Gol (Q188925), I'd consider whether P31:military campaign (Q831663) is the best way to go - they all fitted together as part of an overall series of battles. Andrew Gray (talk) 21:35, 24 August 2016 (UTC)
@Andrew Gray: military campaign (Q831663) Mmm, is this not related to only one opponent's actions? A campaign is how one side organized its actions, imho. So in most cases it's not appropriate, as a battle is the sum of the actions of both sides. It can even be an NPOV problem if you mix the two concepts inappropriately. author  TomT0m / talk page 06:18, 25 August 2016 (UTC)

I think I picked a bad example, battles and wars are complex to define. Plus, there's the language problem, and the different way things are defined in different languages. Danrok (talk) 21:40, 24 August 2016 (UTC)
Just make explicit the definitions you use. It's not exactly a problem specific to wars :) Wikidata should be definition based, not term based. author  TomT0m / talk page 06:18, 25 August 2016 (UTC)
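Andrew Gray's point about using part of (P361) "to go all the way up and down the chain" can be sketched with a small lookup table. The item IDs are the ones quoted in the thread. The table itself is a hand-written stand-in for data that would normally come from Wikidata, and `chain_up` is a hypothetical helper.

```python
# child item -> the item it is "part of" (P361), as listed in the thread
PART_OF = {
    "Q20112888": "Q150939",   # Battle of Kvam -> Operation Weserübung
    "Q150939": "Q5084679",    # Operation Weserübung -> Norwegian Campaign
    "Q5084679": "Q362",       # Norwegian Campaign -> World War II
}


def chain_up(qid, part_of):
    """Follow P361 links upwards, stopping at the top or on a cycle."""
    chain, seen = [], {qid}
    while qid in part_of:
        qid = part_of[qid]
        if qid in seen:  # guard against circular part-of claims
            break
        seen.add(qid)
        chain.append(qid)
    return chain


ancestors = chain_up("Q20112888", PART_OF)
```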

Q18396215 to be merged with Q16983762

per similarity of the subjects covered. -- 10:48, 24 August 2016 (UTC)

Thanks, but you could have merged them yourself. Jared Preston (talk) 11:19, 24 August 2016 (UTC)
How could I do that? -- 12:11, 24 August 2016 (UTC)
Help:Merge. —MisterSynergy (talk) 12:18, 24 August 2016 (UTC)
It works! Thanks. -- 12:24, 24 August 2016 (UTC)

creating guidance on the process of importing data into Wikidata

Hi all

I've started a draft of guidance on importing data from external datasets into Wikidata. It's very rough at the moment, and I would very much appreciate some help.


--John Cummings (talk) 12:27, 24 August 2016 (UTC)

Perhaps something regarding sourcing and references? Danrok (talk) 12:41, 25 August 2016 (UTC)

Wikipedia corpus to Wikidata

Extracting data from a corpus is a very promising and challenging problem, and many research teams around the world are working on it. Some examples: creating a module that diagnoses a disease by scanning a medical corpus, creating a voice assistant, and much more. With Wikipedia as the source corpus and Wikidata as the "structured data" of that corpus, the two offer a great environment in which to approach this problem.

Before getting into the details of this problem, I am interested to know how open is the Wikidata community to this problem? --GhassanMas (talk) 19:11, 24 August 2016 (UTC)

Wikidata imports a lot of information from Wikipedia already. In general Wikidata prefers to have data that cites sources from outside of Wikipedia.
Additionally there are the projects and that try to extract facts via machine learning and provide it to the PrimarySource tool where WikiData editors can approve or disapprove suggestions from the machine learning algorithms.
If you are an academic working on extracting facts from corpus data there's a good chance that you can work well with Wikidata, but it's worth first understanding the structure of Wikidata and how it plays with the Primary Source tool. ChristianKl (talk) 15:10, 25 August 2016 (UTC)
It is important to understand that the Primary Source tool is something that is not used for importing Wikidata and many other sources by the Wikidata community. Understanding this difference is vital. The biggest problem is not getting data into the PMS but finding people to consider the data in there. The statistics prove that the PMS is dysfunctional. Thanks, GerardM (talk) 05:54, 26 August 2016 (UTC)
"Not used" doesn't seem to be what the statistics say. It doesn't get used as much as desired, but it still gets used. Even if only a subset of the data is evaluated by humans, an academic group that provides data via the primary sources tool can expect to get some feedback on which claims get approved and rejected.
I consider the PMS a work in progress. With increased data there's a higher probability that when I browse an item the PMS will suggest a statement or reference that I find valuable to add.
More data also means that there are higher returns to improving the UI of the PMS and thus improving usage. ChristianKl (talk) 09:44, 26 August 2016 (UTC)
Seriously, when you consider the number of statements approved and the process whereby this happened, you would not say this. I regularly "approve" info from Freebase on the basis that it is likely correct; no verification happens, and consequently the statistics not only show little traffic, they also do not show that there is a meaningful process going on. Thanks GerardM (talk) 10:12, 27 August 2016 (UTC)

Preferred sourcing for imported data already linked in identifiers?

I'm rewriting the script I use to generate items such as Thomas Thompson (Q26689403) to include sourcing. At the moment, every item has a History of Parliament ID (P1614) property, giving a clickable link to the main source. Given this, should I source the individual statements as stated in (P248):The History of Parliament (Q7739799), or as reference URL (P854):(the URL from P1614)? I'm leaning towards the reference URL (P854) approach but thought I'd better check which is preferred. Andrew Gray (talk) 19:50, 24 August 2016 (UTC)

ValterVB advised me to use reference URL (P854) in similar cases, but we'd like to hear other opinions. --Epìdosis 20:04, 24 August 2016 (UTC) I've read your message again: in these cases I've always used stated in (P248) + reference URL (P854); the properties ValterVB advised me never to use in references are identifiers (in this case History of Parliament ID (P1614)). --Epìdosis 20:08, 24 August 2016 (UTC)
per Help:Sources#Databases you would need to use stated in (P248):The History of Parliament (Q7739799) and History of Parliament ID (P1614) as source. --Pasleim (talk) 20:09, 24 August 2016 (UTC)
I didn't know it. Thank you, Pasleim. --Epìdosis 20:36, 24 August 2016 (UTC)
Interesting. Epìdosis's recommendation feels much more natural to me. The database approach seems to require that we tag The History of Parliament (Q7739799) as P31:database, which would be wrong - it's a reference work that we record identifiers for, not a "database" like PubChem. It fits a lot better with the website approach in the section above (P248/P854). Andrew Gray (talk) 20:49, 24 August 2016 (UTC)
@Andrew Gray: So what is The History of Parliament (Q7739799) ? Because when I read the label of History of Parliament ID (P1614), I have "identifier on the History of Parliament website". According to that definition we have an online database. Snipre (talk) 21:01, 24 August 2016 (UTC)
Fair point - the difference here is a bit hazy :-). I wouldn't call it "instance of: database", though - and according to the help page that's a key element. Andrew Gray (talk) 21:17, 24 August 2016 (UTC)
@Andrew Gray: The first thing is to clarify the status of The History of Parliament (Q7739799): an item can't be at the same time a project and a reference work. Then, if we consider that History of Parliament ID (P1614) is the identifier of the online version of The History of Parliament defined as a reference work, we can follow the recommendations of Help:Sources#Databases. Here we have to better define what we want to link, because The History of Parliament is now the same denomination for 3 things: a project (which is an organization), a reference work (published mainly as books) and a website built as a database. Theoretically we should have 3 items, one for each of these 3 different concepts. Snipre (talk) 21:33, 24 August 2016 (UTC)
Hmm. I agree that "project" and "work" is a little confusing, but it's good enough for the moment (there is a project, it produces a work). We can fall back on only "work", though, if preferred. However, it's not a database nor is it built as a database - unless we define everything online as a database! I'm taking the information from there but I'm transcribing it by hand and then uploading with a script. Perhaps I shouldn't have used the word "imported"... Andrew Gray (talk) 21:43, 24 August 2016 (UTC)
@Andrew Gray: No, it's not good enough, because if you try to expand the information of The History of Parliament (Q7739799) just a little, we will have problems when adding specific properties. Can a project have an author (P50) property? Can a reference work have an inception (P571) property? If we follow your reasoning "there is a project, it produces a work", all data present in WD should be stored in the entity (Q35120) item, as everything is an entity. We have to create enough items to identify the concepts correctly in order to avoid the current problem of using the same item for different purposes.
Then what is a database ?
* Systematically organized or structured repository of indexed information (usually as a group of linked data files) that allows easy retrieval, updating, analysis, and output of data
* A comprehensive collection of related data organized for convenient access, generally in a computer
* A collection of pieces of information that is organized and used on a computer
So the website can be considered as a database because one part of it at least is composed of structured information accessible by an automatic query. Snipre (talk) 07:56, 25 August 2016 (UTC)
By this definition, so can literally any website with information on it in an organised form. Wikipedia is a 'database'. I don't think it's a very helpful way of thinking about things, and to be honest I think the arbitrary distinction between 'website' and 'database' made by the help page just causes more confusion. Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
If retrieved (P813) is intended to be added too, I'd say using the ID property for sourcing would be wrong, because the date won't be true if the formatter URL changes (?). Strakhov (talk) 23:06, 24 August 2016 (UTC)
@Strakhov: You have retrieved (and verified) the information stored in the statement at the given date rather than retrieved a particular URL. I can see your point, but I don’t think we should worry about this one. —MisterSynergy (talk) 08:39, 25 August 2016 (UTC)
Just food for thought. Not even sure if it was of the nutritive kind when I wrote it. :) Strakhov (talk) 11:43, 25 August 2016 (UTC)
In this case, the URL and ID property are fairly interchangeable anyway - the ID is a URL slug :-) Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
From what I can see, the use of properties like History of Parliament ID (P1614) in the references makes stated in (P248) redundant! In the page of P1614, there is a subject item of this property (P1629)-claim that links to The History of Parliament (Q7739799). That chain of relations is probably enough to describe this. In fact, we have used that relation on svwiki, when we decipher the references here at Wikidata. See for example note 1 at sv:Adelaide av Bourbon-Orléans. The use of both P248 and P227 here now gives two links to "Gemeinsame Normdatei"; one of them is obviously redundant here, and I think it is P248! -- Innocent bystander (talk) 07:39, 25 August 2016 (UTC)
@Innocent bystander: No, stated in (P248) is not redundant, because it gives you directly the item where you can find additional properties if needed. Without stated in (P248) you have to first retrieve the item connected with the property, and only then can you retrieve the data you want. If you take the time to look at the current templates used in the different WPs to cite sources, you can see that most of them require much more data than is available in the sources section below a statement. So it is better to provide the most closely related items of a source from the beginning in order to reduce queries. Snipre (talk) 14:40, 25 August 2016 (UTC)
  • Pinging @Carcharoth:. He was interested in this stuff two weeks ago. Strakhov (talk) 21:14, 24 August 2016 (UTC)
    • Sorry, no idea what I can usefully say! Am following it with interest, though. Carcharoth (talk) 22:10, 24 August 2016 (UTC)


Demonstrating the options - all on an item which already has History of Parliament ID (P1614):1790-1820/member/thompson-thomas-i-1767-1818

Just P248 - item link to work
Just P854 - reference URL
P248 & P854 - item link and reference URL
P248 & P1614 - item link and property with identifier
According to Help:Sources, for a "database"

This structure avoids any change of the source structure in case of URL modification (if the URL changes, you just have to correct the URL in the property History of Parliament ID (P1614)) and allows a nice link where the long URL can be hidden below the title value when using the source in Wikipedia (everyone prefers to see something like that Thomas Thompson instead of that ).
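The four options demonstrated above, and the argument about URL changes, can be sketched as plain dictionaries. The item ID and the identifier come from the thread. The formatter URL is a placeholder on example.org rather than the real one stored on P1614, and `resolve_url` is a made-up helper showing how a consumer could derive a link.

```python
HOP_ITEM = "Q7739799"  # The History of Parliament
HOP_ID = "1790-1820/member/thompson-thomas-i-1767-1818"
FORMATTER = "https://example.org/volume/$1"  # placeholder, not the real URL

OPTIONS = {
    "P248 only": {"P248": HOP_ITEM},
    "P854 only": {"P854": FORMATTER.replace("$1", HOP_ID)},
    "P248 + P854": {"P248": HOP_ITEM,
                    "P854": FORMATTER.replace("$1", HOP_ID)},
    "P248 + P1614": {"P248": HOP_ITEM, "P1614": HOP_ID},
}


def resolve_url(reference, formatter=FORMATTER):
    """Return a link for a reference: stored URL first, else built from the ID."""
    if "P854" in reference:
        return reference["P854"]       # frozen at the time of editing
    if "P1614" in reference:
        return formatter.replace("$1", reference["P1614"])
    return None                        # item link only, no direct URL


# If the site moves, only the formatter needs updating for the ID-based form.
moved = resolve_url(OPTIONS["P248 + P1614"], "https://new.example.org/$1")
stale = resolve_url(OPTIONS["P248 + P854"], "https://new.example.org/$1")
```

Under these assumptions `moved` follows the new URL scheme while `stale` still points at the old address, which is the maintenance difference being argued about.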

I'm reluctant to use P1476 here because it would involve a lot more effort (I'd have to call up and scrape a few thousand pages to get the title phrase for each statement), but I accept that's not a very good argument against it ;-) Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
A year ago I wondered why a title (P1476) is at all necessary, since one could simply use the item’s label as a reasonable title for instance. However, User:Snipre came up with a convincing argument on Help talk:Sources: there are “Bonnie and Clyde problem”-like situations with Wikidata items and external database entries that do not have a 1:1 equivalence, and thus a Wikidata label is not necessarily identical to the title of a corresponding database entry. This does indeed require extra effort, but I think it is worth doing this work. —MisterSynergy (talk) 20:54, 25 August 2016 (UTC)
At some point I'm going to scrape all of these and do a lot of processing for a different part of the project. I might leave off doing P1476 for now and then go back and add it to all the relevant qualifiers in one go when I'm processing the pages anyway - that would simplify the item-creation work for now and allow me to get that part completed. (I still have ~11000 to go!). Andrew Gray (talk) 17:36, 26 August 2016 (UTC)

Follow-up questions

This is somehow related to the above discussion, thus I put it here. The structure of references is important to give data users (e.g. in Wikipedias) the ability to easily generate references based on our data. They (a) need to find all relevant information for a valid reference (e.g. in en:Template:Cite web), and (b) expect to find that information in always the same structure (i.e. our references are always composed of the same properties). However, we have a couple of different reference structures defined in Help:Sources (“Database”, “Webpage”, “Book”, etc.) and I have two follow-up questions:

  • How can one see which reference structure is actually used in a particular case? Take all properties and decide whether they form something useful?
  • Is there any technique known to query “incomplete” sources here at Wikidata, e.g. by using the Wikidata Query Service? This would be useful for reference maintenance.

I would be happy to hear about your ideas. Thanks, MisterSynergy (talk) 08:51, 25 August 2016 (UTC)
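One possible starting point for both questions is a purely local check over the property IDs found in a reference. The pattern names and rules below are assumptions loosely modelled on the options discussed in this thread, not the definitions in Help:Sources, and `check_reference` is a hypothetical helper.

```python
from collections import Counter

# Identifier properties mentioned in this thread; a real list would be longer.
ID_PROPERTIES = frozenset({"P1614", "P227"})


def check_reference(property_ids, id_properties=ID_PROPERTIES):
    """Guess which reference pattern is in use and list obvious gaps."""
    counts = Counter(property_ids)
    has_item = counts["P248"] > 0           # stated in
    has_url = counts["P854"] > 0            # reference URL
    has_id = any(p in id_properties for p in counts)

    if has_item and has_id:
        pattern = "database-style"
    elif has_url:
        pattern = "webpage-style"
    elif has_item:
        pattern = "item-only"
    else:
        pattern = "unrecognised"

    issues = []
    if counts["P854"] > 1:
        issues.append("multiple reference URLs")
    if (has_url or has_id) and counts["P813"] == 0:
        issues.append("no retrieved date")
    return pattern, issues


complete = check_reference(["P248", "P1614", "P813"])
messy = check_reference(["P854", "P854"])
```

A maintenance list could then be built from every reference whose issue list is not empty.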

  • One large problem with adapting Wikidata references to templates like "Cite web" is that we technically allow 17 reference URL (P854) inside the same reference here. (Nothing is technically stopping that option.) And no template on any Wikipedia is adapted to that. The template we use on svwiki in the article sv:Adelaide av Bourbon-Orléans (which I mention above) is adapted to any number of reference URLs. The references will look very strange, but they are fully readable, all 17 of them. This template is not based on any present template; it was instead adapted to the open framework we have here on Wikidata. It has many flaws, but it is a start. -- Innocent bystander (talk) 10:33, 25 August 2016 (UTC)
Interesting. Are there any use cases for multiple reference URL (P854) (or identifier statements) in a single reference? Can’t we define those references as “technically invalid references” and make them show up on maintenance lists? Who picked the number of 17 and why? —MisterSynergy (talk) 10:47, 25 August 2016 (UTC)
I do not know if multiple P854 is a big issue, and it should probably be regarded as something that has to be maintained. I have at least seen references with two valid URLs. My point is that unexpected use of properties is to be expected in our references. It is a good intention to maintain those, but my experience from Wikipedia and from our constraints lists here at WD is that these maintenance lists tend to get longer and longer over time. We have to take into consideration that we will never fix them all. One of the most common uses of references here is the case where a URL is all that is found in the reference. If the webpage is in Armenian or the webpage is dead, very, very few of us can set a "title" for such a reference. -- Innocent bystander (talk) 12:50, 25 August 2016 (UTC)
unexpected use of properties in references — I really feel uncomfortable with such a guideline, although I admit that it might be the best we can reasonably achieve here. I’m from dewiki, whose community is extremely skeptical about Wikidata, and at the moment they say: “There are no references at Wikidata, so we don’t use it!” Once we’ll get that right I see them saying: “Wikidata references are useless and messy, so we don’t use it!” —MisterSynergy (talk) 13:04, 25 August 2016 (UTC)
@MisterSynergy: Frankly speaking I can say that the use of data from WD by WPs is often the last problem of several contributors in WD. Just think about the continuing import of data from WP into WD without original sources or the different games which help to add data to WD but without adding the source too. Snipre (talk) 14:44, 25 August 2016 (UTC)

The closing of RFC without a summation is not a good practice

I find it unhelpful, and both a little pointless and disappointing, that some of the Wikidata:Requests for comment are simply closed as "no consensus" without a summation of the viewpoints. Where a conversation has taken place and someone is moved to close it, I see true value in the closure explaining the alternate viewpoints, and how the person closing the discussion has determined the position of the community. People have invested in putting their points of view, and if someone cannot take the effort to summarise, then what are they doing closing the discussion? An open discussion is not problematic, it just looks untidy to some.  — billinghurst sDrewth 23:49, 24 August 2016 (UTC)

That is why a lot of people fear closing RfCs; as a result there have been RfCs open for more than one year. I think that there is no benefit from an RfC being open so long, as the situation may change over time. So I support timely closing of RfCs, even with a no-consensus result (and without a summation of the viewpoints). --Jklamo (talk) 07:24, 26 August 2016 (UTC)

Translation of Wikidata weekly summary

Hi, even though this discussion is about translation, I prefer to raise it here rather than at Wikidata:Translators' noticeboard. So, if Matěj Suchánek does not read here, he could be interested. My question is rather simple: as Wikidata is a multi-language project, would it be possible to translate the Wikidata weekly summary before it is published, so that users can read it in their native language? I know that TomT0m does this work on French Wikipedia a posteriori, so I wonder whether it would be possible, first technically, to translate this news a priori. If so, would it be possible for users registered here to receive this summary in their preferred language if a translation exists in that language? Pamputt (talk) 18:50, 25 August 2016 (UTC)

I know the parsoid/wikitext team does a call for translation before publishing their status update, but I don't know how they dispatch it. author  TomT0m / talk page 18:53, 25 August 2016 (UTC)
Do you know the page where we can find this call? Pamputt (talk) 19:06, 25 August 2016 (UTC)
Received it in my mailbox for some reason. After a bit of digging, it's the visual editor team, not parsoid, and the mail can be found on the internet : author  TomT0m / talk page 19:12, 25 August 2016 (UTC)
(edit conflict) We could re-implement the logic of Tech News distribution. But if we really do, we need to overhaul the way the weekly summary is composed at first. Perhaps linking from the English weekly summary which gets distributed to a translated version would be sufficient.
Note that we were able to translate several status updates in 2014. Matěj Suchánek (talk) 19:14, 25 August 2016 (UTC)

One more piece of information: what I am asking for already exists for Tech news. Pamputt (talk) 19:07, 25 August 2016 (UTC)

Special:Translate exists on this wiki, so it is technically possible to have the newsletter in multiple languages, and delivered in multiple languages, as you have seen with Tech News. The issue is always going to be timeliness of the production, the translation, and then the delivery.  — billinghurst sDrewth 07:12, 26 August 2016 (UTC)
And the wasted time that could be used for translating help pages and other stuff. Sjoerd de Bruin (talk) 07:47, 26 August 2016 (UTC)

Thanks for your feedback on this topic. I'm currently working on improving the Weekly Summary, from the content to the technical issues, and I'll be glad to hear your feedback and ideas about this :) Lea Lacroix (WMDE) (talk) 08:57, 26 August 2016 (UTC)

This is not necessary, imho. --Molarus 09:40, 26 August 2016 (UTC)

@Molarus: I did not get what you think is not necessary. If you mean the translation of the weekly summary, I think we are not the right people to judge this question, since we understand English. If people want to translate the news then it is only a technical matter. Pamputt (talk) 23:12, 28 August 2016 (UTC)

Extracting properties from Wikivoyage

Wikivoyage, the wiki travel guide, has recently begun linking to the Wikidata item of every museum/park/hotel/etc (for those having a Wikidata item).

Each item has very detailed information, and often the Wikidata item has almost no information. So: Anyone willing to create a bot that copies information from Wikivoyage to Wikidata? :-)

Example: Grand Hotel Beijing (Q10902432) only had a single statement when I linked to it, whereas the Wikivoyage item has many more details:

| name=Grand Hotel Beijing
| alt=北京贵宾楼饭店; Běijīng Guìbīnlóu Fàndiàn
| url=
| email=
| address=35 East Chang'an Street (东长安街35号; Dōngchángānjiē)
| lat=39.90743
| long=116.40279
| directions=Two blocks E of Tiananmen Square
| phone=+86 10 6513 7788
| tollfree=
| fax=+86 10 6513 0049
| checkin=
| checkout=
| price=Listed rates for doubles ¥3,450-14,950, discounted rates ¥765-10,500, breakfast ¥184
| content=Five-star hotel located in a traditional building in a small street overlooking the Forbidden City. Rooms with free internet except for the cheapest ones. The rooms are 32-66m2 except for the very most expensive, which is more than 100sqm. Business center, gift shop, ticket office, fitness, pool and sauna available. Chinese and Western restaurants as well as coffee shop, bar and room service.

What Wikidata could reuse: English name, country, name in the local language of the country, website, address, coordinates, phone/email/fax.

Conveniently, the Wikivoyage items and their properties are available for download as a nice big CSV file. Cheers! Syced (talk) 14:34, 26 August 2016 (UTC)

If you already have it in CSV, you can use QuickStatements, I think, to get started. --Izno (talk) 15:30, 26 August 2016 (UTC)
You may also have a look at HarvestTemplates. --Pasleim (talk) 16:10, 26 August 2016 (UTC)
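As a rough illustration of the QuickStatements route, the listing fields quoted above can be parsed and turned into tab-separated rows. Only coordinate location (P625) and phone number (P1329) are mapped here. The row syntax (coordinates as `@LAT/LON`, strings in double quotes) is my understanding of the QuickStatements text format and should be checked against its help page before use. The function names are invented, and phone values may need normalising before import.

```python
LISTING = """| name=Grand Hotel Beijing
| url=
| lat=39.90743
| long=116.40279
| phone=+86 10 6513 7788
| fax=+86 10 6513 0049"""


def parse_listing(text):
    """Turn '| key=value' template lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue
        key, _, value = line[1:].partition("=")
        fields[key.strip()] = value.strip()
    return fields


def to_quickstatements(qid, fields):
    """Emit one tab-separated row per non-empty field that is mapped."""
    rows = []
    if fields.get("lat") and fields.get("long"):
        coords = "@" + fields["lat"] + "/" + fields["long"]
        rows.append("\t".join([qid, "P625", coords]))
    if fields.get("phone"):
        rows.append("\t".join([qid, "P1329", '"' + fields["phone"] + '"']))
    return rows


fields = parse_listing(LISTING)
rows = to_quickstatements("Q10902432", fields)
```

Empty fields such as `url=` are skipped, so no blank statements are generated.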

Weird connection

I have commons:File:Commuters, who have just come off the train 1a33849v.jpg and commons:File:Commuters, who have just come off the train 1a33849v (cropped).jpg watchlisted on Commons. Every time an edit is made to Q61 (Washington, D.C., which has nothing to do with the files), it shows up on my Commons watchlist as being connected to these two files. I can't figure out where the connection was made, why it was made, or how to remove it. Thanks, Pi.1415926535 (talk) 21:35, 26 August 2016 (UTC)

@Pi.1415926535: Both of them have the template "Institucion:Library of Congress", whose location is Washington (Q61). Strakhov (talk) 21:38, 26 August 2016 (UTC)
Here you can uncheck Show Wikidata edits in your watchlist. "All or nothing", I guess. Strakhov (talk) 21:40, 26 August 2016 (UTC)
So changes to a wikidata item linked in a creator template that is used in those file pages show up on a watchlist under those files? That doesn't seem to make sense - there is not a direct link made anywhere from the wikidata item to the files. If I change a template on commons, that change doesn't show up in my watchlist under all the files that use the template. Perhaps I'm just ignorant of how wikidata works in this regard. Pi.1415926535 (talk) 21:51, 26 August 2016 (UTC)

Should Wikidata link to NSFW websites?

I clicked on "Random Primary Sources item" and the Primary Sources tool directed me to Puma Swede (Q1069901). The Primary Sources tool suggests links to PornHub and xvideos. Should those websites be recommended as sources by the Primary Sources tool? ChristianKl (talk) 19:13, 27 August 2016 (UTC)

Regardless of the fact that they are NSFW (I think that's not that important), they are not by any means reliable sources, as most of their content is probably user-generated or user-uploaded porn videos (there are "log-in/create account" buttons on both sites, and videos are uploaded by user accounts). Consequently, a lot of the videos would be copyright violations. Using sources like these in "BLP" Q's is not ideal either. I'd remove both pages from the tool. Also, they seem useless for sourcing statements even in porn Q's. Strakhov (talk) 21:31, 27 August 2016 (UTC)
There is a backlist at Wikidata:Primary sources tool/URL blacklist that these should probably be added to. --Lydia Pintscher (WMDE) (talk) 09:38, 28 August 2016 (UTC)
Okay, I added them to the list. ChristianKl (talk) 10:38, 28 August 2016 (UTC)
@Lydia Pintscher (WMDE): "backlist" or "blacklist"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:48, 28 August 2016 (UTC)
To answer the question in your subject heading: yes, per en:WP:NOTCENSORED (which we would do well to adopt, or at least adapt, here). Furthermore, NSFW where? USA? Singapore? China? Iran? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:48, 28 August 2016 (UTC)
Something being Not Safe For Work does not depend that much on the country, but on the company someone is working in. :) Strakhov (talk) 20:19, 28 August 2016 (UTC)

Allowing Wikipedias to make a choice to only import statements with non-Wikipedia references

As far as I understand, many Wikipedias decide against importing data from Wikidata because too much data in Wikidata is without references. Have we considered a setting whereby a template that automatically imports data can choose to only import data when there's a non-Wikipedia reference for a claim? ChristianKl (talk) 19:25, 27 August 2016 (UTC)

English Wikipedia has worked on a concept for that. --Izno (talk) 19:47, 27 August 2016 (UTC)
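
To make the idea above concrete, here is a minimal sketch of such a filter, working on statements in the shape of Wikidata's JSON export (each statement may carry a `references` list, and each reference has a `snaks` mapping keyed by property). It assumes that a "Wikipedia reference" is one that uses imported from Wikimedia project (P143); that rule, the function names and the sample data are illustrative assumptions, not the concept English Wikipedia worked on.

```python
# Assumption: references built on P143 ("imported from Wikimedia project")
# count as Wikipedia references and are not enough on their own.
WIKIMEDIA_REFERENCE_PROPERTIES = {"P143"}


def has_external_reference(statement):
    """True if at least one reference does not rely on a Wikimedia import."""
    for reference in statement.get("references", []):
        snak_properties = set(reference.get("snaks", {}))
        if snak_properties and not (snak_properties & WIKIMEDIA_REFERENCE_PROPERTIES):
            return True
    return False


def importable_statements(statements):
    """Keep only the statements a cautious template would import."""
    return [s for s in statements if has_external_reference(s)]


# Made-up sample: one Wikipedia-only reference, one reference URL (P854),
# and one statement without any reference.
sample = [
    {"id": "a", "references": [{"snaks": {"P143": [{}]}}]},
    {"id": "b", "references": [{"snaks": {"P854": [{}]}}]},
    {"id": "c"},
]
kept_ids = [s["id"] for s in importable_statements(sample)]
print(kept_ids)
```

Under these assumptions only statement "b" survives; unreferenced statements are dropped along with those sourced only to a Wikimedia project.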

Risperidone versus placebo for schizophrenia

For whatever reason some people have added substances as a "medicine" to Wikidata. They have added substances that are in a database of approved substances into Wikidata and it gives these substances some legitimacy. Risperidone versus placebo for schizophrenia is a Cochrane publication that throws a lot of cold water on one of these substances.

The questions I pose are:

  • Should we include substances registered as a medicine? The point is that registration proves little about efficacy, even when compared to a placebo. For this substance Cochrane says: "The margin of improvement chosen by the researchers as their outcome may not be clinically meaningful."
  • When we do, how do we link to relevant literature? It is NOT a reference, because the fact that it IS registered is all the current reference says. Cochrane provides the information about efficacy, and that is utterly relevant and different.

My proposal is to remove all substances that are marked as medicinal. Thanks, GerardM (talk) 06:59, 28 August 2016 (UTC)

I would oppose wholesale removal. If something is registered as a medicine we should, per NPOV, record that it is so registered regardless of any disputes about whether it should be. We have statement disputed by (P1310) which could probably be used to indicate that someone disputes that it is effective (perhaps as a qualifier to use (P366)?). Thryduulf (talk) 08:46, 28 August 2016 (UTC)
When a substance is harmful and we state that it is a medicine, we take on a responsibility. This is why we are careful with living people as well. NPOV is of a lesser relevance than the harm that it does.
The US Air Force shot down one of their own helicopters in Iraq partly because the US Air Force doesn't consider helicopters to be aircraft. If someone takes Risperidone and dies from side effects that aren't responded to because you said that Risperidone isn't a drug, that's great harm to living people. ChristianKl (talk) 10:19, 28 August 2016 (UTC)
On we use a disclaimer in the page about medicine. More or less: « The information is not medical advice and may not be accurate. The contents are only for illustrative purposes and do not replace medical advice » with a link to Wikipedia:Medical disclaimer (Q10640396)--ValterVB (talk) 10:43, 28 August 2016 (UTC)
Could you give an example of the items that you are talking about? ChristianKl (talk) 09:27, 28 August 2016 (UTC)
Read the title of this post and read the associated article at Cochrane. NB this is just a random substance. Thanks, GerardM (talk) 09:49, 28 August 2016 (UTC)
The word Medicine doesn't appear on Risperidone (Q412443). What appears is pharmaceutical drug (Q12140) which is about whether a substance is intended to be used to cure but which isn't related to whether the substance cures in practice. ChristianKl (talk) 10:07, 28 August 2016 (UTC)
This substance does not cure. Not at all, there is no claim to that effect. Thanks, GerardM (talk) 11:06, 28 August 2016 (UTC)
There's an FDA approval for the drug for a certain usage. As such it's used by some doctors with the intent to cure. The fact that it might not actually cure patients doesn't imply that no doctor uses it for that intent. ChristianKl (talk) 13:15, 28 August 2016 (UTC)

Before we start a big discussion that may lead nowhere, I'd like us to get some points straight, like "what exactly is a medicine". We have to get our definitions straight before any discussion on such a subject. author  TomT0m / talk page 09:31, 28 August 2016 (UTC)

Really? A medicine is a substance prescribed by a doctor for medicinal purposes. So when a doctor prescribes substances banned in another country as a drug, it is still a medicine per the definition. Thanks, GerardM (talk) 09:52, 28 August 2016 (UTC)
In that case it doesn't matter whether Cochrane says it's effective. The fact that Cochrane speaks about it in the first place is an indication that some doctor somewhere uses it for medical purposes. Cochrane also explicitly calls it a drug: "Risperidone is one among the atypical or new generation of drugs." ChristianKl (talk) 10:10, 28 August 2016 (UTC)
The question is should we have such information and when we do, how do we signal prominently that the efficacy of a drug is very much in doubt. Thanks, GerardM (talk) 11:20, 28 August 2016 (UTC)
Having good information about substances is valuable to a variety of bioinformatics applications. Data about whether a substance is a pharmaceutical drug is useful data. Having data about the conditions for which a drug is used is useful.
Currently it seems like we don't have a Wikidata property that speaks about clinical effectiveness. It would be useful to have such a property but as far as I see there's currently no large database about clinical effectiveness that we could import. That problem isn't easily solvable. It might be solvable in a few years if Cochrane provides their data in a form that can be easily imported.
The data that's currently listed at "medical condition treated" seems to be based on the drug having successfully passed clinical trials. It's data from ChEMBL that has a property for "Max phase for indication". That property could be translated into a qualifier for Wikidata.
Apart from that the definition for both "medical condition treated" and "pharmaceutical drug" could be edited to explicitly say that they aren't making statements about clinical effectiveness. ChristianKl (talk) 13:15, 28 August 2016 (UTC)
@GerardM: Yes, really. Don't forget that we are in a concept-based and multilingual environment; it's crucial that everyone agrees on what is going on and what the definitions of the concepts we use are. Take en:Medicine for example: this word includes a lot more than just active substances. Is a cure a medicine? I'd guess yes. Are we actually speaking of "medicinal substances" or "active molecules"? Or do you say a "medicine" is a prescription in general, like "run one hour a day" or "take this kind of pills", a set of actions a doctor takes to make you better? author  TomT0m / talk page 12:09, 28 August 2016 (UTC)
You then get to a situation where you mistake "active molecule" for a medicine and that is also problematic. Typically people expect that a doctor prescribes substances that are beneficial. When it is scientifically in doubt that they are, it should not be prescribed, it should not be a medicine and we should be careful what we say on matters that make people worse off. Thanks, GerardM (talk) 16:05, 28 August 2016 (UTC)
Sorry, but you're not really answering. I'm asking for a definition just to be sure we're all on the same page, not to be lectured about what should or should not be prescribed. So we need definitions, preferably scientific ones; you're probably right on this, on sorting out all those concepts. author  TomT0m / talk page 16:10, 28 August 2016 (UTC)
Don't forget also that the effectiveness of a treatment may change over time, as does what is prescribed for a given condition (the two are not always directly related) and we need to be able to record both currently used medicines and those that were previously used but no longer are. Also, what counts as a "medicine" and what substances may or may not be prescribed also vary temporally and geographically - e.g. it became legal to prescribe medical cannabis (Q1033379) in Colorado (Q1261) only in 2000, it remains illegal in Utah (Q829). Thryduulf (talk) 20:11, 28 August 2016 (UTC)
I'm not really happy with the status quo where for a pair like Provigil (brand name) and Modafinil (substance name) both are mixed up in the same item. I think it might be worth renaming pharmaceutical drug (Q12140) into "pharmaceutical substance" and using "pharmaceutical drug" as a label for items like Provigil that have a specific company that manufactures them. ChristianKl (talk) 10:13, 28 August 2016 (UTC)

qualifiers for P2046?

When I add area (P2046) from censuses, I always add "as of" as a qualifier, since the area often changes over time. I am aware that many other uses of P2046 do not need a timestamp. But I think it would be nice to see a mandatory use of this qualifier for administrative units and populated places. It is of course less useful in many other items.

It is at least worth discussing. What do the statistics look like for qualifiers for P2046 today? -- Innocent bystander (talk) 14:29, 28 August 2016 (UTC)
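
One way to get at such statistics, sketched under assumptions: given the area (P2046) statements of a set of items in the shape of Wikidata's JSON export (where a statement's `qualifiers` is a mapping keyed by property), count how often each qualifier property appears and how many statements have none. The function name and the sample data are made up for illustration; P585 is point in time, which is what "as of" refers to here.

```python
from collections import Counter


def qualifier_usage(statements):
    """Count qualifier properties across statements of one property.

    Statements without any qualifier are tallied under "(none)".
    """
    counts = Counter()
    for statement in statements:
        qualifiers = statement.get("qualifiers", {})
        if not qualifiers:
            counts["(none)"] += 1
        for prop in qualifiers:
            counts[prop] += 1
    return counts


# Made-up P2046 statements: two dated with P585, one of them also using
# "applies to part" (P518), and one without any qualifier.
sample = [
    {"qualifiers": {"P585": [{}]}},
    {"qualifiers": {"P585": [{}], "P518": [{}]}},
    {},
]
usage = qualifier_usage(sample)
print(usage.most_common())
```

A large "(none)" share among administrative units and populated places would show how far current usage is from a mandatory qualifier.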

Why not "begin date" and "end date" (or is "as of" just an alias for "date"?)? Or even some new items, if we consider that the name is not enough common information to qualify two places as identical and that places with different borders imply different places. Administrative units can be tricky: they are both an administration and a place. Some change in the law of the state can justify creating a new item (in the same spirit, a significant change in the role of the administration can justify considering it a new entity). author  TomT0m / talk page 14:46, 28 August 2016 (UTC)
For Swedish municipalities, the most common reason for changes in area is probably updated measurements. Digital land surveying may have been invented decades ago, but it is still being implemented. The borders may not technically have changed for decades, but they have not been very well measured. The changes in area at the end of the 19th century were probably even larger, since in some areas we didn't have very good maps at all, especially not in sparsely populated areas.
When it comes to urban areas, they are very special. They are defined "as of 31/12" every five years. That is what I have mostly worked with the last weeks.
Laws are often changed gradually, which makes it difficult to see what changes are "significant". In the 19th and the first half of the 20th century we had many kinds of municipalities in Q34. In 1952 that was changed; the same law was now applied to most kinds of municipalities. But it was not until 1971 that every municipality was renamed from "X City/Market town/Rural municipality" to "X Municipality".
What is most significant, that the law changed, or that the name changed? According to sv.wikipedia, it is the change of names. But that is maybe natural, since they normally split subjects based on their names. -- Innocent bystander (talk) 16:41, 28 August 2016 (UTC)

Inquire about "AWB"

Hello. Why is "AWB" unavailable here? It is very useful. Thank you --ديفيد عادل وهبة خليل 2 (talk) 15:17, 28 August 2016 (UTC)

a) What would you like to do with AWB here on Wikidata? Real use-cases, please, not something like "maintain Wikidata" or "make some edits", which doesn't say anything about what you want to do. Wikidata works pretty differently from other wiki projects; we have specific tools for editing Wikidata. b) AWB works on "normal" wiki-pages, like this or talk pages. --Edgars2007 (talk) 15:24, 28 August 2016 (UTC)
Ask maintainers. It could be useful eg. on property talk pages but that's all. Matěj Suchánek (talk) 15:27, 28 August 2016 (UTC)
@ديفيد عادل وهبة خليل 2: because AWB is more aligned with a flat text (free text) area (primarily regex text replacement, or some decision trees for disambiguation), whereas Wikidata is a completely different data structure (see Wikidata:Glossary#Claims and statements). AWB would work fine in the flat namespaces like user, user talk, wikidata, but cannot do the data calls. It is simply the wrong tool by design for working in the Q: and P: namespaces. -- billinghurst sDrewth 22:15, 28 August 2016 (UTC)

HarvestTemplates issue

Sometimes, when I am using HarvestTemplates, I get the error "WQS query expired". What causes that, and is there any way to fix it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:29, 28 August 2016 (UTC)

Ask Pasleim, it's his tool. Mbch331 (talk) 19:00, 28 August 2016 (UTC)
It's quite obvious what causes that. HarvestTemplates makes several queries during loading, which may run for longer than 30 seconds and thus expire (this happens for properties with many values or classes with many subclasses). So this is rather a problem of the Query Service. Matěj Suchánek (talk) 19:05, 28 August 2016 (UTC)