Shortcut: WD:PC

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Also see status updates to keep up-to-date on important things around Wikidata.
Requests for deletions can be made here.
Merging instructions can be found here.

IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2016/08.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day.


Contribute to set a data quality framework for Wikidata

Dear Wikidata members, We are working on setting a data quality framework for Wikidata, as part of a research project carried out by members of the Web and Internet Science group of the University of Southampton.

Determining the quality of Wikidata is crucial for its future development. We believe that its community should have a primary role in defining what data quality means in Wikidata. Therefore, we would like to ask community members to contribute to our data quality framework draft by adding comments, suggestions, and concrete examples of quality issues on Wikidata.

The draft has been published as a Request for Comment and can be found at this address: Data quality framework for Wikidata
Many thanks,
--Alessandro Piscopo (talk) 08:44, 12 August 2016 (UTC)

Hey :) Just for everyone's info: Alessandro has been working with us in the office for the past 2 weeks and it'd be great if you could support him in his work. I believe it will be valuable for Wikidata. --Lydia Pintscher (WMDE) (talk) 16:02, 12 August 2016 (UTC)
@Alessandro Piscopo: What about bringing quality statements (like the 1.0 classification) to Wikidata? Because they are language-specific they could be done as badges.--Kopiersperre (talk) 19:00, 12 August 2016 (UTC)
That would definitely be interesting. We should first agree on what we mean by "quality", though. --Alessandro Piscopo (talk) 08:12, 16 August 2016 (UTC)

quality is in linking wiki links with wikidata statements

Hoi, attention to quality is good, but I think the basis of perceived quality lies in the occurrence of statements that describe links to other articles in Wikipedia. This allows for article-level activity, and work done in any language maps to work in all other languages. When we focus in this way on what Wikidata is supposed to do, most other quality considerations have a framework: the use that comes from being the data storage for Wikimedia projects. PS I blogged about this and welcome any arguments. Thanks, GerardM (talk) 12:21, 14 August 2016 (UTC)

@GerardM: interesting point of view. My doubt is: if Wikidata's quality should be considered in relation to what it can contribute to Wikipedia, don't you think that it may be limiting for the project? I think Wikidata might have much broader application than the mere support of Wikipedia. --Alessandro Piscopo (talk) 08:12, 16 August 2016 (UTC)
I see no reason why Wikidata should limit itself to being data storage for Wikipedia. If we have a recently discussed data set like >300.000 National Heritage buildings in the UK, most of that data isn't interesting for Wikipedia. On the other hand, Wikidata works on integrating itself with OpenStreetMap, and from that point it can be quite useful.
There are also instances when Wikidata can be directly valuable. I use it for example as a multilingual dictionary for anatomy when I create my Anki cards.
I don't think that it's useful to think in terms of articles when looking at Wikidata. If an item has three links it might not be complex enough to make an article but those three links can still be very valuable to understand the structure of the underlying subject. ChristianKl (talk) 12:00, 23 August 2016 (UTC)
You forget the point of Wikidata. Yes, it can be more, but the basis is that it supports Wikipedia and other projects. When it can bring substantial improvement in quality, both Wikidata and all the Wikipedias will benefit. This brings a practical and easily implementable difference that can be measured. It will bring us more contributors and this is imho more relevant than including external stuff. Thanks, GerardM (talk) 13:42, 23 August 2016 (UTC)
Wikipedia is certainly an important stakeholder but it's not the only stakeholder that matters. Getting GLAMs (Galleries, Libraries, Archives, and Museums) to release their data is, for example, also a very important project.
There are also various consumers of structured data besides Wikipedia. ChristianKl (talk) 15:48, 28 August 2016 (UTC)

New gadget to sort the statements on items

Hello everybody,

As the sorting of the statements on an item page has had issues for a while, I'm glad to announce that there's now a gadget for it! Gadget-statementSort.js sorts all the statements of an item, based on an ordered list of properties.

This gadget was created by Ladsgroup, using a previous script written by Soulkeeper. Thanks a lot for your work!

You can now enable this gadget in your preferences. If you have any questions about the gadget or if you want to suggest modifications to the properties list, don't hesitate to ask Ladsgroup or leave a message below.

Bests, Lea Lacroix (WMDE) (talk) 09:39, 16 August 2016 (UTC)
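
The ordering idea in this announcement can be illustrated with a short sketch. This is not the gadget's code (the gadget is JavaScript); it is a hypothetical Python illustration, the order list is invented for the example, and placing unlisted properties at the end is an assumption.

```python
def sort_statements(property_ids, ordered_list):
    """Return property_ids sorted by their position in ordered_list.

    Properties missing from ordered_list are placed after all listed
    ones and keep their relative order (an assumption of this sketch).
    """
    rank = {pid: i for i, pid in enumerate(ordered_list)}
    # sorted() is stable, so unlisted properties stay in their original order.
    return sorted(property_ids, key=lambda pid: rank.get(pid, len(ordered_list)))


# Hypothetical order list: instance of, sex or gender, date of birth.
ORDER = ["P31", "P21", "P569"]

if __name__ == "__main__":
    print(sort_statements(["P569", "P106", "P31", "P21"], ORDER))
```

A self-maintained list, as requested in the replies, would simply be a different value passed as ordered_list.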

Great gadget! @Ladsgroup: It would be nice if I could completely overwrite the default property list by a self-maintained list in a custom .js page in my userspace. Could you implement something like that? Thanks and regards MisterSynergy (talk) 12:37, 16 August 2016 (UTC)
I made phab:T143383 to keep track of it :) Best Amir (talk) 04:53, 19 August 2016 (UTC)
@Lea Lacroix (WMDE): Is this gadget restricted to special browsers, or what am I looking for? -- Innocent bystander (talk) 09:19, 20 August 2016 (UTC)
@Innocent bystander: No it should work in any browser and statements should be ordered the same way across all items when you enable it. --Lydia Pintscher (WMDE) (talk) 17:55, 25 August 2016 (UTC)

ItemDisambiguation limit at 100

The display limit for Special:ItemDisambiguation is set at 100. In cases of some ordinary names, e.g. John Campbell, this limit may be attained, or nearly so, for what is a reasonable request. In other words using such a page for normal disambiguation may be close to failing, and will fail as more items and aliases are added.

Could the number of hits be displayed? Could there be some fallback to a second page? It is highly desirable that this Special page should function as the global disambiguation equivalent for en:w:John Campbell, for example. Charles Matthews (talk) 06:12, 18 August 2016 (UTC)

As a work-around you can use this SPARQL query. --Edgars2007 (talk) 10:06, 18 August 2016 (UTC)
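
The linked query is not reproduced on this page. As a rough idea of what such a work-around might look like, here is a minimal sketch that builds a SPARQL query matching a name against labels and aliases. The function names and the LIMIT value are hypothetical, and it assumes the endpoint predeclares the rdfs: and skos: prefixes.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_name_query(name, lang="en", limit=500):
    """Build a query for items whose label or alias equals `name` exactly."""
    literal = f'"{escape_literal(name)}"@{lang}'
    return (
        "SELECT DISTINCT ?item WHERE {\n"
        f"  {{ ?item rdfs:label {literal} }}\n"
        "  UNION\n"
        f"  {{ ?item skos:altLabel {literal} }}\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_name_query("John Campbell"))
```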

Thanks. Charles Matthews (talk) 07:06, 19 August 2016 (UTC)

Using Reasonator for disambiguation is much easier and more informative. Try John Campbell. Thanks, GerardM (talk) 05:56, 26 August 2016 (UTC)

Unsourced and Wikipedia sourced P91 statements

What to do with unsourced and Wikipedia-sourced sexual orientation (P91) statements? This really troubles me. The property has been used 3613 times; 2824 of those uses don't contain sources. I don't know how many have Wikipedia as the source. Sjoerd de Bruin (talk) 21:28, 18 August 2016 (UTC)

613 have imported from (P143). Count. --Edgars2007 (talk) 21:39, 18 August 2016 (UTC)
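
The three groups being counted here (no reference at all, only an "imported from" reference, anything else) can be told apart in an item's JSON. The sketch below is one possible way to do it, not the query behind the "Count" link; the function name and the sample entity are made up for illustration, and it assumes the usual Wikibase JSON layout with a "references" list holding "snaks" keyed by property.

```python
def classify_statements(entity, prop="P91"):
    """Count statements of `prop` by how they are referenced."""
    counts = {"unsourced": 0, "wikipedia_only": 0, "sourced": 0}
    for statement in entity.get("claims", {}).get(prop, []):
        references = statement.get("references", [])
        if not references:
            counts["unsourced"] += 1
        elif all(set(ref.get("snaks", {})) == {"P143"} for ref in references):
            # Every reference consists solely of imported from (P143).
            counts["wikipedia_only"] += 1
        else:
            counts["sourced"] += 1
    return counts


# Made-up entity with one statement of each kind.
SAMPLE = {
    "claims": {
        "P91": [
            {"mainsnak": {}},
            {"mainsnak": {}, "references": [{"snaks": {"P143": [{}]}}]},
            {"mainsnak": {}, "references": [{"snaks": {"P248": [{}], "P813": [{}]}}]},
        ]
    }
}

if __name__ == "__main__":
    print(classify_statements(SAMPLE))
```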
Maybe there is the same problem for religion (P140) --ValterVB (talk) 06:44, 19 August 2016 (UTC)
If no source, delete. Snipre (talk) 09:39, 19 August 2016 (UTC)
No: If no source, find one and add it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 19 August 2016 (UTC)
We can't add sensitive data on a person without a source. If users add this data but don't add a source, they are making an error. I think that the data should be deleted. --ValterVB (talk) 11:08, 19 August 2016 (UTC)
If there is no source and the claim is plausible then look for a source. If you find a source, add it. If you don't find a source after looking and the claim is contentious or potentially so and/or your search was extensive and thorough then remove it. If the claim is not plausible, remove it. We really need a way of flagging the remaining cases where someone has done no or only a cursory search. Thryduulf (talk) 11:50, 19 August 2016 (UTC)
Nobody can force someone to look for sources, but the Wikimedia Foundation asks you to add sources on biographies of living people, so if there are no sources for these data they should be deleted. We can't keep sensitive data indefinitely without a source. --ValterVB (talk) 12:14, 19 August 2016 (UTC)
+1. WP requires sources and WD aims to provide sourced data, so if people don't want to play the game, their contributions have to be deleted as useless and potentially subject to conflict. Snipre (talk) 16:15, 19 August 2016 (UTC)
So by this logic all statements on items about living people without a reference should be deleted (maybe except external IDs). I see no point in singling out this specific property. If it has to go, all does, or we accept that statements need to be decided about one by one (like Thryduulf said). And—still by this logic—the possible automatic statement deletion would not concern items about people who have passed away (still, talking about all properties, not only P91). – Máté (talk) 16:31, 19 August 2016 (UTC)
Not all the data are problematic, but certainly the data regarding religion or sexuality are more delicate and must be sourced. --ValterVB (talk) 16:46, 19 August 2016 (UTC)
Well, I'd add at least age, gender, residence, ethnic group, birth name etc. to the list of potentially just as sensitive data as sexual orientation and religion are. – Máté (talk) 16:51, 19 August 2016 (UTC)
(edit conflict) I think it is possible to define different priorities for properties in terms of how required sources generally are. Something like religion is pretty high up in most cases (I don't think it vital that a Catholic cardinal has source for religion, a US presidential candidate on the other hand probably is), however a handedness (P552) statement is almost never going to be controversial and so I don't think we should remove them without having looked for sources. External identifiers are almost always going to be self-sourcing, so we can treat them as completely uncontroversial. I'd suggest levels:
  1. always required, will be deleted if a source is not provided within a short time of the statement being added (should only be used for a very few properties and almost never when used for deceased people);
  2. almost always required, will normally be deleted when applied to living people or recently deceased people if a source is not provided but exceptions are possible based on common sense, especially for deceased people. (more than level 1, but not too many)
  3. Should be provided, statements should be accompanied by a source but they will not be routinely deleted without consideration of the circumstances (this should be default for non-external id properties)
  4. low priority, statements should be accompanied by a source but they will not normally be deleted unless verification has failed or the statement is both implausible and applied to a living person (only things that will rarely be controversial should be at this level)
  5. self-sourcing, no independent source is required (probably only applies to external identifiers). Thryduulf (talk) 16:58, 19 August 2016 (UTC)
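
The levels proposed above could be written down roughly as follows. This is only a sketch of the proposal, not an agreed policy: the assignment of P91 and P140 to particular levels is hypothetical, the default level for unlisted properties follows the "should be default" remark, and the rules are deliberately simplified (for example, "within a short time" is not modelled).

```python
from enum import IntEnum


class SourcingLevel(IntEnum):
    ALWAYS_REQUIRED = 1
    ALMOST_ALWAYS_REQUIRED = 2
    SHOULD_BE_PROVIDED = 3
    LOW_PRIORITY = 4
    SELF_SOURCING = 5


# Hypothetical assignments, for illustration only.
LEVELS = {
    "P91": SourcingLevel.ALWAYS_REQUIRED,          # sexual orientation
    "P140": SourcingLevel.ALMOST_ALWAYS_REQUIRED,  # religion
    "P552": SourcingLevel.LOW_PRIORITY,            # handedness
}


def suggested_action(prop, has_source, living):
    """Return 'keep', 'delete' or 'review' for a single statement."""
    if has_source:
        return "keep"
    level = LEVELS.get(prop, SourcingLevel.SHOULD_BE_PROVIDED)
    if level == SourcingLevel.SELF_SOURCING:
        return "keep"
    if level == SourcingLevel.ALWAYS_REQUIRED:
        return "delete"
    if level == SourcingLevel.ALMOST_ALWAYS_REQUIRED and living:
        return "delete"
    # Everything else needs a human to look for a source first.
    return "review"


if __name__ == "__main__":
    print(suggested_action("P91", has_source=False, living=True))
    print(suggested_action("P552", has_source=False, living=True))
```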
we need a team to source statements. the people who periodically drive by and suggest statement deletion, are not collaborating and improving the data. "so if people don't want to play the game", they can take their deletion game elsewhere. there is no consensus for required references. Slowking4 (talk) 12:17, 21 August 2016 (UTC)
other self-sourcing statements are bibliographical data of Books or Texts, when the item is linked to the source (on wikisource for ex.). It seems very irrelevant to say that "Title" is so and so, and source on... the Book itself, which would be linked in the wikisource link... :) --Hsarrazin (talk) 18:45, 23 August 2016 (UTC)
From the last dump, claim without source:

I think that it is necessary to delete them --ValterVB (talk) 18:30, 26 August 2016 (UTC)

Iff they cannot be sourced then they should be deleted. But how about creating a list of items that need sources finding - in many cases I suspect that there will be a source in one or more of the attached Wikipedia articles that just hasn't been copied to here. Thryduulf (talk) 20:17, 26 August 2016 (UTC)

Meanwhile, an anonymous user keeps adding unreferenced claims for sensitive properties. Sjoerd de Bruin (talk) 17:18, 30 August 2016 (UTC)

Hierarchical data examples

I am trying to find out how hierarchical information is stored in Wikidata. Most of the Olympics Sports pages have a Tournament Draw or Bracket like this or this. I would like to add this "who played who, at which stage" info, but I am a bit confused whether such info gets stored. The docs say lists and infoboxes are the main focus, so is this something that will be handled later? If it's already being done, could someone please point me at a tutorial or examples for a beginner? Thanks! Quil1 (talk) 03:24, 20 August 2016 (UTC)

You could create an item:
__________________
Label: "W-j Kim vs. R Ega Agatha in round of 32 of the Archer at the 2016 Summer Olympics"
instance_of: "archery contest round at the Olympics"
participant: "W-j Kim"
participant: "R Ega Agatha"
is_part_of : "round 32 of the Archer at the 2016 Summer Olympics"
succeeds : "R Ega Agatha vs M Nespoli in round of 16 of the Archer at the 2016 Summer Olympics"
follows : "W-j Kim vs. G Sutherland in round of 64 of the Archer at the 2016 Summer Olympics"
follows : "R Ega Agatha vs. Y Xing in round of 64 of the Archer at the 2016 Summer Olympics"
________________
This follows/succeeds? pattern stores the data in a Wikidata friendly way. However at the moment there's no easy way to integrate such data into Wikipedia. It's also possible that there are more specific properties than the one I listed here. ChristianKl (talk) 09:48, 26 August 2016 (UTC)
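
To make the linking pattern concrete, here is a small sketch that models matches as records pointing back at the matches they follow, and walks those links to recover one player's path. It is an illustration only: the class, field and function names are hypothetical, only the backward "follows" links are modelled, and nothing here is an established Wikidata model for sports results.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    label: str
    participants: tuple
    part_of: str
    follows: list = field(default_factory=list)  # earlier matches feeding into this one


def path_of(player, last_match):
    """Walk back through 'follows' links, collecting the matches a player took part in."""
    if player not in last_match.participants:
        return []
    path = []
    current = last_match
    while current is not None:
        path.append(current.label)
        current = next((m for m in current.follows if player in m.participants), None)
    return list(reversed(path))


r64_a = Match("W-j Kim vs. G Sutherland", ("W-j Kim", "G Sutherland"), "round of 64")
r64_b = Match("R Ega Agatha vs. Y Xing", ("R Ega Agatha", "Y Xing"), "round of 64")
r32 = Match("W-j Kim vs. R Ega Agatha", ("W-j Kim", "R Ega Agatha"), "round of 32",
            follows=[r64_a, r64_b])

if __name__ == "__main__":
    print(path_of("R Ega Agatha", r32))
```

A query such as "what was a player's path through a tournament" then amounts to following these links from the player's last match.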
Hey, thanks for the reply. I have been wondering how to do this and was thinking along the lines of your suggestion. The whole thing does seem hard. And being new here, maybe I should just pick something simpler to start with. Ideally there needs to be an item (or a Wikipedia page) for each MATCH containing round/tournament/score/winner etc., with the round being a substitute for your follows/succeeds model. I am not sure, but I think it would automatically create the linkages. So I was looking around and I found Match Box Templates for many sports, e.g. [1] [2]. But not many sports have individual match pages (probably just too much data). That said, a lot of the data is found in these brackets and draws. So I am thinking I will pick a small tournament and start creating the items, slowly going through all the docs to learn how to do that. I think it would be great if through SPARQL one could run queries like "who played the semis of the 2015 Wimbledon" or "what was Federer's path through the US Open", as a lot of the info already exists in these pages. Quil1 (talk) 07:23, 27 August 2016 (UTC)
More a general comment: There is a Wikidata:WikiProject Sport results, but it is not very well developed. At this point we do not have an established concept of how to add sports results to Wikidata, so you either need to develop something by yourself, which might be superseded at some point in the future by a different approach, or you have to wait for a situation in which things are further developed by others. In my own experience there is still enough work to do on a much more basic level: make sure that sports persons, tournaments, organisations, venues, equipment, general items, etc. are properly defined, labelled, and linked to each other. Without such a robust data base it'd probably be difficult to model sports results anyway. Regards, MisterSynergy (talk) 07:33, 27 August 2016 (UTC)
I think that's what I was looking for, and it looks like someone has proposed matches. Will follow up with them to see what's going on. Thanks! Quil1 (talk) 08:41, 27 August 2016 (UTC)

New gadget: currentDate

Hello everyone,

A new gadget has been added to our collection: currentDate automatically fills in today's date when you use retrieved (P813). It seems a perfect candidate to be enabled by default; we only need consensus for it. Thanks to TMg for creating this gadget!

Greetings, Sjoerd de Bruin (talk) 08:18, 23 August 2016 (UTC)
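
For readers wondering what "filling in the date of today" amounts to in data terms, here is a minimal sketch of a day-precision time value for retrieved (P813). It is not the gadget's code; the function name is hypothetical, and the field layout is the Wikibase time value format as I understand it (precision 11 meaning "day", Q1985727 being the proleptic Gregorian calendar).

```python
from datetime import date

GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def retrieved_value(day=None):
    """Build a day-precision time value; defaults to today's date."""
    day = day or date.today()
    return {
        "time": f"+{day.isoformat()}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": 11,  # 11 = day
        "calendarmodel": GREGORIAN,
    }


if __name__ == "__main__":
    print(retrieved_value())
```

Passing an explicit date covers the "past date" case mentioned in the replies.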

Seems a very useful gadget. Support making it default. Thryduulf (talk) 09:31, 23 August 2016 (UTC)
Agree. Robevans123 (talk) 10:35, 23 August 2016 (UTC)
I often add a "past date" to this property, but I still approve this! -- Innocent bystander (talk) 11:56, 23 August 2016 (UTC)
JFDI applies. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:17, 23 August 2016 (UTC)
at last ! I've been waiting for this soooo long !
support making it default too ! --Hsarrazin (talk) 19:08, 23 August 2016 (UTC)
Support making this default. - PKM (talk) 23:52, 29 August 2016 (UTC)

"missing" items for series of ...

Within the realm of serial items, some items cannot currently be given correct claims for instance of (P31), simply because there is no existing class for them.

For example, we do have book series (Q277759) which is fine for books, but we do not have an item for say, a "series of events" which could have sub-classes such as "series of military campaigns".

Currently, we have a lot of incorrect claims along the lines of Battles of Khalkhin Gol (Q188925), which is incorrectly claimed as an instance of a battle, despite being a series of battles.

So, I plan to create some suitable items for this. If anyone has any input or suggestions then please do comment. Danrok (talk) 00:32, 24 August 2016 (UTC)

Battles of Khalkhin Gol (Q188925) is a series in en.wiki but a single battle in other wikis. Before changing P31 it is necessary to check and split the item. --ValterVB (talk) 06:49, 24 August 2016 (UTC)
Mmm, tough problem at first sight. My first thought is that "battles" and "series" do not always mix very well. Composition seems to be a better fit, if only because the battles of a war can overlap and sometimes have no obvious order of precedence. The order might in some way be related to a causal sequence: a battle may have been fought by one army because another one was lost by the same army. A war is composed of several battles, and maybe several "sub-wars"? A case study could be Hundred Years' War (Q12551), I guess. Is a war a case of conflict that begins with a declaration of war (Q334516) and ends with a peace treaty or in another way?
Maybe a war, as a consequence, cannot be composed of smaller wars, and the composition holds at some higher level, like an armed conflict that can be composed of other armed conflicts? Can a battle be composed of smaller battles?
A lot of questions and no answer, sorry /o\ author TomT0m / talk page 07:35, 24 August 2016 (UTC)
The definition of battle is "part of a war which is well defined in duration, area and force commitment". I don't see why a series of battles can't be a battle according to that definition. It's also worth noting that different cultures speak differently about the same event and there's no reason why the English version should have preference. ChristianKl (talk) 10:23, 24 August 2016 (UTC)
Try significant event (P793) --Succu (talk) 21:18, 24 August 2016 (UTC)
part of (P361) is your best tool here, I think.
"Battle" is a pretty broad term, and a "battle" can include other "battles" - consider Battle of the Somme (Q132568) (1 July - 17 November 1916), which began with Battle of Albert (Q1992231) (1 July - 13 July), then Battle of Bazentin Ridge (Q2634717) (14 July - 17 July), and so on. These can all be "instance of: battle" and part of (P361) of the larger battle. They may also contain smaller events, which might also be labelled battles, or P31:engagement (Q6680005) (though I don't think many pages use these yet).
In some cases, battles might also be P361 of a military campaign (Q831663), a connected series of battles - so Battle of Kvam (Q20112888) is P361:Operation Weserübung (Q150939) which is P361:Norwegian Campaign (Q5084679)... and in the grand scheme of things, P361:World War II (Q362). So you can use P361 to go all the way up and down the chain.
In the specific case of Battles of Khalkhin Gol (Q188925), I'd consider whether P31:military campaign (Q831663) is the best way to go - they all fitted together as part of an overall series of battles. Andrew Gray (talk) 21:35, 24 August 2016 (UTC)
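
The "all the way up the chain" idea can be sketched as a simple walk over part of (P361) links. The mapping below only contains the items named in this comment and assumes one parent per item, which is a simplification (real items can have several P361 values); the function name is hypothetical.

```python
# part of (P361) links taken from the example above.
PART_OF = {
    "Q20112888": "Q150939",  # Battle of Kvam -> Operation Weserübung
    "Q150939": "Q5084679",   # Operation Weserübung -> Norwegian Campaign
    "Q5084679": "Q362",      # Norwegian Campaign -> World War II
}


def chain_up(item, part_of):
    """Follow part-of links upward from `item`, stopping at the top or on a cycle."""
    chain = [item]
    seen = {item}
    while chain[-1] in part_of:
        parent = part_of[chain[-1]]
        if parent in seen:  # guard against circular data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


if __name__ == "__main__":
    print(chain_up("Q20112888", PART_OF))
```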
@Andrew Gray: military campaign (Q831663): Mmm, is this not related to the actions of only one opponent? A campaign is how one side organized its actions, imho. So it's in most cases not appropriate, as a battle is the sum of the actions of both sides. It can even be an NPOV problem if you mix the two concepts inappropriately. author TomT0m / talk page 06:18, 25 August 2016 (UTC)


I think I picked a bad example, battles and wars are complex to define. Plus, there's the language problem, and the different way things are defined in different languages. Danrok (talk) 21:40, 24 August 2016 (UTC)
Just make the definitions you use explicit. It's not exactly a problem specific to wars :) Wikidata should be definition based, not term based. author TomT0m / talk page 06:18, 25 August 2016 (UTC)

creating guidance on the process of importing data into Wikidata

Hi all

I've started a draft of guidance on importing data from external datasets into Wikidata. It's very rough at the moment, and I would very much appreciate some help.

Thanks

--John Cummings (talk) 12:27, 24 August 2016 (UTC)

Perhaps something regarding sourcing and references? Danrok (talk) 12:41, 25 August 2016 (UTC)

Wikipedia corpus to Wikidata

Extracting data from a corpus is a very promising and challenging problem. Many research teams around the world are working on it; examples include creating a module that diagnoses a disease by scanning a medical corpus, creating a voice assistant, and much more. With Wikipedia as the source corpus and Wikidata as the structured data of that corpus, the two projects offer a great environment in which to approach this problem.

Before getting into the details of this problem, I am interested to know how open is the Wikidata community to this problem? --GhassanMas (talk) 19:11, 24 August 2016 (UTC)

Wikidata imports a lot of information from Wikipedia already. In general Wikidata prefers to have data that cites sources from outside of Wikipedia.
Additionally there are the projects https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References and https://meta.wikimedia.org/wiki/Grants:Project/WikiFactMine that try to extract facts via machine learning and provide them to the PrimarySource tool, where Wikidata editors can approve or disapprove suggestions from the machine learning algorithms.
If you are an academic working on extracting facts from corpus data there's a good chance that you can work well with Wikidata, but it's worth first understanding the structure of Wikidata and how it plays with the Primary Source tool. ChristianKl (talk) 15:10, 25 August 2016 (UTC)
It is important to understand that the Primary Source tool is something that is not used for importing Wikidata and many other sources by the Wikidata community. Understanding this difference is vital. The biggest problem is not getting data into the PMS but finding people to consider the data in there. The statistics prove that the PMS is dysfunctional. Thanks, GerardM (talk) 05:54, 26 August 2016 (UTC)


"Not used" doesn't seem to be what the statistics say. It doesn't get used as much as desired, but it still gets used. Even if only a subset of the data is evaluated by humans, an academic group that provides data via the primary sources tool can expect to get some feedback on which claims get approved and rejected.
I consider the PMS a work in progress. With increased data there's a higher probability that when I browse an item the PMS will suggest a statement or reference that I find valuable to add.
More data also means that there are higher returns to improving the UI of the PMS and thus improving usage. ChristianKl (talk) 09:44, 26 August 2016 (UTC)
Seriously, when you consider the number of statements approved and the process whereby this happened, you would not say this. I regularly "approve" info from Freebase on the basis that it is likely correct; no verification happens, and consequently the statistics not only show little traffic, they also do not show that there is a meaningful process going on. Thanks GerardM (talk) 10:12, 27 August 2016 (UTC)
The numbers suggest that currently there are 180000 approved statements and 50000 disapproved ones. That means that both approving and disapproving happen. Given the amount of data in Wikidata I think you can plausibly argue that this isn't a lot of activity, but it isn't nothing. There's also work on making the UI better. ChristianKl (talk) 12:53, 29 August 2016 (UTC)
I add Freebase statements because I pity the effort that went into Freebase. I add a lot of them, and I also remove statements that became redundant. Never mind. The biggest issue you fail to address is the lack of process going on. Is there or is there not any meaningful validation in the PMS? For me that answer is obvious. Thanks, GerardM (talk) 17:17, 29 August 2016 (UTC)

Preferred sourcing for imported data already linked in identifiers?

I'm rewriting the script I use to generate items such as Thomas Thompson (Q26689403) to include sourcing. At the moment, every item has a History of Parliament ID (P1614) property, giving a clickable link to the main source. Given this, should I source the individual statements as stated in (P248):The History of Parliament (Q7739799), or as reference URL (P854):(the URL from P1614)? I'm leaning towards the reference URL (P854) approach but thought I'd better check which is preferred. Andrew Gray (talk) 19:50, 24 August 2016 (UTC)

ValterVB advised me to use reference URL (P854) in similar cases, but we'd like to hear other opinions. --Epìdosis 20:04, 24 August 2016 (UTC) I've read your message again: in these cases I've always used stated in (P248) + reference URL (P854); the properties ValterVB advised me never to use in references are identifiers (in this case History of Parliament ID (P1614)). --Epìdosis 20:08, 24 August 2016 (UTC)
per Help:Sources#Databases you would need to use stated in (P248):The History of Parliament (Q7739799) and History of Parliament ID (P1614) as source. --Pasleim (talk) 20:09, 24 August 2016 (UTC)
I didn't know it. Thank you, Pasleim. --Epìdosis 20:36, 24 August 2016 (UTC)
Interesting. Epìdosis's recommendation feels much more natural to me. The database approach seems to require that we tag The History of Parliament (Q7739799) as P31:database, which would be wrong - it's a reference work that we record identifiers for, not a "database" like PubChem. It fits a lot better with the website approach in the section above (P248/P854). Andrew Gray (talk) 20:49, 24 August 2016 (UTC)
@Andrew Gray: So what is The History of Parliament (Q7739799) ? Because when I read the label of History of Parliament ID (P1614), I have "identifier on the History of Parliament website". According to that definition we have an online database. Snipre (talk) 21:01, 24 August 2016 (UTC)
Fair point - the difference here is a bit hazy :-). I wouldn't call it "instance of: database", though - and according to the help page that's a key element. Andrew Gray (talk) 21:17, 24 August 2016 (UTC)
@Andrew Gray: The first thing is to clarify the status of The History of Parliament (Q7739799): an item can't be at the same time a project and a reference work. Then, if we consider that History of Parliament ID (P1614) is the identifier of the online version of The History of Parliament defined as a reference work, we can follow the recommendations of Help:Sources#Databases. Here we have to better define what we want to link, because The History of Parliament is now the same denomination for 3 things: a project (which is an organization), a reference work (published mainly as books) and a website built as a database. Theoretically we should have 3 items, each describing one of these 3 different concepts. Snipre (talk) 21:33, 24 August 2016 (UTC)
Hmm. I agree that "project" and "work" is a little confusing, but it's good enough for the moment (there is a project, it produces a work). We can fall back on only "work", though, if preferred. However, it's not a database nor is it built as a database - unless we define everything online as a database! I'm taking the information from there but I'm transcribing it by hand and then uploading with a script. Perhaps I shouldn't have used the word "imported"... Andrew Gray (talk) 21:43, 24 August 2016 (UTC)
@Andrew Gray: No, it's not good enough, because if you try to expand the information of The History of Parliament (Q7739799) just a little, we will have problems when adding specific properties. Can a project have an author (P50) property? Can a reference work have an inception (P571) property? If we follow your reasoning "there is a project, it produces a work", all data present in WD should be stored in the entity (Q35120) item, as everything is an entity. We have to create enough items to identify the concepts correctly in order to avoid the current problem of using the same item for different purposes.
Then what is a database ?
* Systematically organized or structured repository of indexed information (usually as a group of linked data files) that allows easy retrieval, updating, analysis, and output of data
* A comprehensive collection of related data organized for convenient access, generally in a computer
* A collection of pieces of information that is organized and used on a computer
So the website http://www.historyofparliamentonline.org can be considered as a database because one part of it at least is composed of structured information accessible by an automatic query. Snipre (talk) 07:56, 25 August 2016 (UTC)
By this definition, so can literally any website with information on it in an organised form. Wikipedia is a 'database'. I don't think it's a very helpful way of thinking about things, and to be honest I think the arbitrary distinction between 'website' and 'database' made by the help page just causes more confusion. Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
If retrieved (P813) is intended to be added too, I'd say using the ID property for sourcing would be wrong, because the date won't be true if the formatter URL changes (?). Strakhov (talk) 23:06, 24 August 2016 (UTC)
@Strakhov: You have retrieved (and verified) the information stored in the statement at the given date rather than retrieved a particular URL. I can see your point, but I don’t think we should worry about this one. —MisterSynergy (talk) 08:39, 25 August 2016 (UTC)
Just food for thought. Not even sure if it was of the nutritive kind when I wrote it. :) Strakhov (talk) 11:43, 25 August 2016 (UTC)
In this case, the URL and ID property are fairly interchangeable anyway - the ID is a URL slug :-) Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
From what I can see, the use of properties like History of Parliament ID (P1614) in the references makes stated in (P248) redundant! On the page of P1614, there is a subject item of this property (P1629) claim that links to The History of Parliament (Q7739799). That chain of relations is probably enough to describe this. In fact, we have used that relation on svwiki when we decipher the references here at Wikidata. See for example note 1 at sv:Adelaide av Bourbon-Orléans. The use of both P248 and P227 here now gives two links to "Gemeinsame Normdatei"; one of them is obviously redundant here, and I think it is P248! -- Innocent bystander (talk) 07:39, 25 August 2016 (UTC)
@Innocent bystander: No, stated in (P248) is not redundant, because it gives you directly the item where you can find additional properties if needed. Without stated in (P248) you first have to retrieve the item connected with the property, and only then can you retrieve the data you want. If you take the time to look at the current templates used in the different WPs to cite sources, you can see that most of them require much more data than is available in the sources section below a statement. So it is better to provide the items most closely related to a source from the beginning, in order to reduce queries. Snipre (talk) 14:40, 25 August 2016 (UTC)
  • Pinging @Carcharoth:. He was interested in this stuff two weeks ago. Strakhov (talk) 21:14, 24 August 2016 (UTC)
    • Sorry, no idea what I can usefully say! Am following it with interest, though. Carcharoth (talk) 22:10, 24 August 2016 (UTC)

Options

Demonstrating the options - all on an item which already has History of Parliament ID (P1614):1790-1820/member/thompson-thomas-i-1767-1818

Just P248 - item link to work
Just P854 - reference URL
P248 & P854 - item link and reference URL
P248 & P1614 - item link and property with identifier
According to Help:Sources, for a "database"

This structure avoids any change to the source structure if the URL is modified (if the URL changes, you just have to correct the URL stored in the property History of Parliament ID (P1614)) and allows a nice link where the long URL is hidden behind the title value when the source is used in Wikipedia (everyone prefers to see something like Thomas Thompson instead of http://www.historyofparliamentonline.org/volume/1790-1820/member/thompson-thomas-i-1767-1818 ).
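
A minimal sketch of the idea, in Python: the reference stores only the item and the identifier, and the full link is derived from a formatter URL at display time. The formatter URL below is an assumption inferred from the example link above (the authoritative value is whatever is stored on the P1614 property page), and the dict is a simplified model, not the real Wikibase JSON.

```python
# Assumed formatter URL for P1614, inferred from the example link above.
# The real value lives on the property page and may differ.
FORMATTER_URL = "http://www.historyofparliamentonline.org/volume/$1"


def build_reference(stated_in: str, id_property: str, identifier: str) -> dict:
    """Return a simplified model of a 'database' style reference.

    Illustration only: this is not the Wikibase JSON format.
    """
    return {"P248": stated_in, id_property: identifier}


def build_reference_url(formatter_url: str, identifier: str) -> str:
    """Substitute the identifier for the $1 placeholder of a formatter URL."""
    return formatter_url.replace("$1", identifier)


reference = build_reference(
    "Q7739799", "P1614", "1790-1820/member/thompson-thomas-i-1767-1818"
)
url = build_reference_url(FORMATTER_URL, reference["P1614"])
print(url)
```

If the website moves, only `FORMATTER_URL` needs to change; the stored references stay untouched.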

I'm reluctant to use P1476 here because it would involve a lot more effort (I'd have to call up and scrape a few thousand pages to get the title phrase for each statement), but I accept that's not a very good argument against it ;-) Andrew Gray (talk) 20:43, 25 August 2016 (UTC)
A year ago I wondered why a title (P1476) is at all necessary, since one could simply use the item’s label as a reasonable title for instance. However, User:Snipre came up with a convincing argument on Help talk:Sources: there are “Bonnie and Clyde problem”-like situations with Wikidata items and external database entries that do not have a 1:1 equivalence, and thus a Wikidata label is not necessarily identical to the title of a corresponding database entry. This does indeed require extra effort, but I think it is worth to do this work. —MisterSynergy (talk) 20:54, 25 August 2016 (UTC)
At some point I'm going to scrape all of these and do a lot of processing for a different part of the project. I might leave off doing P1476 for now and then go back and add it to all the relevant qualifiers in one go when I'm processing the pages anyway - that would simplify the item-creation work for now and allow me to get that part completed. (I still have ~11000 to go!). Andrew Gray (talk) 17:36, 26 August 2016 (UTC)

Follow-up questions

This is somehow related to the above discussion, thus I put it here. The structure of references is important to give data users (e.g. in Wikipedias) the ability to easily generate references based on our data. They (a) need to find all relevant information for a valid reference (e.g. in en:Template:Cite web), and (b) expect to find that information in always the same structure (i.e. our references are always composed of the same properties). However, we have a couple of different reference structures defined in Help:Sources (“Database”, “Webpage”, “Book”, etc.) and I have two follow-up questions:

  • How can one see which reference structure is actually used in a particular case? Take all properties and decide whether they form something useful?
  • Is there any technique known to query “incomplete” sources here at Wikidata, e.g. by using the Wikidata Query Service? This would be useful for reference maintenance.

I would be happy to hear about your ideas. Thanks, MisterSynergy (talk) 08:51, 25 August 2016 (UTC)
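
One possible approach to both questions, sketched in Python: look at which properties a reference contains, guess the intended structure from them, and report what is missing. The required-property sets below are assumptions made for illustration (Help:Sources is the authority), and the function name `classify_reference` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

STATED_IN = "P248"
REFERENCE_URL = "P854"
TITLE = "P1476"
RETRIEVED = "P813"

# Assumed minimum property sets per reference type (not taken from Help:Sources).
WEBPAGE_REQUIRED = {REFERENCE_URL, TITLE, RETRIEVED}
DATABASE_REQUIRED = {STATED_IN, RETRIEVED}


def classify_reference(
    counts: Dict[str, int], id_properties: Set[str]
) -> Tuple[str, List[str]]:
    """Guess the reference type and list its problems.

    counts maps each property ID in the reference to its number of values;
    id_properties is the set of external-identifier properties to recognise.
    """
    present = {prop for prop, n in counts.items() if n > 0}
    problems: List[str] = []
    if counts.get(REFERENCE_URL, 0) > 1:
        problems.append("multiple P854 values")
    if present & id_properties:
        kind, missing = "database", DATABASE_REQUIRED - present
    elif REFERENCE_URL in present:
        kind, missing = "webpage", WEBPAGE_REQUIRED - present
    else:
        kind, missing = "unknown", set()
    problems.extend(f"missing {prop}" for prop in sorted(missing))
    return kind, problems


print(classify_reference({"P248": 1, "P1614": 1}, {"P1614"}))
print(classify_reference({"P854": 2}, {"P1614"}))
```

Run over a dump or over query results, a check like this could feed a maintenance list of incomplete references.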

  • One large problem with adapting Wikidata references to templates like "Cite web" is that we technically allow 17 reference URL (P854) values inside the same reference here. (Nothing is technically stopping that option.) And no template on any Wikipedia is adapted to that. The template we use on svwiki in the article sv:Adelaide av Bourbon-Orléans (which I mention above) is adapted to any number of reference URLs. The references will look very strange, but they are fully readable, all 17 of them. This template is not based on any existing template; it was instead adapted to the open framework we have here on Wikidata. It has many flaws, but it is a start. -- Innocent bystander (talk) 10:33, 25 August 2016 (UTC)
Interesting. Are there any use cases for multiple reference URL (P854) (or identifier statements) in a single reference? Can’t we define those references as “technically invalid references” and make them show up on maintenance lists? Who picked the number of 17 and why? —MisterSynergy (talk) 10:47, 25 August 2016 (UTC)
I do not know if multiple P854 is a big issue, and it should probably be regarded as something that has to be maintained. I have at least seen references with two valid URLs. My point is that unexpected uses of properties are to be expected in our references. It is a good intention to maintain those, but my experience from Wikipedia and from our constraints lists here at WD is that these maintenance lists tend to get longer and longer over time. We have to take into consideration that we will never fix them all. One of the most common uses of references here is the case where a URL is all that is found in the reference. If the webpage is in Armenian or the webpage is dead, very, very few of us can set a "title" for such a reference. -- Innocent bystander (talk) 12:50, 25 August 2016 (UTC)
unexpected use of properties in references — I really feel uncomfortable with such a guideline, although I admit that it might be the best we can reasonably achieve here. I’m from dewiki, whose community is extremely skeptical about Wikidata, and at the moment they say: “There are no references at Wikidata, so we don’t use it!” Once we’ll get that right I see them saying: “Wikidata references are useless and messy, so we don’t use it!” —MisterSynergy (talk) 13:04, 25 August 2016 (UTC)
@MisterSynergy: Frankly speaking I can say that the use of data from WD by WPs is often the last problem of several contributors in WD. Just think about the continuing import of data from WP into WD without original sources or the different games which help to add data to WD but without adding the source too. Snipre (talk) 14:44, 25 August 2016 (UTC)

The closing of RFC without a summation is not a good practice

I find it unhelpful, and both a little pointless and disappointing, that some of the Wikidata:Requests for comment are simply closed as "no consensus" without a summation of the viewpoints. Where a conversation has taken place and someone is moved to close it, I see true value in the closure explaining the alternative viewpoints, and how the person closing the discussion has determined the position of the community. People have invested in putting their points of view, and if someone cannot take the effort to summarise, then what are they doing closing the discussion? An open discussion is not problematic, it just looks untidy to some.  — billinghurst sDrewth 23:49, 24 August 2016 (UTC)

That is why a lot of people are afraid to close RfCs; as a result, some RfCs have been open for more than one year. I think that there is no benefit in an RfC being open so long, as the situation may change in the meantime. So I support closing RfCs in a timely manner, even with a no-consensus result (and without a summation of the viewpoints). --Jklamo (talk) 07:24, 26 August 2016 (UTC)

Translation of Wikidata weekly summary

Hi, even if this discussion is about translation, I prefer to talk about it here rather than in Wikidata:Translators' noticeboard. So, if Matěj Suchánek does not read here, he could be interested. My question is rather simple: as Wikidata is a multi-language project, would it be possible to translate the Wikidata weekly summaries before they are published, so that users can read them in their native language? I know that TomT0m does this work on French Wikipedia a posteriori, so I wonder if it would be possible, first of all technically, to translate these news items a priori. If so, would it be possible for users registered here to receive this summary in their preferred language if a translation into that language has been done? Pamputt (talk) 18:50, 25 August 2016 (UTC)

I know the parsoid/wikitext team does a call for translation before publishing their status update, but I don't know how they dispatch it. author  TomT0m / talk page 18:53, 25 August 2016 (UTC)
Do you know the page where we can find this call? Pamputt (talk) 19:06, 25 August 2016 (UTC)
Received it in my mailbox for some reason; after a bit of digging, it's the visual editor team, not parsoid, and the mail can be found on the internet: http://osdir.com/ml/general/2016-07/msg03130.html author  TomT0m / talk page 19:12, 25 August 2016 (UTC)
(edit conflict) We could re-implement the logic of Tech News distribution. But if we really do, we need to overhaul the way the weekly summary is composed at first. Perhaps linking from the English weekly summary which gets distributed to a translated version would be sufficient.
Note that we were able to translate several status updates in 2014. Matěj Suchánek (talk) 19:14, 25 August 2016 (UTC)

Another information, what I am asking for already exists for Tech news. Pamputt (talk) 19:07, 25 August 2016 (UTC)

Special:Translate exists on this wiki, so it is technically possible to have the newsletter in multiple languages, and delivered in multiple languages, as you have seen with Tech News. The issue is always going to be timeliness of the production, the translation, and then the delivery.  — billinghurst sDrewth 07:12, 26 August 2016 (UTC)
And the wasted time that could be used for translating help pages and other stuff. Sjoerd de Bruin (talk) 07:47, 26 August 2016 (UTC)

Thanks for your feedback on this topic. I'm currently working on improving the Weekly Summary, from the content to the technical issues, and I'll be glad to hear your feedback and ideas about this :) Lea Lacroix (WMDE) (talk) 08:57, 26 August 2016 (UTC)

This is not necessary, imho. --Molarus 09:40, 26 August 2016 (UTC)

@Molarus: I did not get what you think is not necessary. If you mean the translation of the weekly summary, I think we are not the right people to judge this question, since we understand English. If people want to translate the news, then it is only a technical matter. Pamputt (talk) 23:12, 28 August 2016 (UTC)

Extracting properties from Wikivoyage

Wikivoyage, the wiki travel guide, has recently begun linking to the Wikidata item of every museum/park/hotels/etc (for those having a Wikidata item).

Each item has very detailed information, and often the Wikidata item has almost no information. So: Anyone willing to create a bot that copies information from Wikivoyage to Wikidata? :-)

Example: Grand Hotel Beijing (Q10902432) only had a single statement when I linked to it, whereas the Wikivoyage item has much more details:

{{sleep
| name=Grand Hotel Beijing
| alt=北京贵宾楼饭店; Běijīng Guìbīnlóu Fàndiàn
| url=http://www.grandhotelbeijing.com/
| email=
| address=35 East Chang'an Street (东长安街35号; Dōngchángānjiē)
| lat=39.90743
| long=116.40279
| directions=Two blocks E of Tiananmen Square
| phone=+86 10 6513 7788
| tollfree=
| fax=+86 10 6513 0049
| checkin=
| checkout=
| price=Listed rates for doubles ¥3,450-14,950, discounted rates ¥765-10,500, breakfast ¥184
| content=Five-star hotel located in a traditional building in a small street overlooking the Forbidden City. Rooms with free internet except for the cheapest ones. The rooms are 32-66m2 except for the very most expensive, which is more than 100sqm. Business center, gift shop, ticket office, fitness, pool and sauna available. Chinese and Western restaurants as well as coffee shop, bar and room service.
}}

What Wikidata could reuse: English name, country, name in the local language of the country, website, address, coordinates, phone/email/fax.

Conveniently, the Wikivoyage items and their properties are available for download as a nice big CSV file. Cheers! Syced (talk) 14:34, 26 August 2016 (UTC)

If you already have it in CSV, you can use QuickStatements, I think, to get started. --Izno (talk) 15:30, 26 August 2016 (UTC)
All I have is Wikidata QIDs and properties. QuickStatements does not seem to accept Wikidata QIDs as a first column. Syced (talk) 03:19, 29 August 2016 (UTC)
@Syced: Just leave the top field empty, and it will! – Máté (talk) 05:18, 29 August 2016 (UTC)
Indeed that works, thank you :-) Syced (talk) 07:33, 29 August 2016 (UTC)
You may also have a look at HarvestTemplates. --Pasleim (talk) 16:10, 26 August 2016 (UTC)
The templates are already harvested, using a dedicated tool that understands all intricacies of Wikivoyage, so HarvestTemplates won't help here. Syced (talk) 03:19, 29 August 2016 (UTC)
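
As a rough illustration of the QuickStatements route suggested above, the following Python sketch turns a trimmed copy of the listing into tab-separated QuickStatements rows. The property choices (P856 for the website, P625 for coordinates, P1329 for the phone number) and the helper names are assumptions for illustration; the dedicated harvesting tool mentioned above is not shown. The rows carry no references, which would need to be added separately.

```python
import re
from typing import Dict, List

# Trimmed copy of the listing quoted earlier in this thread.
LISTING = """{{sleep
| name=Grand Hotel Beijing
| url=http://www.grandhotelbeijing.com/
| email=
| lat=39.90743
| long=116.40279
| phone=+86 10 6513 7788
}}"""


def parse_listing(wikitext: str) -> Dict[str, str]:
    """Collect non-empty `| key=value` parameters, one parameter per line."""
    params: Dict[str, str] = {}
    for line in wikitext.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.*)", line.strip())
        if match and match.group(2).strip():
            params[match.group(1)] = match.group(2).strip()
    return params


def to_quickstatements(qid: str, params: Dict[str, str]) -> List[str]:
    """Build tab-separated rows: item, property, value."""
    rows: List[str] = []
    if "url" in params:
        rows.append(f'{qid}\tP856\t"{params["url"]}"')
    if "lat" in params and "long" in params:
        # Coordinates are written as @LAT/LON.
        rows.append(f'{qid}\tP625\t@{params["lat"]}/{params["long"]}')
    if "phone" in params:
        rows.append(f'{qid}\tP1329\t"{params["phone"]}"')
    return rows


rows = to_quickstatements("Q10902432", parse_listing(LISTING))
print("\n".join(rows))
```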

Weird connection

I have commons:File:Commuters, who have just come off the train 1a33849v.jpg and commons:File:Commuters, who have just come off the train 1a33849v (cropped).jpg watchlisted on Commons. Every time an edit is made to Q61 (Washington, D.C., which has nothing to do with the files), it shows up on my Commons watchlist as being connected to these two files. I can't figure out where the connection was made, why it was made, or how to remove it. Thanks, Pi.1415926535 (talk) 21:35, 26 August 2016 (UTC)

@Pi.1415926535: Both of them have the template "Institution:Library of Congress", whose location is Washington (Q61). Strakhov (talk) 21:38, 26 August 2016 (UTC)
Here you can uncheck Show Wikidata edits in your watchlist. "All or nothing", I guess. Strakhov (talk) 21:40, 26 August 2016 (UTC)
So changes to a wikidata item linked in a creator template that is used in those file pages show up on a watchlist under those files? That doesn't seem to make sense - there is not a direct link made anywhere from the wikidata item to the files. If I change a template on commons, that change doesn't show up in my watchlist under all the files that use the template. Perhaps I'm just ignorant of how wikidata works in this regard. Pi.1415926535 (talk) 21:51, 26 August 2016 (UTC)

Should Wikidata link to NSFW websites?

I clicked on "Random Primary Sources item" and the Primary Sources tool directed me to Puma Swede (Q1069901). The Primary Sources tool suggests links to PornHub and xvideos. Should those websites be recommended as sources by the Primary Sources tool? ChristianKl (talk) 19:13, 27 August 2016 (UTC)

Regardless of the fact that they are NSFW (I think that's not that important), they are not by any means reliable sources, as most of their content is probably user-generated/uploaded porn videos (there are "log-in/create account" buttons on both sites and videos are uploaded by user accounts). Consequently, a lot of the videos would be copyright violations. Using sources like these in "BLP" Q's is not ideal either. I'd remove both pages from the tool. Also, they seem useless for sourcing statements even in porn Q's. Strakhov (talk) 21:31, 27 August 2016 (UTC)
There is a backlist at Wikidata:Primary sources tool/URL blacklist that these should probably be added to. --Lydia Pintscher (WMDE) (talk) 09:38, 28 August 2016 (UTC)
Okay, I added them to the list. ChristianKl (talk) 10:38, 28 August 2016 (UTC)
@Lydia Pintscher (WMDE): "backlist" or "blacklist"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:48, 28 August 2016 (UTC)
Blacklist. --Lydia Pintscher (WMDE) (talk) 09:38, 29 August 2016 (UTC)
To answer the question in your subject heading: yes, per en:WP:NOTCENSORED (which we would do well to adopt, or at least adapt, here). Furthermore, NSFW where? USA? Singapore? China? Iran? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:48, 28 August 2016 (UTC)
Something being Not Safe For Work does not depend that much on the country, but on the company someone is working in. :) Strakhov (talk) 20:19, 28 August 2016 (UTC)
I don't think not crawling a specific source with StrepHit means censorship. It still feels a bit like an excuse to say that the only reason to have pornhub in the blacklist is that it's UGC, but I'm okay with that decision. To what extent can porn sites that aren't UGC be reputable sources? ChristianKl (talk) 13:54, 30 August 2016 (UTC)
Well, even porn sites that are UGC are reliable sources for information about themselves. A non-UGC porn site with biographies of the models they employ will probably be a reliable source for that model's career as a porn model. If a website has a page about each model with a bio and links to where they feature, then that might be a useful external ID? (I don't generally work in this topic area, so I'm unsure what sources are typically used.) Other than that, it will probably need to be a case-by-case assessment of the reliability of the information actually presented on the website. Thryduulf (talk) 14:55, 30 August 2016 (UTC)
As an answer to the starting question, I think that the only thing Wikidata should categorically not link to is material that is illegal in the US (the jurisdiction where the WMF is legally based) and maybe Germany (where WMDE is presumably legally based). Everything else should be judged based on its reliability and utility for the purpose we want to use it for. Thryduulf (talk) 14:58, 30 August 2016 (UTC)

Allowing Wikipedias to make a choice to only import statements with non-Wikipedia references

As far as I understand, many Wikipedias decide against importing data from Wikidata because too much data in Wikidata is without references. Have we considered a setting whereby a template that automatically imports data can choose to only import data when there's a non-Wikipedia reference for a claim? ChristianKl (talk) 19:25, 27 August 2016 (UTC)

English Wikipedia has worked on a concept for that. --Izno (talk) 19:47, 27 August 2016 (UTC)
The best solution was to remove all Wikipedia references. I always do so whenever I happen to edit a statement. -- JakobVoss (talk) 13:16, 29 August 2016 (UTC)
en:Module:WikidataIB: function getSourcedValue, if anyone's interested. --RexxS (talk) 18:27, 30 August 2016 (UTC)
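
To make the idea concrete, here is a minimal Python sketch of such a filter. It is not the logic of Module:WikidataIB (which is written in Lua and not reproduced here). It assumes the statement layout of the Wikibase JSON export, where each reference has a "snaks" mapping keyed by property ID, and it assumes that a reference consisting only of imported from (P143) counts as a Wikipedia reference. The function names are hypothetical.

```python
from typing import Any, Dict, List

IMPORTED_FROM = "P143"  # "imported from" (a Wikimedia project)


def has_external_reference(statement: Dict[str, Any]) -> bool:
    """True if at least one reference uses a property other than P143."""
    for reference in statement.get("references", []):
        # set() over the snaks mapping yields its property IDs.
        if set(reference.get("snaks", {})) - {IMPORTED_FROM}:
            return True
    return False


def sourced_statements(statements: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only statements backed by a non-Wikipedia reference."""
    return [s for s in statements if has_external_reference(s)]


# Illustrative statements, not real Wikidata content.
statements = [
    {"id": "a", "references": [{"snaks": {"P143": ["Q328"]}}]},
    {"id": "b", "references": [{"snaks": {"P248": ["Q7739799"], "P813": ["2016-08-25"]}}]},
    {"id": "c"},
]
print([s["id"] for s in sourced_statements(statements)])
```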

Risperidone versus placebo for schizophrenia

For whatever reason some people have added substances as a "medicine" to Wikidata. They have added substances that are in a database of approved substances into Wikidata and it gives these substances some legitimacy. Risperidone versus placebo for schizophrenia is a Cochrane publication that throws a lot of cold water on one of these substances.

The questions I pose are:

  • Should we include substances registered as a medicine? The point is that registration proves little about efficacy, even when compared to a placebo. For this substance Cochrane says: "The margin of improvement chosen by the researchers as their outcome may not be clinically meaningful."
  • When we do, how do we link to relevant literature? It is NOT a reference, because the fact that it IS registered is all the current reference says. Cochrane provides the information about efficacy, and that is utterly relevant and different.

My proposal is to remove all substances that are marked as medicinal. Thanks, GerardM (talk) 06:59, 28 August 2016 (UTC)

I would oppose wholesale removal. If something is registered as a medicine we should, per NPOV, record that it is so registered regardless of any disputes about whether it should be. We have statement disputed by (P1310) which could probably be used to indicate that someone disputes that it is effective (perhaps as a qualifier to use (P366)?). Thryduulf (talk) 08:46, 28 August 2016 (UTC)
When a substance is harmful and we state that it is a medicine, we take on a responsibility. This is why we are careful with living people as well. NPOV is of a lesser relevance than the harm that it does.
The US Air Force shot down one of its own helicopters in Iraq, partly because the US Air Force doesn't consider helicopters to be aircraft. If someone takes Risperidone and dies from side effects that aren't responded to because you said that Risperidone isn't a drug, that's great harm to living people. ChristianKl (talk) 10:19, 28 August 2016 (UTC)
On it.wiki we use a disclaimer in the pages about medicines. More or less: « The information is not medical advice and may not be accurate. The contents are only for illustrative purposes and do not replace medical advice » with a link to Wikipedia:Medical disclaimer (Q10640396) --ValterVB (talk) 10:43, 28 August 2016 (UTC)
Could you give an example of the items that you are talking about? ChristianKl (talk) 09:27, 28 August 2016 (UTC)
Read the title of this post and read the associated article at Cochrane. NB this is just a random substance. Thanks, GerardM (talk) 09:49, 28 August 2016 (UTC)
The word Medicine doesn't appear on Risperidone (Q412443). What appears is pharmaceutical drug (Q12140) which is about whether a substance is intended to be used to cure but which isn't related to whether the substance cures in practice. ChristianKl (talk) 10:07, 28 August 2016 (UTC)
This substance does not cure. Not at all, there is no claim to that effect. Thanks, GerardM (talk) 11:06, 28 August 2016 (UTC)
There's an FDA approval for the drug for a certain usage. As such it's used by some doctors with the intent to cure. The fact that it might not actually cure patients doesn't imply that no doctor uses it for that intent. ChristianKl (talk) 13:15, 28 August 2016 (UTC)
We do not need to include controversial data from other sources. You again use the word "cure"; many FDA-approved substances do not cure at all, and this is not what FDA approval means. My point is that we can link to other sources but we do not have to include their controversial data. It is controversial because scientific evidence indicates that its use is in doubt or that there is no evidence at all. Thanks, GerardM (talk) 07:43, 30 August 2016 (UTC)

Before a big discussion that may point nowhere if we do not do that, I'd like that we put some points straight, like "what exactly is a medicine". We have to get our definition straight before any discussion on such a subject. author  TomT0m / talk page 09:31, 28 August 2016 (UTC)

Really? A medicine is a substance prescribed by a doctor for medicinal purposes. So when a doctor prescribes substances banned in another country as a drug, it is still a medicine per the definition. Thanks, GerardM (talk) 09:52, 28 August 2016 (UTC)
In that case it doesn't matter whether Cochrane says it's effective. The fact that Cochrane speaks about it in the first place is an indication that some doctor somewhere uses it for medical purposes. Cochrane also explicitly calls it a drug: "Risperidone is one among the atypical or new generation of drugs." ChristianKl (talk) 10:10, 28 August 2016 (UTC)
The question is should we have such information and when we do, how do we signal prominently that the efficacy of a drug is very much in doubt. Thanks, GerardM (talk) 11:20, 28 August 2016 (UTC)
Having good information about substances is valuable to a variety of bioinformatics applications. Data about whether a substance is a pharmaceutical drug is useful data. Having data about the conditions for which a drug is used is useful.
Currently it seems like we don't have a Wikidata property that speaks about clinical effectiveness. It would be useful to have such a property but as far as I see there's currently no large database about clinical effectiveness that we could import. That problem isn't easily solvable. It might be solvable in a few years if Cochrane provides their data in a form that can be easily imported.
The data that's currently listed at "medical condition treated" seems to be based on the drug having successfully passed clinical trials. It's data from ChEMBL that has a property for "Max phase for indication". That property could be translated into a qualifier for Wikidata.
Apart from that the definition for both "medical condition treated" and "pharmaceutical drug" could be edited to explicitly say that they aren't making statements about clinical effectiveness. ChristianKl (talk) 13:15, 28 August 2016 (UTC)
@GerardM: Yes, really. Don't forget that we are in a concept-based and multilingual environment; it's crucial that everyone agrees on what is going on and what the definitions of the concepts we use are. Take en:Medicine for example: this word includes a lot more than just active substances. Is a cure a medicine? I'd guess yes. Are we actually speaking of "medicinal substances" or "active molecules"? Do you say a "medicine" is a prescription in general, like "do run one hour a day" or "take this kind of pills", a set of actions a doctor takes to make you better? author  TomT0m / talk page 12:09, 28 August 2016 (UTC)
You then get to a situation where you mistake an "active molecule" for a medicine, and that is also problematic. Typically people expect that a doctor prescribes substances that are beneficial. When it is scientifically in doubt that they are, it should not be prescribed, it should not be a medicine, and we should be careful what we say on matters that make people worse off. Thanks, GerardM (talk) 16:05, 28 August 2016 (UTC)
Sorry, but you're not really answering. I'm asking for definitions just to be sure we're all on the same page, not to be lectured about what should or should not be prescribed. So we need definitions, preferably scientific ones; you're probably right on this, on sorting out all those concepts. author  TomT0m / talk page 16:10, 28 August 2016 (UTC)
Don't forget also that the effectiveness of a treatment may change over time, as does what is prescribed for a given condition (the two are not always directly related) and we need to be able to record both currently used medicines and those that were previously used but no longer are. Also, what counts as a "medicine" and what substances may or may not be prescribed also vary temporally and geographically - e.g. it became legal to prescribe medical cannabis (Q1033379) in Colorado (Q1261) only in 2000, it remains illegal in Utah (Q829). Thryduulf (talk) 20:11, 28 August 2016 (UTC)
Part of NPOV means that Wikidata's role isn't to say what people should or shouldn't do. You might think that it's good that people do what you think they should do, and you might even be right, but it's still Wikidata policy not to take sides in a dispute.
In this case we have academic databanks like drugbank (University of Alberta), ChEMBL (European Molecular Biology Laboratory) and NDF-RT (Department of Veterans Affairs), which all use the term "drug" in a way where whether or not a drug is effective is not important for whether something is a drug.
Doctors expect to be told certain things when they ask a patient whether the patient takes any drugs regularly. Medicine has a culture where a doctor prefers to have more information rather than less. Telling patients that certain substances aren't drugs and thus don't have to be listed when the patient is asked about drugs is actively harmful. In general, substances that are intended to treat a disease count here, and it doesn't matter whether they actually are effective. A doctor wants to know when their patients take homeopathic drugs even when the doctor doesn't believe that the drug does something on its own. It provides information about a patient actually thinking they have an issue that warrants a drug. There can also be placebo effects that are useful for a doctor to know about. ChristianKl (talk) 18:38, 29 August 2016 (UTC)


I'm not really happy with the status quo where, for a pair like Provigil (brand name) and Modafinil (substance name), both are mixed up in the same item. I think it might be worth renaming pharmaceutical drug (Q12140) to "pharmaceutical substance" and using "pharmaceutical drug" as a label for items like Provigil that have a specific company that manufactures them. ChristianKl (talk) 10:13, 28 August 2016 (UTC)

qualifiers for P2046?

When I add area (P2046) from censuses, I always add "as of" as a qualifier, since it often changes over time. I am aware that many other uses of P2046 do not need a timestamp. But I think it would be nice to see mandatory use of this qualifier for administrative units and populated places. It is of course less useful in many other items.

It is at least worth discussing. What do the statistics look like for qualifiers for P2046 today? -- Innocent bystander (talk) 14:29, 28 August 2016 (UTC)
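
One way to get such statistics, sketched in Python under stated assumptions: given area statements in the Wikibase JSON layout (each with an optional "qualifiers" mapping keyed by property ID), count how often each qualifier property occurs. The sample statements are invented for illustration and are not real counts; treating "as of" as point in time (P585) is also an assumption.

```python
from collections import Counter
from typing import Any, Dict, List


def qualifier_stats(statements: List[Dict[str, Any]]) -> Counter:
    """Count qualifier properties across statements; '(none)' marks bare ones."""
    counts: Counter = Counter()
    for statement in statements:
        qualifiers = statement.get("qualifiers", {})
        if not qualifiers:
            counts["(none)"] += 1
        for prop in qualifiers:
            counts[prop] += 1
    return counts


# Invented P2046 statements, for illustration only.
SAMPLE = [
    {"qualifiers": {"P585": ["2010-12-31"]}},                          # point in time
    {"qualifiers": {"P580": ["1971-01-01"], "P582": ["1999-12-31"]}},  # start/end time
    {"qualifiers": {}},
    {},
]
stats = qualifier_stats(SAMPLE)
print(stats.most_common())
```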

Why not "begin date" and "end date" (or is "as of" just an alias for "date"?)? Or even some new items, if we consider that the name is not enough common information to qualify two places as identical and that places with different borders imply different places. Administrative units can be tricky: they are both an administration and a place. Some change in the law of the state can justify creating a new item (in the same spirit, a significant change in the role of the administration can justify considering it a new entity). author  TomT0m / talk page 14:46, 28 August 2016 (UTC)
For Swedish municipalities, the most common reason for changes in area is probably updated measurements. Digital land surveying may have been invented decades ago, but it is still being implemented. The borders may not technically have changed for decades, but they have not been measured very well. The changes in area at the end of the 19th century were probably even larger, since in some areas we didn't have very good maps at all, especially not in sparsely populated areas.
When it comes to urban areas, they are very special. They are defined "as of 31/12" every five years. That is what I have mostly worked with the last weeks.
Laws are often changed gradually, that makes it difficult to see what changes are "significant". In the 19th and the first half of the 20th century we had many kinds of municipalities in Q34. In 1952 that was changed, the same law was now applied to most kinds of municipalities. But it was not until 1971 every municipality was renamed from "X City/Market town/Rural municipality" to "X Municipality".
What is most significant, that the law changed, or that the name changed? According to sv.wikipedia, it is the change of names. But that is maybe natural, since they normally split subjects based on their names. -- Innocent bystander (talk) 16:41, 28 August 2016 (UTC)

Inquire about "AWB"

Hello. Why is "AWB" unavailable here? It is very useful. Thank you --ديفيد عادل وهبة خليل 2 (talk) 15:17, 28 August 2016 (UTC)

a) What would you like to do with AWB here on Wikidata? Real use cases, please, not something like "maintain Wikidata" or "make some edits", which doesn't say anything about what you want to do. Wikidata works quite differently from other wiki projects; we have specific tools for editing Wikidata. b) AWB works on "normal" wiki pages, like this one or talk pages. --Edgars2007 (talk) 15:24, 28 August 2016 (UTC)
Ask maintainers. It could be useful eg. on property talk pages but that's all. Matěj Suchánek (talk) 15:27, 28 August 2016 (UTC)
@ديفيد عادل وهبة خليل 2: because AWB is more aligned with a flat text (free text) area (primarily regex text replacement, or some decision trees for disambiguation), whereas Wikidata has a completely different data structure (see Wikidata:Glossary#Claims and statements). AWB would work fine in the flat namespaces like user, user talk, wikidata, but cannot do the data calls. It is simply a tool designed for the wrong job when it comes to working in the Q: and P: namespaces.  — billinghurst sDrewth 22:15, 28 August 2016 (UTC)
@Edgars2007, Matěj Suchánek, billinghurst: Do you think that "specific tools for editing Wikidata" are better than a version of AWB adapted for Wikidata? Thank you --ديفيد عادل وهبة خليل 2 (talk) 06:57, 29 August 2016 (UTC)
As we have already told you, Wikidata is completely different. To get AWB to work in the main and property namespaces, it would need a complete rewrite, not just some small adaptation. If you don't answer part 'a)' of my first post in this section, I don't see any reason to continue this discussion. --Edgars2007 (talk) 07:28, 29 August 2016 (UTC)
@Edgars2007: I'm talking about usage in general, not about one specific thing. Thank you, everyone --ديفيد عادل وهبة خليل 2 (talk) 08:54, 29 August 2016 (UTC)

HarvestTemplates issue

Sometimes, when I am using HarvestTemplates, I get the error "WQS query expired". What causes that, and is there any way to fix it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:29, 28 August 2016 (UTC)

Ask Pasleim, it's his tool. Mbch331 (talk) 19:00, 28 August 2016 (UTC)
It's quite obvious what causes that. HarvestTemplates makes several queries during loading which may run for longer than 30 seconds and thus expire (this happens for properties with many values or classes with many subclasses). So this is rather a problem of the Query Service. Matěj Suchánek (talk) 19:05, 28 August 2016 (UTC)

Is it OK if I create an item for each embassy and consulate?

Is it OK if I create an item for each embassy and consulate? Or would that be too many items? For each item I have the host country, sending country, address, phone number, and sometimes website and email. In total that would probably make about a thousand items.

The information is currently stored on Wikivoyage like this. Cheers! Syced (talk) 02:23, 29 August 2016 (UTC)

In general 'too many items' on Wikidata isn't a concern if the items that you create contain information from reputable sources.
What's more of an issue with embassies is that Wikidata currently has no "host country" and "sending country" properties. We either need to create them or create a qualifier for the country (P17). ChristianKl (talk) 10:36, 29 August 2016 (UTC)
Have a look in the archives of this page for previous discussions about this (there are at least 2 maybe 3). I think operator (P137) and operating area (P2541) were involved but I can't remember the detail off the top of my head. Thryduulf (talk) 00:34, 30 August 2016 (UTC)
Properties do exist, actually, here are the relevant ontologies: Wikidata:WikiProject International relations. Cheers! Syced (talk) 03:06, 30 August 2016 (UTC)
Okay, that's great. Currently it seems items like https://www.wikidata.org/wiki/Q18180367 don't fill in operator the way they should according to the proposal, so I thought the property didn't exist. Now it seems it's just a matter of applying the agreed-upon properties, and it would be great if someone would do it for the sake of both Wikivoyage and Wikidata in general. ChristianKl (talk) 15:18, 30 August 2016 (UTC)
ChristianKl: I tried to apply the ontology to Mongolian Embassy in Berlin (Q18180367), would you mind checking whether I understood the ontology correctly or not? If yes, I will try to fix the other embassies (query) and consulates (query), then import the data from Wikivoyage. Does it sound OK? Thanks a lot! Syced (talk) 03:56, 31 August 2016 (UTC)
Your edit is mostly great. When it comes to located in the administrative territorial entity (P131), it's generally better to use a more specific administrative level, like the city, than the country. [User:Thryduulf] edited it in this case. In this case Pankow (Q4707648) would be the most specific level available, but putting in the city is generally good enough, especially when you deal with foreign cities where it's not trivial to find out the lowest administrative level available. It's great to see someone clean up the embassies :) ChristianKl (talk) 09:04, 31 August 2016 (UTC)

IRC spamming and bans?

Hi all, it is probably related to the spamming the #wikidata channel on irc.freenode.net was seeing yesterday, but I can no longer join the channel. Did everyone now get preemptively banned?? --Egon Willighagen (talk) 06:22, 29 August 2016 (UTC)

Unidentified users are not allowed to enter the channel until further notice, to avoid this now and in the future: please register your nickname. Sorry. Sjoerd de Bruin (talk) 07:04, 29 August 2016 (UTC)
OK, I already expected something like that (+10 for confirming). So, to all who run into the "ban" message when trying to join the Freenode channel: register and/or identify your nickname. --Egon Willighagen (talk) 07:54, 29 August 2016 (UTC)

Redirects on Wikipedia are not processed in Wikidata

When a Wikipedia article gets redirected in any language, the link to the original article remains in the respective Wikidata entry (e.g. Q175854, where the sitelink to the German Wikipedia is redirected from "Phobie (Psychiatrie)" to "Angststörung"). It can make sense to maintain that link. However, when one tries to update the original Wikidata entry, the API returns the following error message:

 The link <a class="external text" href="https://de.wikipedia.org/wiki/Angstst%C3%B6rung">dewiki:Angststörung</a> is already used by item <a href="/wiki/Q544006" title="Q544006">Q544006</a>. You may remove it from  <a href="/wiki/Q544006" title="Q544006">Q544006</a> if it does not belong there or merge the items if they are about the exact same topic.

This suggests that one should remove the original Wikipedia link from the relevant Wikidata entry. This is doable if you speak the relevant language, and maybe with some effort if it's in an unknown language that uses the same writing system as the languages you know. However, when it is in a language that uses a different writing system, it is too much to ask, unless it is okay to simply delete without inspection. Any advice on how to deal with this issue would be appreciated. Would it be possible to automatically update sitelinks in Wikidata once Wikipedia merges or redirections happen? --Andrawaag (talk) 08:16, 29 August 2016 (UTC)

The problem is that a fundamental constraint of Wikidata is that each Wikipedia article can be linked to only one Wikidata item and vice versa, i.e. there is the assumption that there is a one-to-one correspondence between Wikipedia articles and concepts, but this is not the case (search for "Bonnie and Clyde problem" for more background). So, if the destination article is already linked to a Wikidata item, you will not be able to add a second link. Leaving the sitelink attached to the redirect is a workaround that maintains incoming interwiki links (e.g. if the de.wp article is merged, the merged article can still be found from en.wp but not vice versa). Hopefully this won't remain the case forever, but I don't think a fix is coming in the short term. Thryduulf (talk) 09:27, 29 August 2016 (UTC)
The main problem is to separate the intentional from the unintentional redirect sitelinks. On svwiki a lot of bot-created articles are merged. Sometimes the related sitelink can stay, since it is about an identifiable concept. Other times they (the sitelinks) are nothing but duplicates that should be deleted. -- Innocent bystander (talk) 17:27, 29 August 2016 (UTC)
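
For readers unfamiliar with the constraint described above, here is a minimal Python sketch of a one-article-one-item rule. It is only an illustration, not how Wikibase is actually implemented: `SitelinkRegistry`, `SitelinkConflict` and `set_sitelink` are hypothetical names, and the item IDs and titles are the ones from the example in this thread.

```python
class SitelinkConflict(Exception):
    """Raised when a page is already linked from another item (hypothetical)."""


class SitelinkRegistry:
    """Toy model: each (site, title) pair may belong to at most one item."""

    def __init__(self):
        self._owner = {}  # (site, title) -> item id

    def set_sitelink(self, item, site, title):
        key = (site, title)
        owner = self._owner.get(key)
        if owner is not None and owner != item:
            raise SitelinkConflict(
                f"{site}:{title} is already used by item {owner}"
            )
        # An item has at most one link per site, so drop its previous one.
        for old_key, old_item in list(self._owner.items()):
            if old_item == item and old_key[0] == site:
                del self._owner[old_key]
        self._owner[key] = item


registry = SitelinkRegistry()
registry.set_sitelink("Q544006", "dewiki", "Angststörung")
registry.set_sitelink("Q175854", "dewiki", "Phobie (Psychiatrie)")

try:
    # Following the redirect would point Q175854 at a page Q544006 already owns.
    registry.set_sitelink("Q175854", "dewiki", "Angststörung")
except SitelinkConflict as err:
    print("rejected:", err)
```

In this toy model, leaving Q175854 attached to the redirect title is the only state that keeps both items linked, which is the workaround described above.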

──────────────────────────────────────────────────────────────────────────────────────────────────── The principal problem I come across is the constraint that in a Wikidata entry, I cannot create a link to a redirect on en-wp. The "improved version problem" is a variant of the "Bonnie and Clyde problem", but is much less likely to be a function of which Wikipedia you're working on.

There is a video game engine called "Source" which has been superseded by an improved version called "Source 2". The differences are nowhere near sufficient to justify separate articles, so Source 2 is just a redirect to a section in the "Source" article: en:Source (game engine)#Source 2. However, that's no reason not to have a Wikidata entry for Source 2 (Q21658271), and we can add software engine (P408) to a video game using the "Source 2" engine like Dota 2 (Q771541). That's just fine for Wikidata, but it won't allow Source 2 (Q21658271) to create a link to the redirect Source 2 on en-wp. So, when we want to fetch the Wikidata value for software engine to put into the en-wp infobox for en:Dota 2, we find there's no sitelink to pick up.

The only viable solution is to turn the en-wp redirect into an article temporarily; make the sitelink on Wikidata; then revert the new article back to a redirect. As Derek and Clive (Q5262484) would say "I ask you: Is that any way to run a ballroom?" --RexxS (talk) 19:57, 30 August 2016 (UTC)

List generation input

Hello folks,

The Wikidata development team is currently working on tools to improve list creation on Wikipedia, based on Wikidata data.

To understand what could be useful for you and why, we propose three example user scenarios in which you may recognize some of your current practices: how you currently edit lists on Wikipedia, which tools or processes you use, and what could be improved.

You can answer some short questions and add comments on our assumptions on each related talk page. This input is very important to help us understand how you edit the lists on Wikipedia, and what tools could be useful for you.

Thanks to all of you who will take a few minutes to answer our questions! Jan Dittrich (WMDE) (talk · contribs · logs) Lea Lacroix (WMDE) (talk) 09:26, 29 August 2016 (UTC)

Changing the position of "type" and "datatype" in the JSON?

Currently, the "type" and "datatype" properties are placed after "datavalue" / "value". This makes it quite cumbersome to parse the JSON file when I'm streaming it, as I only know the type of the value after having already read it. Would it be possible to put "type"/"datatype" before the value? That way I could simply switch on them and handle the value correctly. Interestingly, the documentation mentions them in the proposed order, not in the currently implemented one (in the description, not the examples). I searched for issues or other pointers as to whether this was already proposed, but couldn't find anything. Any pointers and feedback welcome. --Joe 776 (talk) 12:15, 29 August 2016 (UTC)

There is no order in JSON object keys. Please prepare your tool for random key order or it will likely break some time in the future. -- JakobVoss (talk) 13:13, 29 August 2016 (UTC)
I just learned of the Wikidata Toolkit. That saves me a lot of the manual work. Thanks for that quick reply! --Joe 776 (talk) 15:05, 29 August 2016 (UTC)
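
To illustrate the point about key order, here is a minimal Python sketch that reads a snak by key name only after the whole object has been parsed, so the order in which "type", "datatype" and "value" are serialized no longer matters. The helper name `read_snak_value` and the sample snaks (including the placeholder property ID `P9999999`) are made up for illustration; only the key names come from the discussion above.

```python
import json


def read_snak_value(snak):
    """Return (datatype, value type, value), looked up by key name only."""
    datavalue = snak.get("datavalue", {})
    return snak.get("datatype"), datavalue.get("type"), datavalue.get("value")


# The same (made-up) snak serialized with two different key orders.
value_first = (
    '{"snaktype": "value", "property": "P9999999", '
    '"datavalue": {"value": "example", "type": "string"}, '
    '"datatype": "string"}'
)
type_first = (
    '{"datatype": "string", '
    '"datavalue": {"type": "string", "value": "example"}, '
    '"property": "P9999999", "snaktype": "value"}'
)

# Both orders give the same result once the object is fully parsed.
assert read_snak_value(json.loads(value_first)) == read_snak_value(json.loads(type_first))
```

For a large dump, the same idea applies per entity: parse one complete entity object at a time and only then dispatch on its datatype, rather than relying on the position of keys in the stream.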

Wikidata weekly summary #224[edit]

Can't add new languages

When I try to create a new item in order to add new languages to an existing item, it shows up as a completely different page instead. How do I add new languages (not Wikipedia links, but spaces to add label, description etc.) to an existing page? The main Wikidata page says that as soon as an item is created in a different language it can be edited in all languages, but that's not what happened. YuriNikolai (talk) 18:34, 29 August 2016 (UTC)

Do you see only pt/en/es as available languages? Or even less? Syced (talk) 03:09, 30 August 2016 (UTC)
@YuriNikolai: Which item do you want to improve? Matěj Suchánek (talk) 06:08, 30 August 2016 (UTC)

Several, pretty much every one I find really. And I see many languages on all the items I'm working on, but I'd still like to add more. YuriNikolai (talk) 00:14, 31 August 2016 (UTC)

@YuriNikolai: Try enabling the labelLister gadget. nyuszika7h (talk) 10:07, 31 August 2016 (UTC)
I have no clue what you are trying to achieve. New languages, i.e. languages we want to have labels for, go through a process, because support for them has to be enabled in the software / configuration. Thanks, GerardM (talk) 16:54, 31 August 2016 (UTC)

Properties ready for creation backlog

Three weeks ago, I noted here that the backlog of property proposals with consensus for creation was up to 25. It is now nearly double that, at 46. There have been several reminders posted in the interim.

The set of editors with the technical capability to create these properties is clearly unwilling to do so. As a community, how shall we address this?

User:Lydia Pintscher (WMDE), do you have a fall-back plan? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:23, 30 August 2016 (UTC)

Wikidata:Property proposal/Overview and Wikidata:Property proposal/Attention needed are also failing to draw input to proposal discussion from both property creators and the wider community. What should we try next? Thryduulf (talk) 14:47, 30 August 2016 (UTC)
I would like to help and created a request for the required permissions at https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Other_rights#Requests_for_the_property_creator_right ChristianKl (talk) 14:54, 30 August 2016 (UTC)
Is it really time for the gadget? --Edgars2007 (talk) 15:08, 30 August 2016 (UTC)
A gadget to create a property? If so, yes please, because there are so many steps after creating a new property. --Stryn (talk) 17:28, 30 August 2016 (UTC)

Vegetarians and Vegetarianism activists

Hi, I'm looking for some comments. Previously there were three different categories regarding people and vegetarianism, i.e.:

@Spinoziano: merged all three items into Q7472972, stating that no project has more than one category dealing with people and vegetarianism, and that in any case people are included in the "Vegetarians" categories because they are known for that, so they can be considered activists. Maybe the first two items (activists and supporters) could be merged. However, I don't agree with merging all three, because the meaning of the categories is different: in one case you're included because you are a vegetarian, in the other case you're included because you're actively promoting the vegetarian ideology and movement. And I don't agree that being "known for being vegetarian" means that someone can be considered an "activist", nor that "vegetarians" should be treated as equal to "vegetarian activists". Atheists and Atheism activists are kept separate in the same way, for example. What do you think? --Superchilum(talk to me!) 20:12, 30 August 2016 (UTC)

I don't think merging the categories is the right decision. The different items point to three clearly different concepts and it's useful to understand precisely what someone means when they use a concept. ChristianKl (talk) 20:35, 30 August 2016 (UTC)
Please, facts. Instead of "I think", please give definitions. Without definitions the merge is understandable, because there is no way to distinguish between the items.
In this case, what is a vegetarianism activist? A star who announces on Facebook their choice to be vegetarian? If a vegetarian makes their choice public, are they considered a vegetarianism supporter? If a scientist publishes a paper about the risks of eating too much meat, can they be considered a vegetarian?
We need clear definitions, if possible from external and recognized sources; most of the time problems occur because people rely on what they think. You can have hundreds of different opinions in WD, so your opinion is worth no more than anyone else's. Instead of asking what people think, do the most efficient job: find a source that says that person is a vegetarianism activist. We will save time and energy. Snipre (talk) 22:04, 30 August 2016 (UTC)
Unnecessarily rude. --Superchilum(talk to me!) 09:50, 31 August 2016 (UTC)
On topic: "external and recognized sources" are needed in the Wikipedias. On es.wiki, for example, the main article of "Category:Vegetarians" is List of vegetarians, which contains "people who followed a vegetarian diet". The same goes, for example, for pt.wiki and ro.wiki. This kind of categorization has been deleted on both en.wiki and it.wiki, which chose more specific categories: Vegetarianism activists ("people whose vegetarianism activism is considered one of the defining characteristics") and Vegetarianism supporters ("people who have supported, defended etc. the vegetarian diet... mere adhesion to vegetarianism is not sufficient to include the person in the category"). In fact, vegetarianism can be adopted for different reasons: respect for sentient life, religious beliefs, or health-related, political, environmental, cultural, aesthetic, economic, or personal preference (see en:Vegetarianism and the sources cited), while activism "consists of efforts to promote, impede, or direct social, political, economic, or environmental change, or stasis with the desire to make improvements in society and to correct social injustice" (see en:Activism and the sources cited), so the two do not necessarily coincide. --Superchilum(talk to me!) 12:44, 31 August 2016 (UTC)
vegetarianism (Q83364) is a diet. Activism is about wanting to get other people to do something. It's possible to want other people to stop eating meat while eating meat oneself. It's also possible to follow a vegetarian diet without telling anyone about it or caring about what other people eat.
Apart from that, an action being understandable in no way implies that it was good. I can understand why people engage in fraud, but that doesn't mean that fraud isn't a crime.
This debate is not about whether any particular person should be listed as a "vegetarian activist" on WD. Individual Wikipedias make the decision to list them that way, and it's not WD's role to try to overrule them and argue that their categories are wrong.
The standard for merging in Help:Merge is "absolutely certain" that the two are the same. ChristianKl (talk) 13:56, 31 August 2016 (UTC)
Exactly. Different wikis have different categorizations. Sometimes the categories are the same (in which case the category should be represented by one item in WD). Sometimes the categories are different (in which case the categories should be represented by different items in WD).
In the specific case under discussion, an activist or supporter may not necessarily be a follower, and certainly most followers are not necessarily activists. Following through on the merged items under discussion, and possible members of the different categories, it would appear that Albert Einstein was a supporter of vegetarianism, but only followed the diet for the last year of his life, and probably cannot be counted as an activist. See here for details on Einstein from the International Vegetarian Union (Q430696)... Robevans123 (talk) 15:12, 31 August 2016 (UTC)

YouTube channel ID

How do I specify channels in YouTube channel ID (P2397) that don't seem to have an ID, just a name, such as https://www.youtube.com/user/LuciaGilVEVO? Maybe a separate property needs to be created, or can some magic be done to make this support both? nyuszika7h (talk) 21:02, 30 August 2016 (UTC)

See the source of the page: <meta itemprop="channelId" content="UCb9ThkmYIUXyHb5pJiTyn1w">. Good idea for a bot? Sjoerd de Bruin (talk) 21:08, 30 August 2016 (UTC)
Simple property usage instructions might also help. ChristianKl (talk) 11:28, 31 August 2016 (UTC)
Or, open one of their videos, and click back to the channel with the link under the video. – Máté (talk) 04:49, 31 August 2016 (UTC)
You can just add the full link. Every twelve hours those will be cleaned by a script I wrote. Mbch331 (talk) 16:27, 31 August 2016 (UTC)
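
As a rough illustration of the "see the source of the page" suggestion above, the following Python sketch pulls the `channelId` value out of a piece of HTML using only the standard library. It works on an HTML string rather than fetching anything; `ChannelIdParser` and `extract_channel_id` are hypothetical helper names, and it assumes the page still exposes the ID in a `<meta itemprop="channelId">` tag as quoted in this thread.

```python
from html.parser import HTMLParser


class ChannelIdParser(HTMLParser):
    """Remember the content of the first <meta itemprop="channelId"> tag."""

    def __init__(self):
        super().__init__()
        self.channel_id = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (
            tag == "meta"
            and attributes.get("itemprop") == "channelId"
            and self.channel_id is None
        ):
            self.channel_id = attributes.get("content")


def extract_channel_id(html):
    """Return the channel ID found in the HTML, or None if there is none."""
    parser = ChannelIdParser()
    parser.feed(html)
    return parser.channel_id


# Snippet modelled on the tag quoted in the discussion above.
sample = (
    "<html><head>"
    '<meta itemprop="channelId" content="UCb9ThkmYIUXyHb5pJiTyn1w">'
    "</head></html>"
)
print(extract_channel_id(sample))  # prints UCb9ThkmYIUXyHb5pJiTyn1w
```

A bot along these lines would still need to download the channel page itself and respect the site's terms; that part is deliberately left out here.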

enabling Wikidata data access in user language

Hello all,

As previously announced, the access to Wikidata data in user language has been deployed yesterday (see the ticket). If you have any question or problem, please let us know.

Thanks hoo_man for this new feature! Lea Lacroix (WMDE) (talk) 07:27, 31 August 2016 (UTC)

Tools to create items, need guidance for the Wikisources

I have generally created root level English Wikisource items manually, as they are one at a time, and each property/item pair need to be individually recorded.

Now I am trying to create items for the subpages of a biographical work Eminent English liberals in and out of Parliament (Q26722460) using PetScan. I can create a list, however when I try to get PetScan to create the items it just says it starts and then nothing progresses. I don't see that I can add contextual links to respective pages, it just seems to be addition of common statements.

So I went to look at QuickStatements, and that is a different beast again, and I can see that I can individualise the statements (with some effort) though there is no indication of how to add badges. There is no apparent ability to share PetScan lists with QuickStatements

Now I am probably an absolute nonce, however, the tools do not seem well-suited to readily get Wikisource pages into Wikidata. Is there a better tool to be using? Are there better ways to research the codes required for these tools? Who is able to give guidance to assist the Wikisource community to get sensible data in place systematically and with depth rather than having to bash around a slow manual approach? Thanks.  — billinghurst sDrewth 14:08, 31 August 2016 (UTC)

Merge Q2344417 into Q17486063

Hello, could someone merge Stephanus le Moine (Q2344417) into Étienne Le Moine (Q17486063), please? They are the same person. Thanks. 128.127.107.211 15:28, 31 August 2016 (UTC)

It's easy to merge them yourself; see https://www.wikidata.org/wiki/Help:Merge. ChristianKl (talk) 15:31, 31 August 2016 (UTC)