Wikidata:Project chat/Archive/2019/11

How to handle marriage endings in Wikidata

We may have discussed it before, but for example, look at Vincent Price (Q219640). As a reader I want to know how each marriage ended: was it a divorce, an annulment, the death of the spouse, or, in the last marriage, the death of the subject? We struggled with this at the English Wikipedia, going back and forth between competing methods, followed by purges of the data, and then a return to ad hoc additions with various methods once things cooled down. --RAN (talk) 23:03, 1 November 2019 (UTC)

You can add the qualifier end cause (P1534) to spouses with values such as death (Q4) (for the subject), death of spouse (Q24037741), divorce (Q93190), annulment (Q701040), etc. Looks like user:Jura1 took care of it already in the case of Mr. Price. -Animalparty (talk) 03:51, 2 November 2019 (UTC)
Thanks! RAN (talk) 04:25, 2 November 2019 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. SCIdude (talk) 15:11, 5 November 2019 (UTC)
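For readers who want to pull this information out of an item programmatically, here is a minimal sketch of reading end cause (P1534) qualifiers from spouse (P26) statements. It assumes the usual Wikibase JSON layout (`claims`, `mainsnak`, `qualifiers`); the sample entity and the `Q_SPOUSE_…` IDs are made-up placeholders, and statements whose spouse is a special value (no `datavalue`) are not handled.

```python
from typing import Optional

SPOUSE = "P26"       # spouse
END_CAUSE = "P1534"  # end cause


def item_snak(item_id: str) -> dict:
    """Build a snak-like dict holding an item value (helper for the sample data)."""
    return {"snaktype": "value", "datavalue": {"value": {"id": item_id}}}


def marriage_endings(entity: dict) -> list[tuple[str, Optional[str]]]:
    """Return (spouse ID, end-cause ID or None) for each spouse statement."""
    results = []
    for statement in entity.get("claims", {}).get(SPOUSE, []):
        spouse_id = statement["mainsnak"]["datavalue"]["value"]["id"]
        causes = statement.get("qualifiers", {}).get(END_CAUSE, [])
        cause_id = causes[0]["datavalue"]["value"]["id"] if causes else None
        results.append((spouse_id, cause_id))
    return results


# Hypothetical entity: the spouse IDs are placeholders, not real items.
sample_entity = {
    "claims": {
        "P26": [
            {"mainsnak": item_snak("Q_SPOUSE_1"),
             "qualifiers": {"P1534": [item_snak("Q93190")]}},  # divorce
            {"mainsnak": item_snak("Q_SPOUSE_2"),
             "qualifiers": {"P1534": [item_snak("Q4")]}},      # death (of the subject)
            {"mainsnak": item_snak("Q_SPOUSE_3")},             # no end cause recorded
        ]
    }
}

for spouse, cause in marriage_endings(sample_entity):
    print(spouse, cause)
```

A `None` in the second position marks a marriage whose ending has not been recorded yet, which is exactly the gap the qualifier is meant to fill.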

Projectwide taxon item confusion

Lately I've been going through the tedious, yet mostly satisfying work of adding species kept (P1990) to zoos, like Edinburgh Zoo (Q1284778) for example. However, there's a problem I've noticed: there is a shocking number of (seemingly?) redundant taxon items, often related to scientific synonyms, like Zoothera dohertyi (Q21129670) and Geokichla dohertyi (Q903833), which are both about the same species of bird. Often only one of these items receives all the wikilinks, and they differ in identifiers, even on the same websites. Is it safe to merge these when I come across them, or is there something I'm not getting? --AmaryllisGardener talk 04:22, 1 November 2019 (UTC)

  • If I follow the specialists correctly, the items are really about taxon names. Each name has a separate history/identifiers etc. It's an oddity of these that they group sitelinks on one of the items for the same species. --- Jura 06:46, 1 November 2019 (UTC)
Don't conflations have a solution in Wikipedia article covering multiple topics (Q21484471)? --SCIdude (talk) 07:29, 1 November 2019 (UTC)
Not sure if it is even correct to have species kept (P1990): Geokichla dohertyi (Q903833), because a taxon is not a species, it's just one (among possible others) classifier for a species. But I am not an expert in this. Steak (talk) 11:45, 1 November 2019 (UTC)
species kept (P1990)'s doc specifies that the allowed values are "any species or other taxons" --AmaryllisGardener talk 18:29, 1 November 2019 (UTC)

Adapting Gravemap for WiRs (javascript skills needed)

Hi all. If someone has some JavaScript skills and a spare hour, would they be interested in having a go at adapting the source code of the gravemap UI to show the results of this Wikidata query? The default map unfortunately has lots of items that overlap in location and so hide one another. I've asked the original dev (user:Yarl) but they're swamped at the moment, so I thought I'd open up the call to the community just in case. T.Shafee(evo&evo) (talk) 10:58, 1 November 2019 (UTC)

Can anyone help with "Potential Issues" when creating a new statement?

I'm hoping someone can help with "potential issues" when "creating a new statement." I am a musician and when I enter my Spotify ID, the "potential issues" icon comes up upon publishing. Does this mean A) My data won't be read when searched in Google? It seems everything I enter has a potential issue, and none of the potential issues are true: e.g., when I enter my Spotify artist ID, it says one of the issues is that I am not a human or a musical ensemble, yet I know both of those to be true of me. Thanks!

Wikidata is not a website that exists for publishing information about yourself. ChristianKl❫ 16:00, 1 November 2019 (UTC)
I'm sorry, but that's bunk. Which page disallows it? If somebody meets Wikidata:Notability, by any measure, they are "data", and can be in Wikidata. It doesn't matter if you, I, a robot, or the artist themselves add the data, so long as Wikidata:Living people or m:Terms of use/Paid contributions amendment aren't violated. Wikidata:Autobiography, while not an "official" guideline, expressly permits it. And use common sense: beyond establishing mere existence, there is little room for self-promotion on Wikidata, at least nowhere near the level possible on prose-based projects like Wikipedia. Besides Assume good faith, perhaps Wikidata needs a policy of don't bite newcomers. -Animalparty (talk) 21:26, 1 November 2019 (UTC)

Instance of an animal/plant/organism

How do you represent a specific animal? See, e.g. Binky (Q4914416). There is an error because polar bear (Q33609) isn't a subclass of anything, but according to Wikidata:WikiProject_Taxonomy taxa should (with rare exceptions) not be subclasses of anything; they should have parent taxon (P171) instead. Tamme-Lauri oak (Q3736402) doesn't have the error because Quercus robur (Q165145) has subclass of (P279) = Quercus (Q12004) but I think this is not how it's supposed to work on the Quercus robur (Q165145) item. Calliopejen1 (talk) 23:35, 31 October 2019 (UTC)

There is the additional complication here of polyphyletic groups of organisms known under a single common name-- e.g. algae (Q37868). That group may be a subclass of some higher-order group (for algae (Q37868), eukaryote (Q19088)), without having a parent taxon. But then you get an error because the higher-order group (itself a taxon) is not a subclass of anything. (I tried to add one at eukaryote (Q19088) before I knew the rules and got reverted[1].) Calliopejen1 (talk) 23:37, 31 October 2019 (UTC)
Others also have instance of individual animal (Q26401003), but there is still a constraint violation on the first statement. This should be changed, but I'm not sure if the best solution is to change the constraint, or create a new property. Peter James (talk) 12:47, 1 November 2019 (UTC)
This problem with subclasses also happens for things like fawn (Q29838967). Calliopejen1 (talk) 19:38, 1 November 2019 (UTC)
I think an approach like with animal breed (P4743) would work better. --- Jura 08:30, 2 November 2019 (UTC)

One or more properties?

Opensofias
Tobias1984
Micru
Arthur Rubin
Cuvwb
TomT0m
Tylas
Physikerwelt
Lymantria
Bigbossfarin
Infovarius
Helder
PhilMINT
Malore
Nomen ad hoc
Lore.mazza51
Notified participants of WikiProject Mathematics

There is already a ProofWiki ID property that links to articles in all namespaces of the proofwiki wiki. Is this the right approach or is better to create multiple properties, like "ProofWiki proof ID", "ProofWiki mathematician ID", "ProofWiki definition ID", etc? The second approach would help to restrict the domain of the single properties to "proof", "mathematician", "definition", etc instead of "mathematical concept".--Lore.mazza51 (talk) 22:06, 1 November 2019 (UTC)

Why? The WD items themselves should be classified the way you say via P31, and that's all that's needed IMO. --SCIdude (talk) 08:25, 2 November 2019 (UTC)
@SCIdude: I'm referring to the properties of the property. Having multiple properties makes it easier to recognize mistakes in their use, because a property like "ProofWiki proof ID" can be used only for proofs while "ProofWiki ID" can be used for a mathematical concept, a mathematician, a book, etc.--Lore.mazza51 (talk) 15:03, 2 November 2019 (UTC)
Pragmatically I'd say let's wait and see how often this happens (i.e. I suspect overdesign). --SCIdude (talk) 15:07, 2 November 2019 (UTC)
  • For sites with different identifiers for different things (e.g. "123" is a work in one scheme, but a creator in another scheme), there need to be separate properties.
If it's just the same identifier that is used for different things, we generally don't try to split the IDs (e.g. "123" is a person, "124" a work).
There was some debate if wikipages should be considered external-ids at all. The initial consensus was no (use string-datatype), now it tends to a yes (use external-id-datatype).
We already have ProofWiki ID (P6781) with external-id for all types of pages, but that doesn't seem to be actively used. --- Jura 15:17, 2 November 2019 (UTC)

Burials in cemeteries

We currently list where people are buried in their entry. Is there any interest in the reciprocal property, listing the people buried in the entry for the cemetery? It could easily be done with a bot, and we could even gray out the data so it can only be edited from the person's entry. We always assume people are going to be using SPARQL to query Wikidata. I think most people will come to it from a Google search, especially for entries not in Wikipedia. --RAN (talk) 23:50, 2 November 2019 (UTC)

  • That is going to hang an awful lot of data on any large cemetery. Imagine what it would be like for Père Lachaise Cemetery (Q311). - Jmabel (talk) 03:08, 3 November 2019 (UTC)
  • You can activate the gadget "relateditems" for this purpose. It displays inverse statements, e.g. on an item for a cemetery it lists people buried there. --Pasleim (talk) 03:46, 3 November 2019 (UTC)
  • I don't think this would be a good idea. Not everything has to be reciprocal, and this sort of approach would make the parent items very unwieldy. Andrew Gray (talk) 11:24, 3 November 2019 (UTC)

Autocompletion for properties in search box?

Is there a way to get autocompletion for properties in the search box at the top right-hand corner on this wiki?

--Gittenburg (talk) 09:10, 3 November 2019 (UTC)

I don't think so. I usually search with "P:..." and then search the property on the next page. ChristianKl❫ 12:07, 3 November 2019 (UTC)

URLs for Library of Congress Control Number (LCCN) (bibliographic)

I noticed that the hot links generated for instances of Property:P1144 don't work. It appears that the link should not be https://lccn.loc.gov/$1 but rather https://lccn.loc.gov/item/$1. As a newcomer I'm not confident to change P1144 on my own initiative, especially as this would put it out of step with the documentation at https://www.wikidata.org/w/index.php?title=MediaWiki:Gadget-AuthorityControl.js&oldid=179329592. I'm hoping somebody with more experience will pick this up. Apologies if this is not the correct forum to raise this.--Keith Edkins (talk) 14:16, 3 November 2019 (UTC)

@Keith Edkins: For Treasure Island (Q14944010) the current formatter gives https://lccn.loc.gov/11025047 while your proposal would lead to https://lccn.loc.gov/item/11025047. The former works while the latter does not. So, on which item does the current formatter not work? This might be an indication of something being wrong with that item. Toni 001 (talk) 14:58, 3 November 2019 (UTC)
@Toni 001: OK the problem seems to have been fixed at the other end. The examples I was looking at are working fine now. Issue closed.--Keith Edkins (talk) 15:20, 3 November 2019 (UTC)
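For anyone comparing the two patterns, this is a minimal sketch of how a formatter URL turns an identifier into a link: the `$1` placeholder is replaced by the stored value. The function name is made up for illustration, and the sketch does plain substitution only, without any URL encoding.

```python
def format_identifier_url(formatter: str, value: str) -> str:
    """Substitute an identifier value for the $1 placeholder in a formatter URL."""
    if "$1" not in formatter:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter.replace("$1", value)


# The two patterns compared above, applied to the LCCN quoted for Treasure Island.
current = format_identifier_url("https://lccn.loc.gov/$1", "11025047")
proposed = format_identifier_url("https://lccn.loc.gov/item/$1", "11025047")

print(current)
print(proposed)
```

Pasting both printed URLs into a browser is a quick way to check which pattern resolves before touching the property.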

Quick Statements for Commons

What can I use QuickStatements for on Wikimedia Commons, and where is the syntax for that described? I want to add captions, and it would be great if that works with QuickStatements. Or is there another tool I can use for it? -- Hogü-456 (talk) 15:46, 3 November 2019 (UTC)

Excuse me

Can someone add

because Template:Conversion-zh needs to use them.--Sunny00217 (talk) 04:04, 2 November 2019 (UTC)

@Sunny00217: ✓ Done. Multichill (talk) 22:24, 3 November 2019 (UTC)

Fixing "unknown"

I ran across a couple of items yesterday which had "unknown" (unknown (Q24238356)) as the specified value rather than the unknown value special value. It looks like this is reasonably common for two specific properties - copyright status (P6216) and use restriction status (P7261), which have about 145k uses, and ~500 on all other properties.

P6216 and P7261 seem to be intentionally permitted values for those properties, which makes sense, but for everything else, should we just go ahead and migrate these to unknown value? There are about 500 uses in total, mostly on collection (P195), location (P276), and cause of death (P509). Andrew Gray (talk) 11:20, 3 November 2019 (UTC)

There are a few items such as Category:Unidentified serial killers (Q7031743) that seem to use the item correctly. Most of the ~500 would however work better with unknown value.
@Jarekt, Hannolans: it seems you argued in the past for using unknown (Q24238356) with copyright status (P6216). Can you explain why it might be better than unknown value? ChristianKl❫ 12:06, 3 November 2019 (UTC)
Not sure, we have 'unknown' and 'anonymous' in external dataset imports. Also for example in In Copyright - Rights-holder(s) Unlocatable or Unidentifiable (Q47530802). unknown is also used in Wikimedia Commons in the template https://commons.wikimedia.org/wiki/Template:Unknown to be used when the author is 'unknown'. Unknown is also related to orphan works. Probably we should also check which direction Commons takes?--Hannolans (talk) 12:51, 3 November 2019 (UTC)
The Wikidata data model generally lets you tell whether two items have the same value by checking whether they link to the same item. If I ask a question like "What's the last book by the author who published the most books at publisher X?", Wikidata should be able to answer it to the best of its knowledge. When we start to model all books whose author we don't know as if they were written by the same author, anonymous (Q4233718), a question like that gets answered wrongly. Translating the Commons template this way, instead of translating it as unknown value, would bring us many errors like that.
As far as In Copyright - Rights-holder(s) Unlocatable or Unidentifiable (Q47530802) goes, the item has additional meaning in that an investigation was made. ChristianKl❫ 13:28, 3 November 2019 (UTC)
I almost never use the unknown value special value and forgot about its existence when setting one-of constraint (Q21510859) constraints for copyright status (P6216). I am fine with using the unknown value special value instead of unknown (Q24238356). --Jarekt (talk) 03:03, 4 November 2019 (UTC)
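To make the grouping problem concrete, here is a rough analogy in code, not how Wikidata stores anything internally. A shared item behaves like one shared key, so unrelated books collapse onto a single "author", while the unknown value special value is modelled here as a fresh object per statement that is never equal to another one. The book names, the `UnknownValue` class and the `Q_AUTHOR_1` ID are all invented for the illustration.

```python
from collections import Counter

UNKNOWN_ITEM = "Q24238356"  # the item "unknown" from the discussion above


class UnknownValue:
    """Stand-in for the "unknown value" special value.

    Instances compare by identity, so two of them are never equal.
    """


def shared_authors(author_by_book: dict) -> dict:
    """Return each author value that is attached to more than one book."""
    counts = Counter(author_by_book.values())
    return {author: n for author, n in counts.items() if n > 1}


# Hypothetical books; "Q_AUTHOR_1" is a placeholder, not a real item ID.
with_item = {
    "book A": UNKNOWN_ITEM,
    "book B": UNKNOWN_ITEM,
    "book C": "Q_AUTHOR_1",
}
with_special_value = {
    "book A": UnknownValue(),
    "book B": UnknownValue(),
    "book C": "Q_AUTHOR_1",
}

print(shared_authors(with_item))           # the two unknown authors are merged
print(shared_authors(with_special_value))  # nothing is merged
```

With the item, "unknown" looks like an author of two books; with the special value, no such false link appears.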

Maps and graphs broken on Wikidata:Wikidata in Wikimedia projects

Hi all

I notice the maps and graphs on Wikidata:Wikidata in Wikimedia projects#Maps_and_graphs are broken. It looks like some backend changes have broken them; there don't seem to have been any changes to the page itself. Does anyone know how to fix it? I don't have the technical knowledge to fix it myself.

Thanks

--John Cummings (talk) 20:35, 1 November 2019 (UTC)

@John Cummings: Same issue as [2]? Ayack (talk) 14:05, 4 November 2019 (UTC)

Why do we have two lexeme for Run ?

See Lexeme:L162400 and Lexeme:L279 --Eatcha (talk) 13:35, 2 November 2019 (UTC)

Do you delete them, or merge them? Although there is nothing to be merged here afaik. Thanks --Eatcha (talk) 16:18, 2 November 2019 (UTC)
Merged! ArthurPSmith (talk) 18:27, 4 November 2019 (UTC)

Best practices for documenting a person's immigration/emigration?

What is the best way to describe a person emigrating from their birth country and immigrating to a new country (i.e. more or less permanently)? The dates, source country, and target country could be modeled. Are there devoted properties, or would it be something like significant event (P793) -> immigration (Q131288), with qualifiers such as date and "from"/"to" nations? -Animalparty (talk) 22:34, 3 November 2019 (UTC)

residence (P551) with start and end dates if known. ChristianKl❫ 22:38, 3 November 2019 (UTC)
That would work, but residence (P551) seems more appropriate for city, mansion, notable building, etc. Is it advisable to have redundancy in, say, residence (P551) = United States of America (Q30), residence (P551) = New York (Q1384), residence (P551) = New York City (Q60)? -Animalparty (talk) 22:49, 3 November 2019 (UTC)

No, just give the most specific statement(s). Any software using WD should be able to deduce USA/New York from New York City. SCIdude (talk) 07:40, 4 November 2019 (UTC)
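As a minimal sketch of that deduction: if a consumer knows which place is contained in which, it can walk upward from the most specific residence and recover the broader ones, so storing them separately adds nothing. The `CONTAINED_IN` map below is hand-written for this example and is an assumption about how a consumer might hold the hierarchy; it is not fetched from Wikidata.

```python
# Hypothetical containment map: place -> the place that contains it.
CONTAINED_IN = {
    "Q60": "Q1384",  # New York City -> New York
    "Q1384": "Q30",  # New York -> United States of America
}


def broader_places(place: str, contained_in: dict) -> list:
    """Walk up the containment chain and return every broader place, nearest first."""
    chain = []
    seen = {place}
    current = contained_in.get(place)
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = contained_in.get(current)
    return chain


def most_specific(places: list, contained_in: dict) -> list:
    """Drop every place that is already implied by a more specific one in the list."""
    implied = set()
    for place in places:
        implied.update(broader_places(place, contained_in))
    return [place for place in places if place not in implied]


print(broader_places("Q60", CONTAINED_IN))
print(most_specific(["Q30", "Q1384", "Q60"], CONTAINED_IN))
```

The second call shows the redundancy asked about: of the three residence values, only New York City (Q60) needs to be stated.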

New template for SPARQL query pages

I’ve put together a new template, {{Query page}}, which allows you to store a query on a dedicated wiki page and transclude it elsewhere in a variety of styles. This is convenient because when you need to update the query (e. g. due to data model changes, item merges, changes to the query service software), you only need to do it in one place and then everything that transcludes the query will be updated automatically.

For example, since I ported User:TweetsFactsAndQueries/Queries/editorial cartoons to the new template, you can use it like this:

{{User:TweetsFactsAndQueries/Queries/editorial cartoons|style=SPARQL}}
SELECT ?cartoon ?cartoonLabel (SAMPLE(?image_) AS ?image) (MAX(?date_) AS ?date) WHERE {
  ?cartoon wdt:P31/wdt:P279* wd:Q2916094.
  OPTIONAL { ?cartoon wdt:P18 ?image_. }
  OPTIONAL { ?cartoon wdt:P571|wdt:P577 ?date_. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?cartoon ?cartoonLabel
ORDER BY ?date
Try it!
{{User:TweetsFactsAndQueries/Queries/editorial cartoons|style=link embed with source}}
editorial cartoons (source)
… as you can see in [{{User:TweetsFactsAndQueries/Queries/editorial cartoons|style=url}} this query], …
… as you can see in this query, …

You can see the full list of available styles on the template documentation page; new styles can be defined as subpages of Template:query page/style/. Wikidata:Events/Wikidata Zurich Training2019/Showcase queries has some more examples of the “link embed with source” style (diff).

I hope this is useful to some of you! I plan to migrate all of my queries at User:TweetsFactsAndQueries/Queries to this soon, it’ll just take a while. --TweetsFactsAndQueries (talk) 00:30, 4 November 2019 (UTC)

Heh, and as if to demonstrate the motivation for this template, that query was actually broken (by phabricator:T235540) and had to be fixed :) --TweetsFactsAndQueries (talk) 00:50, 4 November 2019 (UTC)
  • How about making it use the dedicated namespace? --- Jura 07:53, 4 November 2019 (UTC)
  • @TweetsFactsAndQueries: While this is nice, longer-term IMO it would still also be good to be able to have more of the functionality of Quarry for WDQS (phab:T104762 "Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry)") -- a service that would remember one's queries, allow one to label them, and also to share them, so that one could more easily retrieve and revisit one's own old queries, without one having to save a url or copy a query to a wiki page for each one. The ticket could use some love, to show this is still something the community would value. As User:Legoktm implies in this comment [3], such a service, that would mint its own URLs, might well be preferable to a URL shortener that changes URL every time the query is modified, and currently refuses to shorten queries over 2000 characters. Jheald (talk) 08:15, 4 November 2019 (UTC)
@Jheald: that’s a separate need, though, isn’t it? phabricator:T104762 is about freezing a query and its results at a certain point in time, whereas {{Query page}} facilitates making both variable, so that you can update the query later when necessary.
I suppose once Listeria supports {{query page}}, you can build an approximation of Sparqly by using {{Wikidata list}} on a query page’s talk page and sharing permalinks to revisions of that talk page. --TweetsFactsAndQueries (talk) 10:39, 4 November 2019 (UTC)
@TweetsFactsAndQueries: The top of the ticket talks about "a web service where people can make SQL queries and share these queries and the result". It seems to me that the making and sharing (and remembering) of queries is the primary ask of the ticket, presenting an archived snapshot of a set of frozen results is an additional further ask. In that respect, I think User:GZWDer ("Bugreporter") may have been wrong to close phab:T211130 as a duplicate -- really it's an additional sub-task.
Quarry also allows one to go back and edit and re-run an existing query, so a query can be updated under the same URL when necessary.
The point of the ticket, I think, is to have a service that takes care of all of the remembering and publication of a query itself, like Quarry, without the user having to jump through the hoops of saving the URL somewhere, or copying the query to a wikipage (with or without added Listeria monitoring). Jheald (talk) 10:49, 4 November 2019 (UTC)

For template queries, see also the collections in Category:Query template and Category:Partial query, which store parametric queries and templates that help build queries. author  TomT0m / talk page 12:12, 4 November 2019 (UTC)

Two links to the same project

Why can't I link twice to WMC? It used to work. --Juandev (talk) 11:20, 4 November 2019 (UTC)

When did that work? 622 017 074 (Hej!) 11:59, 4 November 2019 (UTC)

Wikidata weekly summary #389

Podcasts

Do we have any models for items on podcasts? Is there any guidance on how granular to go? I've been adding a bit of info on Speaking with Shadows (Q70345619) and its episodes and it's got me wondering about stuff like how to represent bonus episodes when using series ordinal (P1545). Richard Nevell (talk) 19:02, 4 November 2019 (UTC)

Wikidata:Map_data

Hi all

Simon Cobb, Nav Evans and I have written Wikidata:Map_data, instructions on uploading map data to Commons and using it on Wikidata and other Wikimedia projects. It's a little fiddly, to say the least. Let us know if anything is missing or unclear.

Thanks

--John Cummings (talk) 23:56, 4 November 2019 (UTC)

Google Code-In will soon take place again! Mentor tasks to help new contributors!

Hi everybody! Google Code-in (GCI) will soon take place again - a seven week long contest for 13-17 year old students to contribute to free software projects. Tasks should take an experienced contributor about two or three hours and can be of the categories Code, Documentation/Training, Outreach/Research, Quality Assurance, and User Interface/Design. Do you have any Lua, template, gadget/script or similar task that would benefit your wiki? Or maybe some of your tools need better documentation? If so, and you can imagine enjoying mentoring such a task to help a new contributor, please check out mw:Google Code-in/2019 and become a mentor. If you have any questions, feel free to ask at our talk page. Many thanks in advance! --Martin Urbanec 07:28, 5 November 2019 (UTC)

I am a patroller on Commons, but not on WD

Then why am I asked to mark changes as patrolled? Thanks -- Eatcha (talk) 17:18, 5 November 2019 (UTC)

@Eatcha: We don’t have a distinct patroller group here – every autoconfirmed user has the patrol right. --Lucas Werkmeister (talk) 18:25, 5 November 2019 (UTC)
Thanks --Eatcha (talk) 18:33, 5 November 2019 (UTC)

Dirk Landau ( Martin Rutsch)

I’ve done an image search and the photo of this person is being used on an online dating site. Thought somebody should be made aware of this. Not sure how this site works.  – The preceding unsigned comment was added by 1.128.109.3 (talk • contribs) at 08:57, 11 November 2019 (UTC).

This doesn't seem to be a Wikidata issue. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:32, 11 November 2019 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:32, 11 November 2019 (UTC)

Quantities: Instances and Classes

In recent weeks I've been improving units in Wikidata, which is still a large, ongoing and probably never-ending project. Alignment with the Wolfram Language (WL) via the property Wolfram Language unit code (P7007) was very helpful; it allows, for instance, comparison of conversion factors. Now that we have Wolfram Language quantity ID (P7431), similar large-scale improvements will be possible. However, one issue in Wikidata is the classification of (physical) quantities: What are the individuals and what are the classes? I estimate that Wikidata has items for about 1000 physical quantities, but currently there is no single query that finds all of them, and nothing more.

I'd like to propose following the scheme outlined in Defining 'kind of quantity' (Q71548419). It is consistent with the treatment of quantities in the relevant standards, namely International Vocabulary of Metrology (3rd edition, 2012) (Q70257574) and ISO 80000 (Q568496).

Proposal (using examples; for precise terminology, refer to the text):

  • 5 kg, 3 apples, 5 rad, ... are individual quantities
  • length, area, radius, apple count, ... are classes of individual quantities
  • radius is a "subclass of" length; length is a "subclass of" physical quantity
  • 5 kg is an "instance of" mass; 5 kg is also an "instance of" a physical quantity

Notes on Defining 'kind of quantity' (Q71548419): Page 5, figure 2 illustrates this idea very nicely. The box called M0 are what Wikidata calls "instances". The box called M1 contains classes, solid arrows are "subclass of" relations in Wikidata.
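To illustrate the proposal outside of Wikidata, here is a small sketch with plain dictionaries standing in for "subclass of" and "instance of" statements. The names and the single-parent simplification are assumptions made for the example; real items can have several superclasses.

```python
# Toy model of the proposal: classes linked by "subclass of",
# individual quantities linked to a class by "instance of".
SUBCLASS_OF = {
    "radius": "length",
    "length": "physical quantity",
    "mass": "physical quantity",
}
INSTANCE_OF = {
    "5 kg": "mass",
}


def superclasses(cls: str, subclass_of: dict) -> list:
    """Return cls followed by all of its superclasses, nearest first."""
    chain = [cls]
    while chain[-1] in subclass_of and subclass_of[chain[-1]] not in chain:
        chain.append(subclass_of[chain[-1]])
    return chain


def is_instance_of(individual: str, cls: str, instance_of: dict, subclass_of: dict) -> bool:
    """True if the individual's direct class is cls or one of its subclasses."""
    direct = instance_of.get(individual)
    return direct is not None and cls in superclasses(direct, subclass_of)


print(superclasses("radius", SUBCLASS_OF))
print(is_instance_of("5 kg", "physical quantity", INSTANCE_OF, SUBCLASS_OF))
print(is_instance_of("radius", "physical quantity", INSTANCE_OF, SUBCLASS_OF))
```

The last call is the point of the scheme: "radius" is a class below physical quantity, not an instance of it, so only individuals such as "5 kg" pass the instance test.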

If we consistently follow that scheme then query [1] will contain exactly those elements that we have in mind when talking about "physical quantity".

I'm writing this comment to raise awareness of the tricky issue "instance of" vs "subclass of" in the context of quantities, to solicit feedback, support or criticism, and potentially pointers to previous discussion that I have missed.

[1] Query physical quantities (the classes like "length", "area", not the individuals like 5 m, ...):

select distinct ?pq ?pqLabel where {
  ?pq wdt:P279* wd:Q107715 . # physical quantity
  service wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}

Toni 001 (talk) 14:10, 21 October 2019 (UTC)

  • @Toni 001: Thanks - this is clear enough, and I Support your approach. This use of instance of (P31) when it more logically should be subclass of (P279) is a very general problem within Wikidata's class hierarchy, and we've been rationalizing it a bit with arguments about metaclasses, but I think just converting more of those abstract relations to subclass relations would work in most cases (and I agree it makes sense in this one). ArthurPSmith (talk) 17:17, 21 October 2019 (UTC)
  • Support Ghouston (talk) 00:29, 22 October 2019 (UTC)
  • Comment: I think the general question is which ones are instances of units and which ones are classes of units. Anything with a SI conversion should be an instance. --- Jura 13:55, 23 October 2019 (UTC)
@Jura1: This suggestion is about quantities, not units. The relation is that a quantity (that is, an instance of physical quantity) is a composite object, the two components are: number and unit. Toni 001 (talk) 15:38, 23 October 2019 (UTC)
Yeah, I noticed, but I'm not really sure what items you have in mind (I don't think there are too many like 45° angle (Q42315784)). Maybe 100 yottametres (Q3597943), but that is an order of magnitude. --- Jura 15:54, 23 October 2019 (UTC)
I see. Here is what I found:
  1. There are currently about 1700 subclasses of physical quantity (Q107715). This is already pretty good (compared to a week ago): I guess less than half of them don't belong there, and some (not sure yet, could be hundreds) are missing.
  2. There are currently about 11000 instances of physical quantity (Q107715) (or subclasses thereof). This list is what should contain things like 5m, 45°, but contains a lot of things that should not be there. This will be a bigger project to clean that up, and might trigger some separate discussions. For instance, the list contains the year 1184. Is that really an individual quantity? This item states "instance of year", and year states "subclass of orbital period". Both statements don't seem wrong when looked at individually, but the issue is that "year" has two different meanings: A point (well, somewhat spread) on some time scale / calendar, and a duration. The latter is sometimes called "annum" to distinguish it from the former.

To summarize, I first want to concentrate on point 1, getting quantity classes cleaned up, and in the process tighten some property constraints (for instance, measured physical quantity (P111) currently allows both subclasses and instances of physical quantity as values; it should only allow subclasses). Toni 001 (talk) 19:34, 23 October 2019 (UTC)

Well, top-down queries tend to get surprising results if they don't time out. It's generally easier to fix things bottom-up. --- Jura 08:54, 24 October 2019 (UTC)
@Jura1: Yes. The top-down approach is important for understanding the ontology. Here is an example of a query that will help improve quantities bottom-up: all the quantities described by ISO 80000 (Q568496). I guess that list contains less than 10% so far, so I'm going through the standard to add values for described by source (P1343):
select ?quantity ?quantityLabel ?quantitySymbols ?wlQuantity ?source ?sourceLabel ?itemNumber where {
  ?quantity wdt:P279+ wd:Q71550118 .                    # quantity
  ?quantity p:P1343 [
    ps:P1343 ?source ;
    pq:P958 ?itemNumber ;
  ] .
  ?source wdt:P629 / wdt:P361 wd:Q568496 .              # any edition of any part of ISO 80000
  optional { ?quantity wdt:P7431 ?wlQuantity . }
  service wikibase:label { bd:serviceParam wikibase:language "en" . }
  optional {
    select ?quantity (group_concat(?quantitySymbol; separator = ", ") as ?quantitySymbols) where {
      ?quantity wdt:P416 ?quantitySymbol .
    } group by ?quantity
  }
} order by ?itemNumber
Try it! Toni 001 (talk) 22:58, 3 November 2019 (UTC)
Oppose There's some leak in this scheme, and I can imagine where. Compare density (Q29539) and intensive quantity (Q3387041). In your scheme they are both subclasses of physical quantity (Q107715) which is illogical. From my point of view physical quantity (Q107715) should be M2 (metaclass), intensive quantity (Q3387041) and similar would be its subclasses (and thus also M2), and items like density (Q29539) (which has definite units of measure) are P31 of any previous (and thus M1). I don't care about M0 (which you call "individual quantities" but I'd rather call them "individual measurements") now as they seem to be rare in WD, but I suspect that there should be another relation (not P31) between them and M1-quantities. --Infovarius (talk) 22:58, 27 October 2019 (UTC)
@Infovarius:
  • Number-unit-combinations like 7 g/m^3 are called individual or particular quantity in different standards. There is a class for all of them, individual quantity (Q71550118). This class contains various subclasses, formed by applying objective or conventional restrictions, which are described by adjectives or qualifiers like "physical", "chemical", "intensive", "extensive", "areal", "molar", "base", "computational", ... . Those classes can overlap, as their restrictions are concerned with different aspects of quantities. Those classes then contain further subclasses which refer to a particular "property of a phenomenon, body, or substance" (definition of "quantity" in VIM3), for instance density, length, radius, wavelength and so on. That is a "flat", but consistent model, far from illogical.
  • quantity and measurement are not the same.
  • One question might be how to list (that is, query) things like length, area, ... but excluding physical quantity, base quantity, ... . But this is solved already: Defining 'kind of quantity' (Q71548419) defines the concept of general quantity (Q71758646) of which physical quantity (Q107715), ... are instances (yes, instances, not subclasses). Then the query simply has to look for subclasses of individual quantity (Q71550118) excluding instances of general quantity (Q71758646).
Toni 001 (talk) 05:26, 31 October 2019 (UTC)
@Ain92: The distinguishing feature between density (Q29539) and intensive quantity (Q3387041) is that the latter is an instance of general quantity (Q71758646) (a concept explained in Defining 'kind of quantity' (Q71548419)), while the former is not. Toni 001 (talk) 05:46, 31 October 2019 (UTC)
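The query described in the last bullet above could be sketched roughly as follows. This is an untested assumption about how to phrase it, not a query taken from the discussion: it builds the SPARQL text in a small helper (the function name is made up) using `filter not exists` to drop instances of general quantity (Q71758646), and it has not been run against the live data.

```python
import re


def quantity_kinds_query(root: str = "Q71550118", excluded_class: str = "Q71758646") -> str:
    """Build a SPARQL query for subclasses of `root` that are not instances of `excluded_class`."""
    for item_id in (root, excluded_class):
        if not re.fullmatch(r"Q[1-9]\d*", item_id):
            raise ValueError(f"not an item ID: {item_id!r}")
    return f"""select distinct ?quantity ?quantityLabel where {{
  ?quantity wdt:P279+ wd:{root} .                                # subclass of individual quantity
  filter not exists {{ ?quantity wdt:P31 wd:{excluded_class} . }}  # but not a general quantity
  service wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }}
}}"""


print(quantity_kinds_query())
```

The printed text can be pasted into the query service; under the proposed scheme, classes such as physical quantity and intensive quantity would be filtered out while length, area and the like remain.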


Constraint for reference

I want to add a constraint to Elo rating (P1087) that a reference using retrieved (P813) and either stated in (P248) or reference URL (P854) is required at each statement. Is there a predefined constraint, or do I need a complex constraint? Steak (talk) 09:51, 5 November 2019 (UTC)

Thanks, but I am not sure if this is what I want. What does citation needed constraint (Q54554025) comprise? Does it include the constraint to add a date with retrieved (P813)? Note also that imported from Wikimedia project (P143) would as far as I can see also count as "reference", but actually they should be constraint violations. Steak (talk) 11:18, 5 November 2019 (UTC)
If you want a more limited set of properties in references, a complex constraint is needed. --- Jura 11:34, 5 November 2019 (UTC)
It is, at least at de:wp, totally normal that a reference has a retrieval date. And I don't see a reason why this should not be applied here also. Steak (talk) 15:16, 5 November 2019 (UTC)
If something is normal, then it doesn't need to go in a constraint for a specific property. If you want to add a constraint to Elo rating (P1087), you have to make a case for why it matters for the specific property in a way it doesn't matter for other properties.
Wikidata works on the assumption that a data user like dewiki that wants only data with retrieval dates can do the appropriate filtering itself and load only the data it wants. ChristianKl❫ 15:29, 5 November 2019 (UTC)
If someone wants to go through all statements of a property to ensure they are properly sourced, I don't see much of an issue to add such a constraint, preferably probably as a suggestion constraint.
The main problem with P1087 I realized only later seems to be that there is only one reference for it anyways (actually apparently there is another for earlier years). --- Jura 09:19, 6 November 2019 (UTC)
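
For anyone who wants to check statements outside the constraint system, the rule asked for at the top of this thread (a reference with retrieved (P813) plus either stated in (P248) or reference URL (P854)) can be sketched as below. This is a rough illustration, not an existing tool: it assumes each statement is a dictionary shaped like the Wikibase JSON export, with a "references" list whose entries carry a "snaks" mapping keyed by property id, and the function names are made up for this example.

```python
def reference_is_acceptable(reference):
    """True if the reference has P813 and at least one of P248 / P854."""
    snaks = reference.get("snaks", {})
    has_retrieved = "P813" in snaks
    has_source = "P248" in snaks or "P854" in snaks
    return has_retrieved and has_source


def statement_violates(statement):
    """True if no reference on the statement satisfies the rule."""
    references = statement.get("references", [])
    return not any(reference_is_acceptable(ref) for ref in references)


# Hypothetical sample statements; snak values are left empty for brevity.
well_sourced = {"references": [{"snaks": {"P248": [], "P813": []}}]}
imported_only = {"references": [{"snaks": {"P143": []}}]}
no_date = {"references": [{"snaks": {"P854": []}}]}

print(statement_violates(well_sourced))   # False
print(statement_violates(imported_only))  # True
print(statement_violates(no_date))        # True
```

A reference that only uses imported from Wikimedia project (P143) counts as a violation here, matching the concern raised above.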

The appearance /physical attributes/ of drugs

I am reposting the following text from Wikidata_talk:WikiProject_Medicine#The_appearance_/physical_attributes/_of_drugs, since I got no reaction there.

I am working (with the help of members of a small non-WMF wiki community) on open-access wiki articles about various drugs. We are planning to incorporate information about various medical substances that is available through Wikidata in the future. Through our project we have already donated numerous photos of different drugs and drug forms to Commons. For our users (medical students, nurses and doctors alike), information about the physical attributes of drugs is very useful, and we would like to add it to Wikidata and afterwards to our project. But I am not sure how to add this kind of information to Wikidata. There are two barriers I would need advice with:

  1. The physical appearance of a drug form, such as "white, round tablets", is specific to the manufacturer and the drug. Is it viable to set up new Wikidata items for a drug by a specific manufacturer, say "Metformin Teva 500mg", an item for a tablet containing 500mg of metformin produced by the Teva company?
  2. are there any properties on wikidata already in place that could be used for the physical appearance description?

Thank you for your reply in advance. --Wesalius (talk) 16:25, 4 November 2019 (UTC)

2. Yes, we have for example shape (P1419), color (P462) or mass (P2067).
1. We already have items on compound level (like sildenafil (Q191521)) and product level (like Viagra (Q29006643)). You are proposing an even deeper, more specific product level. As long as these items are properly sourced and connected with the "product level", it seems acceptable to me.--Jklamo (talk) 19:40, 4 November 2019 (UTC)
Does it suffice to stay on the product level and put multiple pictures there, distinguishing them by a qualifier value? --SCIdude (talk) 05:01, 5 November 2019 (UTC)
Thank you for your response Jklamo. I will discuss with the team if we actually need the deeper level, because as SCIdude suggests (if I understand it correctly), the different dosages of similar products (say Metformin Teva 500mg and Metformin Teva 850mg) might not need independent items, but be distinguished by a qualifier within an item for Metformin produced by Teva. --Wesalius (talk) 18:40, 5 November 2019 (UTC)
  • I think it's desirable to have Wikidata items for specific packaging like "Metformin Teva 500mg" when the new items are correctly linked to the existing items. The main reason that we currently don't have those items is that it's a decent chunk of work to create them.
I would prefer to have separate items for compound/product/packaging whereby the packaging subclasses the product. Independent items have the advantage that it's possible to be more specific. This would allow us to store information such as Global Trade Item Number (P3962) in Wikidata for the different packaging. That would allow someone to program an app that simply scans the barcode of a drug and then goes to the correct Wikidata item. ChristianKl❫ 10:01, 7 November 2019 (UTC)

Notified participants of WikiProject Medicine ChristianKl❫ 10:03, 7 November 2019 (UTC)

a thought for longer-term: works, editions, and searching

This is just a thought for consideration... Literary works are supposed to have one item for the work in general and an item for every edition. Now, if there are items for every edition, searching and auto-completion become quite unwieldy, you have a long list and it can be hard to figure out which is the item for the work. Most of the time it's the work you'll be looking for. This only affects a few works so far, but it'll be a big nuisance if the literary part of the data keeps growing. Has anyone yet given a thought to solutions? Not necessarily to be implemented now, but to be mulled over. Like, here's something I imagine, I don't know if it'd be workable: suppose in the list that comes up for searching and auto-completion, items that are instances of "literary work" or subtypes ("Poem," etc.) would come up just as usual, but there would be a single differently-colored entry for "edition, translation, or version;" and this item would not be an ordinary entry, but instead, clicking on it would take you to a sub-list of all editions (but if there's only one, it would automatically select that). Levana Taylor (talk) 02:50, 5 November 2019 (UTC)

@Levana Taylor: Better to have that discussion in WikiProject Books. We already have a similar discussion there about how to distinguish edition and work. Snipre (talk) 07:46, 5 November 2019 (UTC)
Yes, I figured this must have been discussed sometime. I will read and then repost. Thanks. Levana Taylor (talk) 15:00, 5 November 2019 (UTC)
It would be nice if searches allowed the user to restrict searches to works or editions. Libraries do this with their databases: having records clearly marked as "Author/Person" or "Work" or "Edition/Copy in the Library (Bibliographic)". I'm not sure how we could do that here, since the Wikidata collection of items includes many thousands of additional items that aren't books or editions or authors. --EncycloPetey (talk) 16:49, 5 November 2019 (UTC)
There is a way to do something like this but it's not very convenient in the UI - basically you have to add "haswbstatement:P31=Qxxxxx" in the search box, where Qxxxxx is the id of the class you want to restrict the search to. For example to find only items which are instances of "book" try something like "Paleontology haswbstatement:P31=Q571" in the search box. ArthurPSmith (talk) 19:39, 5 November 2019 (UTC)
@ArthurPSmith: There is a way to do something like this but it's not very convenient in the UI Unfortunately, that seems to be a common theme with Wikidata. --Trade (talk) 19:48, 5 November 2019 (UTC)
How is that unfortunate? It simply means that there are a lot of ways programming resources can be used to improve Wikidata and a lot of ways to grow. ChristianKl❫ 16:40, 6 November 2019 (UTC)
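
As a small convenience, the search-box trick ArthurPSmith describes above can be wrapped in a helper that assembles the query text. This is just a sketch: the function names are invented here, and the URL form (the ordinary index.php search parameter on www.wikidata.org) is an assumption about how one would open the search from a script.

```python
from urllib.parse import urlencode


def restricted_search(text, class_id, property_id="P31"):
    """Build search-box text limited to items with property_id = class_id."""
    return f"{text} haswbstatement:{property_id}={class_id}"


def search_url(query):
    """Turn the search-box text into a wiki search URL (assumed URL form)."""
    return "https://www.wikidata.org/w/index.php?" + urlencode({"search": query})


# The example from the comment above: instances of "book" (Q571).
query = restricted_search("Paleontology", "Q571")
print(query)
print(search_url(query))
```

The first line printed is exactly the string to paste into the search box; the second is the same query as a link.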

Data import pages/subpages

Some time ago, I asked for people interested in Category:In_Progress_datasets (Wikidata:Project_chat/Archive/2019/08#Category:In_Progress_datasets).

Apparently, there isn't much, if any, interest in working with these pages.

It seems that we mostly lose potentially interested contributors who try to create a subpage there, get a mostly pre-filled page, and then abandon it.

I suggest we archive the experiment and ask people to use the default channels. --- Jura 15:46, 6 November 2019 (UTC)

I agree that this space is not really working as it should at the moment. I think these pages could potentially be improved instead of deleted, by trimming down the content of the template that generates them. See my comment at Wikidata_talk:Dataset_Imports#Feedback_on_the_import_pages. But if no one wants to take care of that, we should not incentivize newcomers to use this space, since their efforts will likely go unnoticed. − Pintoch (talk) 18:54, 6 November 2019 (UTC)
Thanks for your feedback. I noticed your comment when cross-referencing this there. Seems we made similar observations. Earlier versions were actually much more readable (there was just one page), but even that didn't draw much attention. --- Jura 13:56, 7 November 2019 (UTC)

"Ramses" (Q1343144)

Hi, the above data item was extensively edited by an IP address, which removed a lot of stuff and added some other things. Other editors (like me) tried to fix it, as the current description (South African hip hop artist) does not match "Ramses Shaffy", the Dutch singer. Can someone reset the whole thing to whatever it was before the IP started editing (when I assume all was well)? I looked in the history, but am not BOLD enough to click the button. Thank you. Deadstar (talk) 09:24, 7 November 2019 (UTC)

✓ Done by @M2k~dewiki:. Thank you. Deadstar (talk) 11:19, 7 November 2019 (UTC)

article in response: follows or inspired?

Suppose that in a set of magazine articles I'm entering, a few months after one article appears, another by a different author is published which is a reply or response to the first. How do I indicate the relationship between them: follows (P155)/followed by (P156), or inspired by (P941)? --Levana Taylor (talk) 10:37, 7 November 2019 (UTC)

reply to (P2675) perhaps? Andrew Gray (talk) 11:27, 7 November 2019 (UTC)
Aha, that's it! I didn't know about that one. Thanks Levana Taylor (talk) 13:44, 7 November 2019 (UTC)

HQ or Administrative territory

National Gallery of Art (Q214867) as an art museum is both an organization and a building, so should it have a headquarters or be located in an administrative territory? They are mutually exclusive, and naming the architect requires that it be a building that is "located in [an] administrative territorial entity". --RAN (talk) 13:47, 6 November 2019 (UTC)

  • Probably needs to be split into two items, one for the building and one for the institution. - Jmabel (talk) 16:40, 6 November 2019 (UTC)
I'd prefer to challenge the 'mutually exclusive' constraint. If we consider a museum organisation (say, York City Museum) operating from a museum building (York City Museum Building), there is no logical reason why we should be able to say that the building is P131=City of York but not say that the organisation is P131=City of York. To that end, I'm agitating right now on P131 talk. Where we have so many organisations (especially schools, libraries, museums) where the org and the building are effectively represented in a single item, the constraint that orgs should have no P131 values causes harm for no obvious gain. --Tagishsimon (talk) 21:30, 6 November 2019 (UTC)
Conflating an organization and a building into one article is common practice on wikis, but certainly not a good idea for a structured database. Two items are needed, as these are two different entities. As an example, see National Museum (Q188112) and Main building of National Museum in Prague (Q43755714); note the different sets of related properties.--Jklamo (talk) 09:52, 7 November 2019 (UTC)
I agree that they can be combined into one entry, unless they have multiple buildings or have moved from one building to another. For instance a church can be a building and a congregation that have different inception dates. But if they are the same, keep one entry. --RAN (talk) 15:35, 7 November 2019 (UTC)
Tagishsimon explained it well. Some people have a very narrow world view and are trying to enforce that by adding constraints. I removed this one. Multichill (talk) 20:34, 7 November 2019 (UTC)

Lexeme mistakes

The following Lexemes are about German nouns, which are capitalized without exception:

I hope someone can fix it. Greetings Bigbossfarin (talk) 18:36, 4 November 2019 (UTC)

  • @Bigbossfarin: Is there anything preventing you from fixing them yourself? U+1F360 (talk) 18:49, 4 November 2019 (UTC)
    Yes, the first thing to do would be to find duplicates and merge them. I couldn't find a tool where you can type in Lexeme names and get an L-number. The second thing would be to change Lexeme names automatically with a tool; I guess this could work with QuickStatements. Greetings Bigbossfarin (talk) 18:59, 4 November 2019 (UTC)
Can't this be done simply using the search bar and selecting only Lexemes? That's usually how I look for them. - Sarilho1 (talk) 09:33, 5 November 2019 (UTC)
Not only the capitalization is wrong, but lexemes like mathmatiker (L72782) are simply wrong. Steak (talk) 14:26, 5 November 2019 (UTC)
  • Indeed, I can delete them and send them back to the curators for checking if they find more errors. Would that be OK? Wpbloyd (talk) 10:13, 6 November 2019 (UTC)
@Wpbloyd: Where do the items come from? If you mean "upload to Wikidata" with "send them back to the curators for checking" that feels a bit dubious to me. ChristianKl❫ 13:33, 6 November 2019 (UTC)
@ChristianKl: the terms come from an internally curated list of an architectural archive, so I meant to send the list back to the curators for checking that everything is fine Wpbloyd (talk) 13:49, 6 November 2019 (UTC)
@Wpbloyd: Then that sounds like a good approach. ChristianKl❫ 17:41, 7 November 2019 (UTC)
@Wpbloyd: For future uploads, would you check here on project chat or with the relevant WikiProject before uploading? We already had a problem with your other upload, and uploading data without doing any basic checks deteriorates overall data quality. --- Jura 13:22, 8 November 2019 (UTC)

What was the library used for visualizing the result of wikidata sparql result?

I am wondering if you can point me to the library used to visualize the results of Wikidata queries. I am actually impressed by how the timeline was implemented. I would like to create a similar GUI for my project.  – The preceding unsigned comment was added by 126.140.210.87 (talk • contribs) at 11:56, November 7, 2019 (UTC).

The query service uses Blazegraph. The actual code running the UI etc. is open source and available on github as wikimedia/wikidata-query-rdf/. ArthurPSmith (talk) 16:32, 7 November 2019 (UTC)
Hmm, probably wrong repository - the GUI I think for the query service is at wikimedia/wikidata-query-gui. ArthurPSmith (talk) 16:34, 7 November 2019 (UTC)
Specifically, the timeline is implemented in TimelineResultBrowser.js and uses the vis library. (We should probably migrate to visjs at some point.) --Lucas Werkmeister (WMDE) (talk) 13:24, 8 November 2019 (UTC)

sport hunting

Is sport-hunting or big-game-hunting actually a sport? Or is it just an activity? It isn't a competitive sport like tennis, but is it a sport by Wikidata's definition? --RAN (talk) 13:11, 7 November 2019 (UTC)

There's no (by Wikidata's definition) in most cases. Wikidata generally tries to match what can be shown with sources. ChristianKl❫ 17:38, 7 November 2019 (UTC)
In Wikidata, "sports" are defined with a subclass tree under sport (Q349), and that item itself is a subclass of physical activity (Q747883). Thus, different types of sport are sort of more specialized versions of physical activity. I personally do not like this approach for several reasons, but this is what has evolved over time and some properties are relying on this subclass hierarchy.
In that context, neither big-game hunting (Q4904849) nor hunting (Q36963) are a sport by Wikidata's definition, although someone has tried to make the latter item a sport with an incorrect instance of (P31) claim. As both activities are not predominantly recognized as a sport, I suggest not to define them as sport in Wikidata. If necessary, we could have a separate item "sport hunting" which then subclasses both hunting (Q36963) and sport (Q349). --MisterSynergy (talk) 18:45, 7 November 2019 (UTC)
I ran into a similar issue on Fly casting (Q56634867). If something like competitive hunting actually has an item, skipping sport and going straight to competition (Q841654) is probably the solution, now that I think of it. Circeus (talk) 18:05, 8 November 2019 (UTC)

Automated addition of WikiJournal metadata to Wikidata

Bot request at Wikidata:Bot_requests#Automated_addition_of_WikiJournal_metadata_to_Wikidata

Currently, a lot of info about each WikiJournal article is stored in the v:template:article_info (essentially in infoboxes). It'd be ideal to be able to easily synchronise this over to Wikidata (list of submitted articles; list of published articles). We used to import metadata for published articles from Crossref to Wikidata via sourcemd, but that's not working currently, and Crossref also lacks a lot of useful metadata. Would it be possible to synchronise this so that it's imported into Wikidata, then transcluded back over to the WikiJournal page? This should also help to automate the tracking table that currently has to be updated manually. It'd similarly be useful to add editors from this page to Wikidata (either to the journal item or to the item for the person as appropriate). T.Shafee(evo&evo) (talk) 09:31, 9 November 2019 (UTC) edited T.Shafee(evo&evo) (talk)

Completeness of Lists

How can we indicate that a set of relationships is complete? For instance, AK-74 (Q156229) has the conflict (P607) property. This list doesn't include every conflict that this entity is related to. The use case for this is to make exclusionary queries like "In which wars in the 21st century was the AK-74 (Q156229) not used?" possible. If a list can be marked either as "exhaustive" or "not a complete list", it would help deliver clear answers or inform the asker of any ambiguities that may exist.

--Weslima (talk) 17:14, 4 November 2019 (UTC)

  • I think adding a new unknown statement would indicate that there is more to the property that is not known... but I'm curious what others think. U+1F360 (talk) 17:22, 4 November 2019 (UTC)
  • Without Wikidata, would there be a way to find a reliable answer to your question? --- Jura 19:12, 4 November 2019 (UTC)

@Jura1: Good question. Maybe not for this specific property, but for some other types yes. That's yet another reason to indicate completeness (or lack thereof) of a set: To know if a "negative" question can be answered at all.  – The preceding unsigned comment was added by Weslima (talk • contribs).

Wikidata:WikiProject Movies/reports/TV episodes/complete attempts to do that for episode items. --- Jura 20:26, 4 November 2019 (UTC)
@Jura1: Wouldn't number of episodes (P1113) technically be duplicate data? How would that be accomplished if that property didn't exist? U+1F360 (talk) 20:32, 4 November 2019 (UTC)
The list has 337 entries, the property 45,237 uses. Even if not all uses are for TV series, what does it duplicate? --- Jura 20:40, 4 November 2019 (UTC)
@Jura1: The column "number of episodes (P1113)" and "number of episodes (actual)" (which is a count of has part (P527) on instances of television series season (Q3464665)) are being compared to determine if Wikidata's list of episodes is "complete" or not. In the example that @Weslima: provided, there is no such "total" property (like number of episodes (P1113) for television series (Q5398426)). So what I'm saying is, is if number of episodes (P1113) didn't exist, how would someone be able to determine if the list is "complete" or not? What mechanism would they use? U+1F360 (talk) 20:52, 4 November 2019 (UTC)
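
The comparison described in the comment above (a declared total such as number of episodes (P1113) against the count of has part (P527) values actually present) could look roughly like this. It is only a sketch on plain Python values, not the code behind the report page; the function name, the status labels and the sample ids are all made up for illustration.

```python
def episode_list_status(declared_count, listed_parts):
    """Compare a declared total with the parts actually listed.

    declared_count: the declared total, or None if no total is stated.
    listed_parts: ids of the listed parts (duplicates are ignored).
    """
    if declared_count is None:
        # Without a declared total there is nothing to compare against,
        # which is the situation described for conflict (P607).
        return "unknown"
    actual = len(set(listed_parts))
    if actual == declared_count:
        return "complete"
    if actual < declared_count:
        return "incomplete"
    return "inconsistent"  # more parts listed than the declared total


# Hypothetical sample values, not real item ids.
print(episode_list_status(3, ["E1", "E2", "E3"]))  # complete
print(episode_list_status(3, ["E1", "E2"]))        # incomplete
print(episode_list_status(None, ["E1"]))           # unknown
```

The "unknown" branch is the crux of the question in this thread: when no total exists for a property, a count-based check has nothing to compare with.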
I agree with Weslima that for the sample they gave, it isn't possible already outside Wikidata. --- Jura 20:59, 4 November 2019 (UTC)
@Jura1: what do you think of my proposal to add an unknown statement (if it is known to be incomplete)? U+1F360 (talk) 21:01, 4 November 2019 (UTC)
I think expected completeness (P2429) when applied to conflict (P607) would do some of that. --- Jura 06:52, 5 November 2019 (UTC)
@Jura1: I think we may be talking about two different things. I believe expected completeness (P2429) implies the completeness of the entire property (all usages of it) rather than some usages of it. For instance, what if in one instance, the list is complete, but in another instance, it is not? U+1F360 (talk) 14:20, 5 November 2019 (UTC)
@U+1F360: one could qualify P2429 statements for different P31/P279. Maybe for ABC it could eventually be complete, for others it can't really.--- Jura 19:52, 5 November 2019 (UTC)
@U+1F360: actually, even for ABC is not possible as "conflict" is too vague. --- Jura 18:05, 6 November 2019 (UTC)
@Jura1: right... that's why I think the completeness would need to be assessed on a per statement group basis. See phab:T237472. U+1F360 (talk) 18:08, 6 November 2019 (UTC)
I think that rather illustrates that it's sufficient to do that on a per property basis for this property. --- Jura 18:11, 6 November 2019 (UTC)
@Jura1: I don't understand. Let's say you have three uses of conflict (P607), one is complete, another is incomplete and can be completed, and another is incomplete and cannot be completed. Should we make three different properties and move the values between them when the use moves from one to another? U+1F360 (talk) 18:15, 6 November 2019 (UTC)
Can you find an item where it's complete? --- Jura 18:19, 6 November 2019 (UTC)
Well that's an important point, should it always be assumed to be incomplete (even if that is known to be untrue)? U+1F360 (talk) 18:21, 6 November 2019 (UTC)
Well, I'm curious about the sample(s). --- Jura 18:24, 6 November 2019 (UTC)
@Jura1: Here's an example I know about... I know (100% certain) that the has part (P527) on Universal Studios Florida (Q1880820) is complete (as far as, in this moment, maybe not historically), but I'm also 100% certain that the has part (P527) on Epcot (Q1052042) is incomplete (but can be completed). U+1F360 (talk) 18:36, 6 November 2019 (UTC)
@U+1F360: well, I was actually looking for a sample with conflict (P607) discussed earlier.
Still, has part (P527) is somewhat comparable. How do you know it's complete? Ideally you would have a reference stating that there are 8 themed areas. Oddly, the reference omits one mentioned in Wikipedia that also has an item (Q16932605) but isn't linked, and somehow you created a duplicate for Q2623650. Still, let's agree we actually have all themed areas and can reference that. The nice thing about has part (P527) is that it comes with a companion property, has parts of the class (P2670), which could hold such a statement. A query could then compare them, just like number of episodes. Still, if you compare with https://petscan.wmflabs.org/?psid=13384686 , you might want to create an item for the entrance that could also be at P527. So a "part of" listing is rarely ever complete either. --- Jura 22:03, 6 November 2019 (UTC)
@Jura1: The list of themed areas is available in the source that describes all of them. While the source does not provide a count of how many there are, there is a defined set. Even if "entrance" and "exit" were part of it, it's not a themed area. My point is that this could happen with any property really, and there isn't a good way to express it other than the two work-arounds I provided in phab:T237472. If that's what we want to do, then that's fine, but it kind of seems like a pain when, ideally, the software should support making statements about the group of values. U+1F360 (talk) 22:12, 6 November 2019 (UTC)
Let me put it another way: parts are not limited to "themed areas" (you would incorrectly mark it as complete). has parts of the class (P2670) seems to be a good way to express that we have all 9 or so themed areas. --- Jura 22:16, 6 November 2019 (UTC)
@Jura1: I understand completely. I'm saying that if any statements were allowed on the group it could be qualified that way (i.e. "complete" with a qualifier of "themed areas", etc.). U+1F360 (talk) 22:28, 6 November 2019 (UTC)
@Jura1: I'm not sure what's the point of arguing away individual cases. When our properties have a clear meaning, whenever it is sensible to create statements about entity-property-pairs, it is also sensible to express the non-existence of statements other than those mentioned (that's namely what completeness says). I don't think doubling the number of properties with murky count equivalents (like child (P40)/number of children (P1971), season (P4908)/number of seasons (P2437)), which lack a formal relationship, is a good way to go. Ls1g (talk) 08:53, 7 November 2019 (UTC)
The problem is that if we generate a claim about a group that can never receive a proper reference, because it can't really be referenced, we just fuel the mistaken assumption that the results are comprehensive now and forever. For P1971, you might want to read why it was created. --- Jura 12:14, 7 November 2019 (UTC)
Why should completeness be assumed to be static? For many regular statements, temporal qualifications are essential, I don't see why completeness should be interpreted differently (e.g., without temporal qualification, Bush would be still US president - completeness would only be static if the subject itself is static, e.g., band members of The Beatles).
Regarding subject-specificity and references below a few examples:
In my view the data model is really somewhat murky. We can express the odd cases of count "0" and count ">=1" or "+1" (?) as novalue/unknown on all properties, but beyond that resort to a completely different encoding via dedicated duplicated properties, one different for each base property, and without apparent relationship to their base version. Having instead generic count/completeness/incompleteness meta-qualifiers on regular properties would be much cleaner. Ls1g (talk) 14:42, 7 November 2019 (UTC)
────────────────────────────────────────────────────────────────────────────────────────────────────
For the above samples, I think completeness is assessed differently from one property to another. Even if one would add a group statement at a similar place, the criteria (and other statements to consider) can be different. eg. P50 is generally complete, at least if one takes in account "author name string". Is Q1299#P527 necessarily limited to band members? Will it still be complete after a year? What if a user replaces an item/adds an item/deletes an item?
Anyway, I think the question about completeness is frequently also if all instances of a given class are present. --- Jura 08:06, 8 November 2019 (UTC)
"completeness is assessed differently from one property to another" -> Makes sense as each property has a different meaning?
"P50 is generally complete, at least if one takes in account "author name string"" -> That's the point, only in few cases author (P50) is complete, in others recourse to author name string (P2093), with information loss (no disambiguated entities), is needed. Besides, we have over 9000 cases where we already now explicitly state incompleteness via novalues: [4]
"Is Q1299#P527 necessarily limited to band members?" -> Good question, maybe Wikidata:WikiProject_Music has thoughts what the semantics of the property are?
"Will it still be complete after a year?" -> Same as for regular statements, is Bush still US president? Temporal qualification is needed.
"What if a user replaces an item/adds an item/deletes an item?" -> Same problem as changing statement values, should qualifiers and references stay? That's why I'm arguing for a consistent solution that collocates statements and metadata, instead of murky novalues, unknown values, and semantically unrelated count-versions of properties.
"Anyway, I think the question about completeness is frequently also if all instances of a given class are present." -> Good point, unfortunately in Wikidata's entity-centric data model that seems even tougher, as it's a statement about an inverse property (P31) for the respective class. Ls1g (talk) 10:21, 8 November 2019 (UTC)
about: "Makes sense as each property has a different meaning?": if that needs to be done anyways, one could just as well define the criteria on the property itself.
P50/P2093/author unknown: personally, I think when any of these are present, completeness is (generally) there. I don't see an advantage of marking this in addition with some completeness marker. The author was found to be unknown: we note this on the item: that statement is complete. --- Jura 13:18, 8 November 2019 (UTC)
I think you dramatically underestimate the diversity of Wikidata, see the other examples above. An instance agnostic string like "participant (P710) is complete for all British rock concerts with >50k visitors since 1970 (except punk-rock), all diplomatic summits that were UN-licensed, except on cultural politics or in South America, and all naval expeditions, including submarine attacks, if they have at least three participants, except for solo-world-circumnavigations" will never even remotely match the reality of 100k subjects, nor can anyone keep it up-to-date with an ever-evolving KB, nor does it stand up to Wikidata's vision of machine-readable knowledge. Ls1g (talk) 10:33, 9 November 2019 (UTC)
I think you underestimate the ways Wikidata can be incomplete. --- Jura 10:53, 9 November 2019 (UTC)
Can't argue against fatalism, I'm out of the discussion. :-( Ls1g (talk) 13:40, 9 November 2019 (UTC)
Does COOL-WD cater to your purpose? Afaik, so far there is no way to express this kind of knowledge inside the Wikidata data model: The kind of statement you refer to is a statement about a pair of entity and property, so in Wikidata's entity-centric rendering, we would need to show it on properties itself, not as an additional value. Adding "unknown value" only allows to represent one side of the coin - incompleteness, but not completeness (and there is a third case, that it is unknown which of the two applies). Also as U+1F360 observed expected completeness (P2429) doesn't solve the problem, as it refers only to a property, but not to specific entities (conflicts may be complete for one subject, but not for another). Ls1g (talk) 15:47, 5 November 2019 (UTC)

Comment On all these completeness issues, do we have wiki pages as entry points? I remember having written a couple of queries to assess the completeness of Wikidata with respect to the instances of certain classes, if we know the number of instances that exist, and I know there are some tools like cool-wd. But I don’t think we have a WD:Completeness page, WikiProject, or help page that documents everything that is done around completeness notions and tools on Wikidata. Should we? author  TomT0m / talk page 18:51, 5 November 2019 (UTC)

We should. For example, on German Wikipedia we have de:Benutzer:Ephraim33/Projekt Vollständigkeit ("Completeness Project"), which provides great motivation to fill holes and look for (almost) complete topics. Steak (talk) 19:59, 5 November 2019 (UTC)
I had organized a session at WikidataCon 2017, could try to update and condense the content in the next days. I think phab:T237472 below really goes to the core, a non-hacky solution requires an extension of Wikibase. Ls1g (talk) 15:45, 6 November 2019 (UTC)

Comment I created phab:T237472 that I think covers the problem, a proposed solution, and the work-arounds discussed. Hopefully a more comprehensive solution can be created. U+1F360 (talk) 22:14, 5 November 2019 (UTC)

  • I think it's better to look into samples where completeness can actually be determined. --- Jura 18:05, 6 November 2019 (UTC)
  • Wikidata recently introduced ShEx and it might be desirable to have a tool that tells you for a given ShEx document how complete the corresponding data happens to be. ChristianKl❫ 09:05, 7 November 2019 (UTC)
Although ShEx can count, its purpose is to prescribe schemata for classes, not for instances, so it does not help with the problem described above. Ls1g (talk) 15:43, 7 November 2019 (UTC)
Thinking more about the issue, it seems to me that the core issue for the described problem is that it's currently impossible to enter negative statements. If we had a way to enter negative statements, it would be possible to use a tool like the completeness tools above. You could also use such a tool to analyse whether what's described in a given ShEx is complete. ChristianKl❫ 10:53, 8 November 2019 (UTC)

Do we limit how many images to hold in the image field?

Do we limit how many images to hold in the image field? How many is too many? Was it designed to hold a single authoritative image, or to hold many? Are we worried about visual clutter? --RAN (talk) 16:30, 6 November 2019 (UTC)

Would that count as "in use" by a Wikimedia Foundation project, or is it still potentially out of scope and eligible for deletion at Wikimedia Commons? RAN (talk) 17:44, 6 November 2019 (UTC)
If available it is nice to use nighttime view (P3451) or winter view (P5252) and at image (P18) the qualifier applies to part (P518) makes sense if multiple images are used. --GPSLeo (talk) 17:47, 6 November 2019 (UTC)
@Richard Arthur Norton (1958- ): non-usage doesn't seem to be a valid reason for deletion... am I missing something? U+1F360 (talk) 17:49, 6 November 2019 (UTC)
I see "not in scope, not in use" in deletion arguments. Not in scope according to Commons:Scope, translates into "not educational" which is purely subjective. The only objective criterion in Commons:Scope to avoid deletion is "in use". I am just thinking about long term archiving ever since Flickr began deleting images stored there when they dropped their 1TB of free storage. I was worried when Yahoo bought them, they have a history of dropping projects. Flickr was recently sold and if you did not pay the yearly subscription, you had to press a button allowing them to delete all but your most recent 2,000 images, if you wanted to view your account. I just noticed today that I can login without paying, and my photos are still there, but now I wonder about long term archiving. --RAN (talk) 18:23, 6 November 2019 (UTC)
@Richard Arthur Norton (1958- ): I would argue that merely being in use on Wikidata doesn't qualify it as "in use" or "educational." I feel like this is a problem that Commons should address. Though, I would also argue, as far as Wikidata is concerned, is there really a need to have more than one quality photo of every item? Most of our items don't have an image (P18). So for this project, I would say it would be better to have a breadth of images (lots of items with a single image) than a depth (lots of items with multiple images). I'm curious what the members of Commons would think, but I imagine it would be a similar assessment. U+1F360 (talk) 19:01, 6 November 2019 (UTC)
@Richard Arthur Norton (1958- ): I'm not sure what Flickr has to do with the case (and as far as I know, no publicly-visible free-licensed images were deleted there). Most deletions on Commons are because of copyright issues, in which case being "in use" is no defense. Commons' notion of "educational" is pretty broad: overwhelmingly, when images are deleted for being out of scope, it's things like the umpteenth poorly composed photo of male genitalia (we have plenty, thank you) or someone's large collection of personal photos of themselves and their friends. As a rule, even things like wildly inaccurate maps or fictional flags tend to be marked as fictional or untrustworthy, rather than deleted. Do you have any examples of exceptions to that (things that were deleted on Commons as out of scope that would have been of relevance in Wikidata terms)? - Jmabel (talk) 21:05, 6 November 2019 (UTC)
I took pictures of the four tombstones of all the family members at a grave for a person with a Wikipedia article. All were nominated for deletion and deleted. The one for the person with the article was deleted but I was allowed to restore it. I recently had them all restored, only because I noticed them missing when creating entries for them at Wikidata. I am just worried that anyone can rationalize any picture not being used as "out of scope" since it is so subjective. "In use" is objective. --RAN (talk) 21:14, 6 November 2019 (UTC)
Images of tombstones should be stored in image of grave (P1442) and not in the normal image property. ChristianKl❫ 22:00, 6 November 2019 (UTC)
  • In general it's best just to have a single image (P18), which then shows up in queries/infoboxes/etc. Having more makes things a bit too random (which one gets picked?), and can cause problems with reuse, for example if some images have captions and others don't. Thanks. Mike Peel (talk) 18:49, 6 November 2019 (UTC)
The infobox displays the first image only, if you have multiple, and you do not want the first one displayed, you set the one you want displayed with "preferred rank" and that one is displayed. --RAN (talk) 19:50, 6 November 2019 (UTC)
  • In general I would stick to one, though sometimes it might be useful to have two or three with qualifiers (eg if we have a person with a picture from 2019 and a picture from 1979, which look very different). In this case, though, one should definitely be marked as preferred. Having four or five is probably a bad idea - create a Commons category and link it instead. Andrew Gray (talk) 19:37, 6 November 2019 (UTC)
    Or for a building - a scheme and a photo.--Ymblanter (talk) 19:39, 6 November 2019 (UTC)
we have schematic (P5555). - PKM (talk) 20:57, 7 November 2019 (UTC)


  • Do we have a good approach for images about 3D objects, notably sculptures/statues? Frequently single perspective isn't sufficient, but one needn't look at the entire Commons category either. --- Jura 12:00, 7 November 2019 (UTC)
Google Knowledge Graph displays multiple images, for instance type in "Billy Joel" or "Taylor Swift" and you get 7 images in the infobox. RAN (talk) 13:16, 7 November 2019 (UTC)
  • Restricting to one image is a logical choice. Once you decide to allow multiple images, there's really no limit to the number that you could add to a single item. Ghouston (talk) 11:09, 9 November 2019 (UTC)
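The rank behaviour mentioned in this thread (a preferred-rank image is shown in place of the others) can be illustrated with a minimal sketch of "best rank" selection. This is not the code of any particular infobox; the function name, the simplified statement dictionaries and the file names are assumptions made for illustration, and only the "rank" values follow Wikidata's normal/preferred/deprecated scheme.

```python
def best_ranked(statements):
    """Return the statements a typical consumer would show.

    Preferred-rank statements win if any exist; otherwise the
    normal-rank statements are returned. Deprecated ones never are.
    """
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.get("rank", "normal") == "normal"]


# Hypothetical image (P18) statements, reduced to the fields used here.
images = [
    {"file": "Portrait 1979.jpg", "rank": "normal"},
    {"file": "Portrait 2019.jpg", "rank": "preferred"},
    {"file": "Blurry snapshot.jpg", "rank": "deprecated"},
]

shown = best_ranked(images)
print(shown[0]["file"])
```

With several normal-rank images and no preferred one, the function returns all of them, which is the "which one gets picked?" situation described above.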

Dumps as CSV-files

Hello,

I am not a programmer and don't know much about specific data formats. I don't have a program that can open the file format of the Wikidata dumps. Is it possible to make the daily dump accessible as a CSV file? A few days ago I added descriptions, and some of them contain a QID. It wasn't easy for me to find and correct them, and I haven't corrected all of them yet. The query times out, and via the API I haven't been able to get more than, I think, 500 results. People who don't know much about programming often use spreadsheets for processing data. Do you think there are ways to make Wikidata more spreadsheet-friendly? If you want the motto of Wikimedia, "Imagine a world in which every single human being can freely share in the sum of all knowledge.", to become real, then I think spreadsheet-friendliness here in Wikidata is important. At the moment not everyone can use larger amounts of the data, because not everyone is a programmer. Once Wikidata is more spreadsheet-friendly, more people will be able to use the data. I suggest making it possible to get the data on a specific topic as a CSV file, and likewise the recent changes and the edits with a specific tag. When I get the data in a readable format, I can help create specific lists from it. If someone knows about macros in LibreOffice, it would be great if they could help me create some macros to make list creation easier. -- Hogü-456 (talk) 21:45, 6 November 2019 (UTC)

@Hogü-456: If you want to find all descriptions in a specific format, you may query the wb_terms table at either quarry: (not suitable for huge number of results) or Toolforge.--GZWDer (talk) 22:05, 6 November 2019 (UTC)
@GZWDer: please do not use the wb_terms table, we are getting rid of it. --Lucas Werkmeister (WMDE) (talk) 11:22, 7 November 2019 (UTC)
I regularly use spreadsheets for data cleaning/preparation. My usual workflow is:
  1. write sparql
  2. export results to tsv
  3. open tsv in notepad
  4. copy/paste results in spreadsheet
  5. process and prepare data for quickstatement v1
  6. copy/paste results to qs
  7. run batch
The downside of that approach is that one has to know SPARQL, but I don't believe there is a technically viable solution that would allow you to edit 60M rows in Excel, so you have to do some filtering. Regarding your specific issue, here is a draft query that will allow you to fetch the first NNN problematic descriptions. You can fix them, wait for the data to be replicated, and re-run the query. Obviously it has some false positives (like "Gemälde von Quentin Massys"), but you can easily filter them out in a spreadsheet. Ghuron (talk) 10:02, 8 November 2019 (UTC)
60M lines sounds doable in a programmer's editor like vim, and I sometimes edit TSV in such an editor. A 4K monitor helps. --SCIdude (talk) 15:04, 8 November 2019 (UTC)
If you are capable of using vim, you are most likely able to learn SPARQL. The OP's point was that we have people who can work in Excel but have no clue about programming. Right now their contribution is limited to the UI. Ghuron (talk) 05:36, 9 November 2019 (UTC)
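For readers who do script, the filtering step in the spreadsheet workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names "item" and "description", the sample rows and the item IDs are made up, and the regular expression is one possible way to spot a QID inside a description, not the draft query mentioned in the thread.

```python
import csv
import io
import re

# A Q-id such as "Q42" appearing as a whole word inside a description.
QID_PATTERN = re.compile(r"\bQ\d+\b")


def rows_with_qid(tsv_text):
    """Yield (item, description) pairs whose description contains a Q-id."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if QID_PATTERN.search(row["description"]):
            yield row["item"], row["description"]


def to_csv(pairs):
    """Serialise the pairs as CSV text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["item", "description"])
    writer.writerows(pairs)
    return out.getvalue()


# Hypothetical TSV export; the column names and IDs are assumptions.
sample_tsv = (
    "item\tdescription\n"
    "Q100000001\tGemälde von Q5582\n"
    "Q100000002\tGemälde von Quentin Massys\n"
    "Q100000003\tpainting by Q5582 (1889)\n"
)

flagged = list(rows_with_qid(sample_tsv))
print(to_csv(flagged))
```

Because the pattern requires digits directly after the "Q", a description such as "Gemälde von Quentin Massys" is not flagged, so that kind of false positive does not need to be removed by hand.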

how to record complex "position held" situation ... ?

I found the following paragraph in a bio of Zoë Barbara Fairfield (Q27578052): "Fairfield's lengthy SCM career included being chair and then secretary of the London women's committee (1898), organizer and general secretary of the Art Students' Christian Union (1902-1909), a regular member and officer of the British executive (1909 to at least 1929), assistant general secretary for women for the British SCM (1909-1929), British representative at the WSCF Conferrence at St. Beatenburg (1920), and general secretary of the auxiliary movement (1929-1933?)." Is it okay to just add the statement Zoë Barbara Fairfield (Q27578052) position held (P39) officer (Q61022630) / of (P642) Student Christian Movement of the United Kingdom (Q7627652), with the whole quote in the reference? The quote isn't available to data linking. I have been adding sketchy items (usually just a name) for "the London women's committee of the Students' Christian Movement" etc. etc.; if I did a proper job of it, I could spend days researching. This source doesn't go into details of what exact positions were involved in being "a regular member and officer of the British executive" and again, I don't have time to research it. My main problem is that I don't know how to record the exact job title for things like "assistant general secretary for women for Student Christian Movement of the United Kingdom (Q7627652)."--Levana Taylor (talk) 11:59, 9 November 2019 (UTC)

Add option to add comments to changes to items, similar to other Wikimedia projects

Dear all,

At present it seems (if I'm not mistaken) that one can make a change to an item but NOT explain the change in a comment that then shows in the History listing, as on all other Wikimedia projects. Of course there is a discussion page for each item, but it would be more practical to be able to review changes by reading the comments (if any) that accompany them in the History of the item. (At this moment I am refraining from making an (I think) useful change to an item, because there is no way to explain it to other users who might take issue. Perhaps the initial philosophy of Wikidata entailed that comment and justification of changes would be superfluous here, but I would like to discuss this.) Enjoy the light if you can, Hansmuller (talk) 08:14, 9 November 2019 (UTC)

In some cases I wanted that option, too. For instance, when making changes that would have seemed weird to me at first, I'd like to give a quick pointer for other editors like "this statement is now on item X" or "see discussion Y". Toni 001 (talk) 03:19, 10 November 2019 (UTC)

How to change the primary description?

Q16848534 has changed its name from Infusionsoft to Keap. How to update this? Magog the Ogre (talk) 15:17, 10 November 2019 (UTC)

I wasn't exactly sure what you're asking, but have updated the item to reflect the name change - hopefully my edits are self explanatory as an example --SilentSpike (talk) 16:00, 10 November 2019 (UTC)

Deleted items

Is the deletion registry available? I can’t find an item, and I wish to know if it was deleted... --151.49.123.55 12:05, 10 November 2019 (UTC)

As far as I know Wikidata has no public deletion registry. If you want to be able to keep track of items you created it would make sense to register an account so that you can look into your contribution history. ChristianKl❫ 19:11, 10 November 2019 (UTC)
Shouldn't we have such a registry? Nomen ad hoc (talk) 19:23, 10 November 2019 (UTC).
Nomen ad hoc: We do. But if you're not registered and don't know the item's ID you wish to find, it'll be harder to track it. Esteban16 (talk) 19:29, 10 November 2019 (UTC)
Ah, indeed! Thanks. Nomen ad hoc (talk) 19:31, 10 November 2019 (UTC).

American football

In American football, are route (Q7371361) and Q48771339 the same thing? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:44, 16 November 2019 (UTC)

They are indeed the same. But (ick) very few people in the U.S. would call it "route (gridiron football)." -- Fuzheado (talk) 16:37, 16 November 2019 (UTC)
Merged. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:47, 16 November 2019 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:47, 16 November 2019 (UTC)

Cycles in P279 dependencies

Hey, we should not have cycles in our dependency graph of P279 (because it's a hierarchy). Running a graph analysis tool suggests these three dependencies should be removed:

I'm almost certain the second one should be removed; for the other two, I'm not enough of an ontology expert to say what should happen. Please take a look. Amir (talk) 18:52, 9 November 2019 (UTC)

The first pair should probably be subclasses of ataxia (Q213373) and coincident with each other but the sources need checking. I replaced the statements in the second pair with a has quality (P1552) on the first item. Of the third pair, information resource (Q37866906) needs a better description. Peter James (talk) 00:46, 10 November 2019 (UTC)
information resource (Q37866906) specifically says it’s an electronic resource, so I don’t think it’s a good choice here. “Document” is a subclass of four items (not necessarily a problem) and is matched to three concepts in the Getty AAT hierarchy. This suggests to me we haven’t properly settled on the scope of what a “document” is. - PKM (talk) 04:12, 10 November 2019 (UTC)
It was originally a subclass of Q36808483, which was merged to collection (Q28813620); its definition was changed by JakobVoss. Information doesn't only exist in electronic documents; what is the source for this definition? Peter James (talk) 08:43, 10 November 2019 (UTC)
It seems related to information source (Q3523102). Ghouston (talk) 09:34, 10 November 2019 (UTC)
It looks like the first pair is using the dependency defined in a reference (which have conflicting views). Items for taxon names solve this issue by using "parent taxon" statements.--- Jura 10:27, 10 November 2019 (UTC)
I don't think it's complicated, the terms don't stand for specific diseases, but classes. cerebellar ataxia (Q154709) is any ataxia (Q213373) with cause in the cerebellum (Q130983), and hereditary ataxia (Q3731293) is any ataxia (Q213373) that is genetic disease (Q200779). There are cerebellar ataxia (Q154709) that are not genetic disease (Q200779), e.g. from alcoholic cerebellar degeneration (Q18816398), and there are hereditary ataxia (Q3731293) that are not showing in the cerebellum (Q130983), e.g. Sensory ataxia (Q4416286) as described in Hereditary sensory ataxic neuropathy associated with proximal muscle weakness in the lower extremities. (Q43190822). So the items in the first pair are independent under ataxia. --SCIdude (talk) 14:34, 10 November 2019 (UTC)
@SCIdude: This is not what is defined by external source. See here. Snipre (talk) 13:51, 11 November 2019 (UTC)
They have it from enwp, which references a BBC article that states "There are genetic forms of the disease. In addition, some cerebellar ataxias can be caused by brain injury, viral infections or tumours." So enwp got it wrong, didn't it? --SCIdude (talk) 14:18, 11 November 2019 (UTC)
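The kind of check that opened this section can be illustrated with a small depth-first search for a cycle in a subclass graph. This is a sketch, not the graph analysis tool that was actually run; the function name is hypothetical, and the toy edges are only loosely modelled on the ataxia items discussed above, since the statements flagged by the tool are not reproduced here.

```python
def find_cycle(edges):
    """Return one cycle in a directed graph as a list of nodes, or None.

    `edges` maps each class to the classes it is declared a subclass of.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for parent in edges.get(node, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # Back edge: the cycle runs from `parent` along the path.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(edges):
        if colour.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


# Toy data (assumed): two classes declared subclasses of each other.
with_cycle = {
    "cerebellar ataxia": ["hereditary ataxia"],
    "hereditary ataxia": ["cerebellar ataxia", "ataxia"],
    "ataxia": [],
}
# The same classes modelled as independent subclasses of "ataxia".
fixed = {
    "cerebellar ataxia": ["ataxia"],
    "hereditary ataxia": ["ataxia"],
    "ataxia": [],
}

print(find_cycle(with_cycle))
print(find_cycle(fixed))
```

The recursive form keeps the sketch short; on a graph as deep as the full P279 hierarchy an iterative traversal would be the safer choice because of Python's recursion limit.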

Add language to P1559 (name in native language)?

I discovered an error in Alulim (Q447370), which claimed that the native language of this personage's name was Sanskrit, instead of Sumerian (Q36790). However, when I attempted to correct it, Sumerian was not one of the permitted values. Despite my attempts, I can't figure out how to add this value to the necessary field. Can someone either fix this, or explain how to fix it? (It might be preferable to show me how to fix it, since I'm finding several ancient languages not allowed here.) -- Llywrch (talk) 18:08, 10 November 2019 (UTC)

See Help:Monolingual text languages ChristianKl❫ 08:09, 11 November 2019 (UTC)

Wikidata weekly summary #390

Translation label.

I was technically unable to add the Arabic field, so I added the Arabic value to the Chinese field. I need help fixing it: Q74054559. Yug (talk) 15:57, 11 November 2019 (UTC)

Fixed, though why you had an issue is unclear. If it happens again, please raise a bug ticket on Phabricator. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:28, 11 November 2019 (UTC)
Only the existing languages and my spoken language were displayed. I did not find a way to create an "Arabic" field. Yug (talk) 18:15, 11 November 2019 (UTC)
@Yug: I ran into this problem a while ago. The solution is to go into your preferences under "Gadgets" and enable labelLister. Then there will be a tab at the top of the page, right next to "View History", labeled "Labels list". You can use the "edit" button at the bottom to add a label with an arbitrary language code (for example, "ar" for Arabic). Vahurzpu (talk) 18:32, 11 November 2019 (UTC)

Vegetarians & vegans

How do we indicate that a person, well known for such, is either a vegetarian or a vegan? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:25, 11 November 2019 (UTC)

@Pigsonthewing: lifestyle (P1576) vegetarianism (Q83364) (query) or lifestyle (P1576) veganism (Q181138) (query) is the usual approach, I believe. --TweetsFactsAndQueries (talk) 17:07, 11 November 2019 (UTC)

How can we state that a person/entity was named in a creative work

I think this is interesting cultural data to capture, for instance, it provides data on things like which fashion brand is name-dropped the most in rap.

I've come across a few possible properties, but find it hard to tell if there's a relational distinction between something being present vs just being name-checked. For instance, a character is actually present in a book, but if they say at some point that their favourite artist is Pablo Picasso (Q5593) then to me it seems like he is not actually present in the work, but is referenced. So there's a relation between both Picasso and the character to the work, but they're two different relations.

Here are properties/proposals I've looked at:

--SilentSpike (talk) 21:34, 4 November 2019 (UTC)

Name-dropping could be interesting for certain works (it seems to be a prevalent device in Hip-Hop). I think I would support an own property that is restricted to name-dropping of institutions, persons, brands, locations and similar entities.
As to the Picasso-case (I would not call it name-dropping in the strict sense) I would probably model it as in-narrative data: if there is an item for the character I would link from the character to Picasso via interested in (P2650). - Valentina.Anitnelav (talk) 10:58, 5 November 2019 (UTC)
  • Maybe we could try to experiment with a new property "mentions". --- Jura 11:03, 5 November 2019 (UTC)
Support I like that idea a lot, Jura. It would allow us to clean up a lot of cases where "mentions" are confusingly put into "depicted by", "present in work", "named after", "main subject", etc. Moebeus (talk) 02:15, 6 November 2019 (UTC)
@Valentina.Anitnelav, Jura1: I considered proposing one, but struggle with determining to what domain it should apply. The musical work use case is simple, but it seems like there is desire for properties to be as generic as possible and I can see use cases for written works and audiovisual works too, but struggle to define how the property would apply alongside others such as P1441 especially if it is broadened to include non-fictional entities. --SilentSpike (talk) 13:30, 5 November 2019 (UTC)
present in work (P1441) can be already used on non-fictional entities, as long as they are part of the story (see for example RMS Titanic (Q25173) which was approved on the property's talk page).
I think the domain could include all works. It should be made clear that 1) the entity should be a named entity (so not "grass", "sky", etc.) and 2) that a more specific property should be used if possible. If an entity appears in the story of a work (like RMS Titanic (Q25173) in Titanic (Q44578)) present in work (P1441) should be used, "mentions" only if this is not the case (e.g. if a character mentions the Titanic in a dialogue - or if a rapper mentions the Titanic in a song). Also if a place is part of the setting of the work, narrative location (P840) should be used, "mentions" only, if it is - well - just mentioned (e.g. if a character speaks about this place or if a building is mentioned by the narrator as part of the scenery). If a creative work mentions another creative work in my opinion cites work (P2860) should be used. We should provide a complete list of alternative properties via see also (P1659) that should be taken into consideration before using the less expressive "mentions". - Valentina.Anitnelav (talk) 14:50, 5 November 2019 (UTC)
Personally, I think the main advantage of such a property would be that it would avoid that other more specific ones get cluttered with unrelated values. Maybe we need to create a few other more specific ones before creating this one. Any suggestions? --- Jura 08:39, 6 November 2019 (UTC)
No, I don't think that there is such a need. More specific properties could still be created after the creation of this property.- Valentina.Anitnelav (talk) 10:31, 6 November 2019 (UTC)
Of course they could, but this would avoid moving things around. It would be good to try to draft a comprehensive list of properties to use instead before we create this. --- Jura 10:35, 6 November 2019 (UTC)
Ah yes, I see. I think a "plot features event" property for events featured in a narrative would be useful. (E.g. The Tin Drum (Q899334) <plot features event> Defense of the Polish Post Office in Danzig (Q564388)). - Valentina.Anitnelav (talk) 11:31, 6 November 2019 (UTC)
I also considered whether this could be seen as samples work (P5707), but think there should be a distinction between an actual audio clip from the film being included vs lyrical content --SilentSpike (talk) 21:52, 6 November 2019 (UTC)
This is getting a bit offtopic, but in the German version of Grimm's Snow White the line is "Spieglein, Spieglein an der Wand"[5], which literally translates to "mirror, mirror on the wall". As a German-speaker I would thus suppose that it is a quote from Grimm's Snow White in translation, but it would be better to find a source that actually states that (why would you suppose that it alludes to the Disney version?). - Valentina.Anitnelav (talk) 10:07, 7 November 2019 (UTC)
Simply because culturally most people (these days) would first associate snow white with the Disney movie - including the original quote. However, I suppose it would still be in reference to the original story by proxy. In any case, you're right that it's a bit off topic and shouldn't prevent property creation. --SilentSpike (talk) 12:15, 7 November 2019 (UTC)
  • I think the label is fine, I would also support the immediate creation of above "mentions" property too (following your considerations re "named entity", etc.) if someone else would like to propose it - as I think I'm not the most well equipped to do so. --SilentSpike (talk) 12:15, 7 November 2019 (UTC)
  • I would take care of it during the weekend (this is not the easiest property to propose, given its history). But if you should find time to propose it: just go ahead :). You can take other proposals as a model and it can still be adapted to concerns later. - Valentina.Anitnelav (talk) 09:25, 8 November 2019 (UTC)
  • I think the label is too general. It's not immediately clear from the proposed label how this is different from "cites" or the reference part of statements in general. --- Jura 12:18, 7 November 2019 (UTC)
  • I'm not sure it's possible to distinguish this via the label alone because it's a more general superproperty of "cites work". I think it just needs to be made clear that specific properties are preferable where appropriate. I suppose "references work" would make the use a bit more clear. --SilentSpike (talk) 12:28, 7 November 2019 (UTC)

Proposal

@Valentina.Anitnelav, Jura1, Moebeus: Since there's a lot of discussion above I've started a new sub-section here to direct your attention to my proposal along these lines: Wikidata:Property_proposal/mentions_named_entity. Please feel free to edit if you think there are obvious improvements/clarifications to be made. Also please add more examples if there are other possible types of work that this property could be used on (a film perhaps). --SilentSpike (talk) 16:50, 8 November 2019 (UTC)

  • What type of work would this apply to? How many such statements should we have? Most works have a person or place index should these all be included? --- Jura 10:15, 10 November 2019 (UTC)
    • Some good questions - for which I can't say I have all the answers, but am glad to have the discussion and refine the proposal. I'll respond on that page for the sake of keeping proposal discussion contained and later archived. --SilentSpike (talk) 10:33, 10 November 2019 (UTC)
      • With "It would be good to try to draft a comprehensive list of properties to use instead before we create this." I had in mind that we do this before such a proposal. --- Jura 07:36, 12 November 2019 (UTC)

Preferred single values

From time to time I come across statements where just one value is given, but which has nonetheless preferred rank. I don't think that this makes sense. Would it be useful if a bot would set all those single values to normal rank? Steak (talk) 18:06, 11 November 2019 (UTC)

Probably. They are a trap for the unwary who adds a second value with normal rank and doesn't notice the preferredness of the existing value. --Tagishsimon (talk) 01:39, 12 November 2019 (UTC)
  • It might be a sign that a normally ranked statement was erroneously deleted.--- Jura 07:20, 12 November 2019 (UTC)
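What such a bot would first need to detect can be sketched as follows. This is a minimal illustration, not an existing bot: the function name is hypothetical and the entity is a made-up, heavily reduced version of the entity JSON layout (a "claims" mapping from property ID to a list of statements, each with a "rank").

```python
def lone_preferred(entity):
    """Return property IDs whose only statement has preferred rank."""
    flagged = []
    for prop, statements in entity.get("claims", {}).items():
        if len(statements) == 1 and statements[0].get("rank") == "preferred":
            flagged.append(prop)
    return flagged


# Hypothetical entity; statement details other than the rank are omitted.
entity = {
    "id": "Q0",
    "claims": {
        "P18": [{"rank": "preferred"}],
        "P569": [{"rank": "normal"}],
        "P1082": [{"rank": "preferred"}, {"rank": "normal"}],
    },
}

print(lone_preferred(entity))  # ['P18']
```

Given the concern that a lone preferred value may be what is left after a normal-rank statement was deleted by mistake, a report of the flagged properties for human review may be a safer first step than resetting the ranks automatically.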

Should "imported from" be removed if a valid source is given?

Often, a statement is imported from a Wikipedia, and a reference imported from Wikimedia project (P143): <some wikipedia> is given. Then later, a valid source is added using for example stated in (P248). In such cases, I would think that the imported from Wikimedia project (P143) could be removed, but I am not aware of such efforts, e.g. by bots. Why is this not done on a large scale? Steak (talk)

Possibly there'd be edge cases of values imported from wikipedia to replace an existing value, where the importer didn't remove the 'good source' reference? The wrong citation would be removed. --Tagishsimon (talk) 01:42, 12 November 2019 (UTC)
Because there was never a decision to remove those references. So feel free to remove them when you see them. Snipre (talk) 03:27, 12 November 2019 (UTC)
Not sure.
Not too long ago, we made these "imported from" sources more detailed as users had trouble finding the corresponding information from the edit history.
I don't think it matters if users delete or overwrite one or the other imported from Wikimedia project (P143). --- Jura 07:24, 12 November 2019 (UTC)
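The check a bot would need for this cleanup can be sketched as below. It is an illustration only, not an existing tool: the function name is hypothetical, and the statements are made-up, reduced versions of the statement JSON layout (a "references" list whose entries carry a "snaks" mapping keyed by property ID).

```python
IMPORTED_FROM = "P143"  # imported from Wikimedia project
STATED_IN = "P248"      # stated in


def removable_import_refs(statement):
    """Return indexes of 'imported from' references that sit next to a
    'stated in' reference on the same statement."""
    refs = statement.get("references", [])
    if not any(STATED_IN in ref.get("snaks", {}) for ref in refs):
        return []
    return [
        index
        for index, ref in enumerate(refs)
        if IMPORTED_FROM in ref.get("snaks", {})
        and STATED_IN not in ref.get("snaks", {})
    ]


# Hypothetical statements; snak contents are omitted (empty lists).
sourced = {"references": [
    {"snaks": {"P143": []}},
    {"snaks": {"P248": [], "P813": []}},
]}
import_only = {"references": [{"snaks": {"P143": []}}]}

print(removable_import_refs(sourced))      # [0]
print(removable_import_refs(import_only))  # []
```

The sketch cannot tell whether the "stated in" reference actually supports the current value, which is the edge case raised above: if an import overwrote the value but left the older source in place, removing the "imported from" reference would leave the wrong citation behind.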

A problem: how to record the source which was quoted erroneously by many secondary sources?

Wow, this is a bit of a complicated one. There are currently two items for French-born illustrators with the same name: Louis Huard (Q21555912) (1813-1874) and Q3262203 (?-1842). Information about the one who died in 1842 is very scarce. The more I try to find him, the more I suspect he doesn't exist: it's just a false death date reported for the other one. I'm not the only one who thinks so, see here, which cites a number of authoritative-sounding sources, although I haven't been able to read any of them. And I think I found where the error originated, too: in 1848, in Siret's "Dictionnaire historique des peintres de toutes les écoles" there is an entry for Louis Huard which contains the date "*1842". Now, the asterisk means that that is a floruit date! Could someone have simply misread Siret, and their mistake has been repeated ad infinitum since? How should all this be recorded in WD? Levana Taylor (talk) 01:02, 12 November 2019 (UTC)

Probably by merging the items, deprecating the 1842 date with a reference pointing to the 1848 publication and a qualifier reason for deprecation (P2241) with value error in referenced source(s) (Q29998666). (I guess any and all of the wrong sources could be listed against the 1842 date.) It's probably outside the scope of Wikidata, or at least this item, to be able to indicate the way in which this wrong source propagated. --Tagishsimon (talk) 01:36, 12 November 2019 (UTC)
Further discussion at Talk:Q21555912 --Levana Taylor (talk) 18:27, 12 November 2019 (UTC)

Merging Education in the Netherlands by municipality (Q56751655) and Education in the Netherlands by city or town (Q8407505)?

@Joshbaumgartner: Since I believe that the topics "Education in the Netherlands by municipality" (Q56751655) and "Education in the Netherlands by city or town" (Q8407505) covered a similar scope, I attempted a merge of the entries but found that there were two separate commons categories.

There was a discussion about merging the Commons entries at Commons:Commons:Categories for discussion/2019/09/Category:Education in the Netherlands by city and the entries have been merged. At the end I received the suggestion that there should be further discussions before merging the wikidata entries. What further steps should be taken before I move forward with an attempt to merge these entries?

Thanks WhisperToMe (talk) 22:22, 12 November 2019 (UTC)

@WhisperToMe: I removed the commons cats from Category:Education in the Netherlands by municipality (Q8407505) a few hours ago when I merged the categories on Commons. I am not sure what would be preventing the merge with Q56751655. Josh Baumgartner (talk) 22:29, 12 November 2019 (UTC)