Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}}, the first time you mention an item, or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata connect
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/02.


Xantus's Murrelet

We seem to be having a problem at Xantus's Murrelet (Q46338167), which User:Succu persists in repeatedly (five times, so far) trying to merge into one or another item about patently different concepts; or from which he removes cited statements. Given previous difficulties I and other editors have experienced when attempting to discuss similar matters with that user, I'm raising it here, and not on the item's talk page which presumably has no other watchers. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:16, 26 December 2017 (UTC)

And a sixth. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:34, 26 December 2017 (UTC)
Mind counting your own reverts too? The item was originally created for the eBird entry xanmur. This is about two species called en:Xantus's murrelet (= Scripps's Murrelet (Q3120531) and Synthliboramphus hypoleucus (Q1276043)). Then Mr. Mabbett added ABA bird ID (P4526) = xanmur, which refers only to the common name „Xantus's murrelet“ and a duplication of the value ARKive ID (P2833)=xantuss-murrelet/synthliboramphus-hypoleucus. Finally (after some reverts) he claimed taxon name (P225) = Synthliboramphus hypoleucus (=Synthliboramphus hypoleucus (Q1276043)) about this item. Maybe he could explain here why, and on what basis, he thinks this is a „patently different concept“. --Succu (talk) 21:00, 26 December 2017 (UTC)
I'm glad that Succu has confirmed that the item in question is about a different concept to the items to which he has variously redirected it (albeit he is confused as to why this is so; and about the edits I have made to the item). Perhaps he will now cease doing so? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:14, 26 December 2017 (UTC)
I'm confirming nothing. I asked for an explanation. --Succu (talk) 21:17, 26 December 2017 (UTC)
"The item ... about two species called en:Xantus's murrelet (= Scripps's Murrelet (Q3120531) and Synthliboramphus hypoleucus (Q1276043))". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:51, 26 December 2017 (UTC)
Hence the first merge. Is this item about two species? Would be nice if you could explain your viewpoint to other readers of this topic. --Succu (talk) 21:59, 26 December 2017 (UTC)
Your first merge was to an instance of Wikimedia disambiguation page (Q4167410). My viewpoint is that Q46338167 represents a different concept to any of those with which you have tried to merge it. I'm also sure "other readers" can read both the item's description, and the sources used. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:24, 26 December 2017 (UTC)
So no explanation at all of why your new item is a „patently different concept“. Different from other species? Repeat: Is this item about two species? Would be nice if you could explain your viewpoint to other readers of this topic. Looks like you are unwilling to do so, Mr. Mabbett. --Succu (talk) 22:33, 26 December 2017 (UTC)
To make it easier for you, is your new item Xantus's Murrelet (Q46338167) about:
  1. the two species Scripps's Murrelet (Q3120531) and Synthliboramphus hypoleucus (Q1276043) supported by xanmur
  2. the common name „Xantus's murrelet“ supported by ABA bird ID (P4526) = xanmur
  3. the species name Synthliboramphus hypoleucus (Q1276043) supported by ARKive ID (P2833)=xantuss-murrelet/synthliboramphus-hypoleucus
If your answer is "all of them" (=current status) then please explain it to us. Thanks in advance. --Succu (talk) 22:58, 26 December 2017 (UTC)
No Succu, there's explanation aplenty. My reason for raising the matter here is to solicit third-party input. I won't be answering questions such as yours, based on false premises. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:24, 27 December 2017 (UTC)
Then please list my „false premises“ and explain to me why they are wrong. But I don't think you have any real arguments. Otherwise it would be easy for you to give them. By the way: do you think giving only an ISBN like 0198540329 is a sufficient source? --Succu (talk) 21:01, 29 December 2017 (UTC)
OK, I did another merge. --Succu (talk) 22:44, 30 December 2017 (UTC)
Mr. Mabbett? ISBN 0198540329 stands for what book? On which page does this ISBN support your view? --Succu (talk) 21:14, 23 January 2018 (UTC)
It's a simple question, Mr. Mabbett. Mind to respond? ---Succu (talk) 21:54, 26 January 2018 (UTC)
I don't see a good reason to reply here and generally think it makes more sense to have such a discussion on the talk page by pinging relevant Wikiprojects. ChristianKl❫ 12:59, 27 December 2017 (UTC)
I agree with ChristianKl. I must admit I am completely mystified with what concept Andy Mabbett has in mind. Certainly the item as it now is, seems inconsistent with any way of expressing any concept ever included in Wikidata so far. - Brya (talk) 05:36, 28 December 2017 (UTC)
You're "completely mystified" and - according to your comment on the item's talk page, are "guessing" what it represents; yet you see fit to make changes to the item, which are unsupported by the sources used (and you offer no new sources). That's not a healthy way to proceed. I have again fixed your broken indenting. Wilfully mis-indenting your comments, having been told that doing so is harmful, and having been given advice on how to do so correctly, is disruptive. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:34, 29 December 2017 (UTC)
The problem is that there is very little offered in the way of sources. I see just the ARKive ID link, and given how much junk we already suffered from that source, it is a frail reed to lean anything on.
        And please don't "fix" my comments: you should restrict your religious [?] beliefs to your own comments. - Brya (talk) 11:38, 29 December 2017 (UTC)
There are at least three sources used on the item; none of which are from ARKive. Please stop posting falsehoods. And like I said; disruptive. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:48, 29 December 2017 (UTC)
ARKive ID (P2833)=xantuss-murrelet/synthliboramphus-hypoleucus states this is Synthliboramphus hypoleucus (Q1276043). ---Succu (talk) 21:01, 29 December 2017 (UTC)
BTW, same is true for your weblink to the entry at US ECOS. --Succu (talk) 21:10, 29 December 2017 (UTC)

I've just undone a seventh attempt by Succu to delete this item through a merger to an inappropriate target. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:28, 30 December 2017 (UTC)

Please argue here and do not revert blindly. --Succu (talk) 07:05, 31 December 2017 (UTC)
And an eighth... Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:02, 31 December 2017 (UTC)
Comment: Perhaps it's time to find a source that they are actually different... Matěj Suchánek (talk) 21:10, 31 December 2017 (UTC)
You can find one on Xantus's Murrelet (Q46338167). HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:18, 1 January 2018 (UTC)
Then the easiest way to settle the matter is to cite your "reference" here. --Succu (talk) 13:56, 1 January 2018 (UTC)
The American Ornithological Union checklists "are the official source on the taxonomy and nomenclature of birds found in this region, including adjacent islands." see here: http://www.americanornithology.org/content/checklist-north-and-middle-american-birds . Looking at the current checklist here: http://checklist.aou.org/taxa/ we have Synthliboramphus scrippsi (Scripps's Murrelet) and Synthliboramphus hypoleucus (Guadalupe Murrelet). Therefore I believe, officially Xantus's Murrelet has been split, I don't think what other authorities say is relevant. A species with name Synthliboramphus hypoleucus (Xantus's Murrelet) was deleted from the AOU list as per the 53rd supplement in 2012 http://americanornithologypubs.org/doi/pdf/10.1525/auk.2012.129.3.573 I'm not really a wikidata expert, but I would suggest the best course of action is to retain Xantus's Murrelet (Q46338167) but change the instance of (P31) from taxon (Q16521) to something which indicates that this is a formerly recognised taxon, but which has been deleted. I had a quick look but couldn't find an item that would describe that, but this must have happened before. Species are split all the time. I don't really think that Xantus's Murrelet (Q46338167) should be merged into Synthliboramphus hypoleucus (Q1276043) they are different. Just my twopenneth. JerryL2017 (talk) 15:23, 1 January 2018 (UTC)
Wikidata follows a NPoV policy, not a Single Point of View policy; for that try Wikispecies. So, the American Ornithological Union checklists are only one source, not THE source. Of course, it may be possible to start creating items based only on American Ornithological Union concepts, but this would be a fairly big departure from existing practice. - Brya (talk) 05:37, 2 January 2018 (UTC)
We do not model different taxon concepts this way. That's why I merged the items several times and was asking for a good reference to proceed. None was given. --Succu (talk) 16:04, 2 January 2018 (UTC)
Here is the defining reference that concludes Xantus's Murrelet (Q46338167) is 2 species: http://www.bioone.org/doi/full/10.1525/auk.2011.11011 Based on that paper, the AOU adopted that taxonomy as detailed in the 53rd supplement, http://americanornithologypubs.org/doi/pdf/10.1525/auk.2012.129.3.573 which I had already given above. However, given that not all sources have yet adopted this taxonomy, and based on what others have said here and what is stated in the wikidata taxonomy project guidance, it would seem sensible to retain Xantus's Murrelet (Q46338167) for the time being, with the correct links to sources that are still using the former taxonomy. That said, there are issues with Synthliboramphus hypoleucus (Q1276043). This item refers to the "split" Guadalupe Murrelet but has links to sources that do not recognise the split. It also includes the alternative name of Xantus's Murrelet, which is confusing. JerryL2017 (talk) 17:44, 2 January 2018 (UTC)
Rangewide population genetic structure of Xantus's Murrelet (Synthliboramphus hypoleucus) (Q29541111) is proposing a taxonomic opinion about elevating the two subspecies Synthliboramphus hypoleucus hypoleucus (Q47012916) and Synthliboramphus hypoleucus scrippsi (Q47012925) of Synthliboramphus hypoleucus (Q1276043) to species level. The American Ornithological Society (Q465985) followed the recommendation. I do not see that Xantus's Murrelet (Q46338167) is expressing this. --Succu (talk) 18:37, 2 January 2018 (UTC)
Yes, whatever the intent is, execution seems sloppy. - Brya (talk) 04:01, 3 January 2018 (UTC)
Since Mr. Mabbett refuses to argue here I will merge both items once again. --Succu (talk) 19:27, 5 January 2018 (UTC)
And if you do, absent a consensus here, you will be reverted again; for the reasons already given. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:01, 5 January 2018 (UTC)
I expected this. Other people try to argue. You are not. What a pity for you. Hopefully you do not miscount your reverts. --Succu (talk) 22:10, 5 January 2018 (UTC)
And this revert of him. --Succu (talk) 07:21, 6 January 2018 (UTC)
And this revert of him. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:35, 6 January 2018 (UTC)
Obviously you are unwilling to give a reasonable answer here. Probably you can't and are defending the item only because you've created it. --Succu (talk) 21:02, 6 January 2018 (UTC)
No progress here from Mr. Mabbetts side. --Succu (talk) 21:01, 8 January 2018 (UTC)
Now at AN. --Succu (talk) 20:59, 9 January 2018 (UTC)
ok from the point of view of a taxonomist. Reading the paper from 2011 linked above by @JerryL2017: the two taxa in question were considered subspecies, however overlap and do not interbreed in sympatry. By the definition of a subspecies this is not possible, hence they should be species and have been recommended as such by the paper also. As such from this viewpoint you have two species and should have two items one for each. Any other refs, unless you find one that refutes this primary ref with data not opinion, are irrelevant. I see no reason for any further argument. The nomenclatural act has been made, follow it. Where the common names go whatever, they are vernacular names and not relevant to the concept of the species. That is my view on this so I would suggest fixing the pages to reflect this and as for the IOC, ummm they are not a primary taxonomic reference so why would you be adamant about it. Cheers Scott Thomson (Faendalimas) talk 21:46, 13 January 2018 (UTC)
All major bird checklist (including IOC) followed this viewpoint. The "official" english common name of Synthliboramphus hypoleucus was changed from „Xantus’s Murrelet“ to „Guadalupe Murrelet“. --Succu (talk) 22:57, 13 January 2018 (UTC)
Xantus's Murrelet (Q46338167) represents a concept, described by the three reliable sources used on that item, which we can refer to, for the sake of brevity, as "A". You are saying that a different source refers to the concept "B". The Wikidata model, as I understand it, is that two concepts should be represented by different items (with, if applicable, mutual "said to be the same as" properties). However, if your contention is that "A" and "B" are the same concept, but with different attributes, then the Wikidata model is to include properties with values stating both attributes, cited to their respective sources. What the Wikidata model does not do is pretend that the (reliably-cited) concept "A" does not exist. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:17, 13 January 2018 (UTC)
Name it (the concept)! --Succu (talk) 23:23, 13 January 2018 (UTC)
Umm..... I am seriously not getting this. Ok, three reliable sources: I looked on the page and did not see anything I would consider reliable. Please correct me if I am wrong, do you have separate pages for the scientific name and the vernacular names?? I do not understand why you would do that, but no matter. The vernacular name needs to follow the scientific name according to the most recent primary sources. No, I do not consider the collective political opinions of eBird or the IOC and similar checklists as primary sources. When the species was split I imagine this did require modification of the accepted common names. As such you would use the primary refs that state this, coupled with the research paper that did the nomenclatural act, to move the common names accordingly. By the way, a taxon is not a concept, it is a hypothesis; the concept is the theory of how species are differentiated, i.e. the grounds for calling something a species. That is, for example, the Biological Species Concept. However, a species is a hypothesis circumscribed under a concept of the primary author's choosing. So please, what are we getting at here, because your paragraph, Andy, makes no sense. You are reporting science, it needs to reflect science. Cheers Scott Thomson (Faendalimas) talk 23:48, 13 January 2018 (UTC)
OK, I will try and summarise and make some recommendations. Up until about 2012/2013 the ornithological world recognised 5 species in the genus Synthliboramphus (Q287293). One of these species had the Latin name Synthliboramphus hypoleucus, common name Xantus's Murrelet. This is Xantus's Murrelet (Q46338167). In 2012, in the paper cited above, it was proposed that the 2 recognised sub-species of Xantus's Murrelet (Q46338167) (Synthliboramphus hypoleucus scrippsi (Q47012925) and Synthliboramphus hypoleucus hypoleucus (Q47012916)) be considered separate species and that Xantus's Murrelet (Q46338167) be deprecated. This hypothesis was accepted by the American Ornithological Union (they have a scientific committee which decides on such matters) and their taxonomy was updated so that the genus Synthliboramphus has 6 recognised species including: Synthliboramphus hypoleucus, common name Guadalupe Murrelet (Synthliboramphus hypoleucus (Q1276043)) AND Synthliboramphus scrippsi, common name Scripps's Murrelet (Scripps's Murrelet (Q3120531)). Subsequently a number of other sources have also adopted this new taxonomy (and I make no comment about how reliable these might or might not be), e.g. Avibase ID (P2026), IUCN taxon ID (P627), eBird ID (P3444) and others. However, a number of others that are variously cited in Wikidata's items have not adopted the new taxonomy. How do I know they have not adopted the new taxonomy? Because when you look at it in detail it does not include Scripps's Murrelet (Q3120531). Examples of these include: NCBI Taxonomy ID (P685), ITIS TSN (P815) and WoRMS-ID (P850), although there are probably others; I haven't yet reviewed all sources. So, why have these sources not adopted the new taxonomy recommended by experts in the field? I can see 3 reasons:
  1. They don't agree with the hypothesis that Xantus's Murrelet is actually 2 separate species (I believe this is unlikely in this case).
  2. They haven't updated their taxonomy (highly likely) and over time will probably update their data.
  3. They have a scientific requirement to retain Xantus's Murrelet in their taxonomy because of pre-existing data that was linked to Xantus's Murrelet and not either of the sub-species. A good example of this is eBird ID (P3444), where their taxonomy has Guadalupe AND Scripps AND (by their nomenclature) an entity called Scripps/Guadalupe Murrelet (Xantus's Murrelet).
So, my conclusion is that Wikidata should retain Xantus's Murrelet (Q46338167): some sources are still using the former taxonomy and some bona fide sources WILL ALWAYS have an entity that is best described as Xantus's Murrelet. However, having said that, there are some outstanding questions we should resolve:
  1. For Xantus's Murrelet (Q46338167), what is the best instance of (P31)? Is it taxon or some other concept that better captures its status?
  2. The existing Wikidata items should be updated to ensure each item is linked to sources that best reflect that item, i.e. Synthliboramphus hypoleucus (Q1276043) should NOT link to sources that have a taxonomy that does not include Scripps's Murrelet.
  3. Other data such as images and synonyms should be updated to make it less confusing. (NOTE: if we have agreement on this I am happy to go and make these changes IF I have assurance that they won't just all be reverted.)
  4. Consideration should be given within the Wikidata taxonomy project to additional avian taxonomy properties (some key sources are not available as properties, which isn't helping here).
  5. I've seen various comments on here that some sources are "unreliable". If there is a consensus that they are unreliable, then why do we retain them?
  6. IF we can resolve this case, consideration should be given as to how this is more widely applied within Wikidata. For instance, it is very common practice across many data recording schemes to "combine" species into groups when they are difficult to identify. I personally think Wikidata should be able to reflect that; we are a broad data resource, not a wildlife taxonomy.
Thanks, I hope this is useful and moves the discussion on a little. JerryL2017 (talk) 09:58, 14 January 2018 (UTC)
Maybe just a slip of a pen, JerryL2017: „the genus Synthliboramphus has 6 recognised species“ but I count only five... --Succu (talk) 23:16, 14 January 2018 (UTC)
Sorry Succu, you are correct.

OK, what I have been trying to present is the taxonomic view of the situation; it is then for Wikidata to determine how best to reflect the science. In answer to your questions, from my point of view: this is a wonderful example of why common names are a pain, unprofessional and honestly useless. I agree with you @JerryL2017: that the issue of different nomenclatures between sources is most likely lack of updating.
1. There are three common names available for 2 species: the names Xantus's Murrelet and Guadalupe Murrelet both apply to the scientific name Synthliboramphus hypoleucus, however the former is now considered deprecated, and Scripps's Murrelet applies to Synthliboramphus scrippsi. When the species was split the current common name goes with the species that retained the original combination. There is no justification in retaining the common name Xantus's Murrelet for Synthliboramphus scrippsi, or honestly at all. The name Xantus's Murrelet does not technically apply to a taxon anymore; it is considered deprecated, and is at best an outdated name of historic value only.
2. Agreed, removal of sources that are outdated in their nomenclature will avoid confusion, or if they must be stated have it as "stated in and as", so that the page is generally set up with the current nomenclature but makes note of any departures from it without supporting them.
3. Agreed, all information possible should be updated, including where necessary file names and metadata for images. I would not revert anything. Cannot speak for others. But I think if we hammer out an agreed position I believe people are professional enough to follow it.
4. Avian taxonomy should honestly be following the ICZN code, which they do not. Further they have made recent efforts to dictate to other fields of taxonomy that their viewpoint should be followed, to a massive backlash. However, this is not our problem, we present the science we do not revise it. If you feel the need for further avian properties please elucidate these.
5. Your guess is as good as mine. I think there is a generalised tendency in projects like this to grab every online reference possible, unfortunately with little consideration of the quality of what is presented and no fact checking. Basically the equivalent of "Google says this, it must be true". Again I think this is also unprofessional; I think sources that are questionable should be examined by the Wikidata taxonomy project for validity and, if rejected, removed.
6. The act of combining species into groups is I think beyond the realm of a database. This is done through analysis of the given issue. Wikidata should be presenting the data, with reliable and good sources. As best as possible the primary taxonomic literature, in the scope of this issue, the thing that can come out of this is a better discussion on what is a good resource, the acceptance that complex cases need to be analyzed using only primary references, and that in taxonomic issues the relevant codes are the primary determinant on availability and validity. That is, if a name is published in accordance with the Code it is to be accepted as valid or refuted, a point the avian taxonomists breach the code on repeatedly.
Cheers Scott Thomson (Faendalimas) talk 15:14, 14 January 2018 (UTC)

Yes, it was established early on that in the real world there are/have been two circumscriptions for Synthliboramphus hypoleucus. That is not a problem. There appear to be several problems:
  • Is this wider/older circumscription notable enough to rate a separate item of its own?
  • Is this wider/older circumscription indeed what Andy Mabbett intends with Q46338167, given that he has already denied this. If not, what does he intend?
  • Is it worthwhile discussing if Wikidata should have items for concepts denoted by a standardised common name set by some bird organization? These clearly exist, but are they notable enough?
  • Given that bird organizations use deviant 'scientific names' with rules of their own, should we have a property for that? Something like "Avian scientific name" or more general "deviant taxon name, used by special interest groups" (to include butterflies)? Clearly, it is not a good idea to put non-Code-compliant names in P225.
Brya (talk) 18:10, 14 January 2018 (UTC)
I agree with your analysis @Brya: specifically:
Is this wider/older circumscription notable enough to rate a separate item of its own? I do not think so.
Is this wider/older circumscription indeed what Andy Mabbett intends with Q46338167, given that he has already denied this. If not, what does he intend? My impression was that this is what is being suggested here, I also acknowledge Andy has denied this, but I have no idea what the purpose of this is in that case.
Is it worthwhile discussing if Wikidata should have items for concepts denoted by a standardised common name set by some bird organization? These clearly exist, but are they notable enough? I do not think they are notable in any great degree, unless they are a highly notable list. I would encourage the avoidance of confusion as a priority.
Given that bird organizations use deviant 'scientific names' with rules of their own, should we have a property for that? Something like "Avian scientific name" or more general "deviant taxon name, used by special interest groups" (to include butterflies)? Clearly, it is not a good idea to put non-Code-compliant names in P225. I prefer something along the lines of your second option, since it can be applied outside Aves (Birds) this can apply to Amphibians also. But I definitely agree anything that is non code compliant should be avoided in almost any circumstance. Cheers Scott Thomson (Faendalimas) talk 20:43, 14 January 2018 (UTC)
All concepts are still in use at Wikimedia projects: de:Lummenalk is about the old concept, en:Guadalupe murrelet is about the new concept. What e.g. species:Synthliboramphus hypoleucus is about remains unclear to me. So how to deal with #2)? Where should we place a) outdated Wikimedia articles, b) outdated external identifiers (in case we can judge they are)? And yes, this thread needs some insights by Mr. Mabbett. --Succu (talk) 23:01, 14 January 2018 (UTC)
The Wikispecies account is about the species in question as part of this, we do not worry so much about common names there as its not really what we are about. Vernacular names get added by people occasionally as they see fit, I ignore them as best as I can. If someone wants to add the english common name they can. Cheers Scott Thomson (Faendalimas) talk 00:54, 15 January 2018 (UTC)
I do not care much about common names, but I care about references. The Wikispecies article has only a reference to the original combination Brachyramphus hypoleucus (not mentioning it at all). So it's hard to know about which taxon concept this entry is. :( --Succu (talk) 19:30, 16 January 2018 (UTC)
Fair enough, sorry I do not do the birds. I would not know the relevant refs. When I do turtles I already have pretty much all the literature, so is easier for me. However, since species:Synthliboramphus scrippsi also exists, then the other species can only be considering the new combination. Cheers Scott Thomson (Faendalimas) talk 00:17, 17 January 2018 (UTC)
But this is only a guess. Even the genus species:Synthliboramphus has no reference to a current taxonomic treatment... --Succu (talk) 20:38, 19 January 2018 (UTC)

Any suggestions on how to resolve this „problem“? Mr. Mabbett is not responding. --Succu (talk) 22:04, 30 January 2018 (UTC)

I will update Wikispecies with the appropriate refs to show the two currently accepted species, and with a common name for each, with the relevant references (if anyone has them please send them to me). I need the original descriptions of both species and the treatment that recognises them as currently valid species. If you wish, you can then use this to model your database entries. That is up to you. My suggestion is to follow what has largely been discussed here, in the absence of any other explanation. So I suggest you have data entries for each of the two species with the currently accepted common name for each taxon, with the original refs. I further suggest that you could delete the entry for the now deprecated common name and list it only as an alternative, older, no longer used name for the species Synthliboramphus hypoleucus in older treatments. Cite the paper that split them as species to justify this. I would suggest calling the two data entries by the scientific name with the common names as description. This way the taxa are clearly defined and it can be noted the common names are less clear. Just my suggestions, your call. Cheers Scott Thomson (Faendalimas) talk 22:17, 30 January 2018 (UTC)
Please refer to my reply to you (so much for not responding!), above, time-stamped "23:17, 13 January 2018 (UTC)". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:13, 30 January 2018 (UTC)
Your usual gibberish, Mr. Mabbett. You didn't respond to the questions directed to you (what concept, give a page). --Succu (talk) 22:57, 31 January 2018 (UTC)
OK, I will merge both items again. --Succu (talk) 19:07, 5 February 2018 (UTC)
If you do, I will revert you, because nothing here refutes my reason for doing so previously. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:16, 5 February 2018 (UTC)
Why would you revert? I thought consensus was what was followed here. Cheers Scott Thomson (Faendalimas) talk 12:56, 6 February 2018 (UTC)
So, why would you revert, Mr. Mabbett? Is "nothing here refutes my reason for doing so previously" an argument? --Succu (talk) 22:51, 7 February 2018 (UTC)
As you can see, my arguments are laid out above. As a courtesy to our fellow editors, I see no need to repeat them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:16, 7 February 2018 (UTC)
"As a courtesy to our fellow editors [...]", Mr. Mabbett? An „interesting argument“. I'm one of your fellow editors. I don't think you have an answer to my questions, because you have been avoiding giving an unequivocal answer for weeks now. --Succu (talk) 22:10, 8 February 2018 (UTC)
Done. --Succu (talk) 22:55, 8 February 2018 (UTC)
This was again reverted by Mr. Mabbett with the comment „per project chat“. --Succu (talk) 20:21, 14 February 2018 (UTC)
Looks like we have to wait till eternity to get an explanation from Mr. Mabbett. --Succu (talk) 18:40, 20 February 2018 (UTC)

Despite the above, Succu's latest reason for reverting me was "no argument given". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:10, 24 February 2018 (UTC)

Where above? So what's this item about? You didn't tell us. --Succu (talk) 13:32, 24 February 2018 (UTC)
Still no explanation by Mr. Mabbett. --Succu (talk) 21:53, 24 February 2018 (UTC)

Plants of the World Online database

Plants of the World online (at [1]) looks to be an important database in future for plants. So could it be set up please? An example of where it is needed is Malva acerifolia (Q47519412) – see discussion page for the exact reference. Peter coxhead (talk) 11:06, 30 January 2018 (UTC)

That should not be a problem, although your example is testimony of a wrong attitude. - Brya (talk) 11:45, 30 January 2018 (UTC)
Started the ball rolling. - Brya (talk) 12:06, 30 January 2018 (UTC)
These values are already stored in IPNI plant ID (P961), with the PotW link as a third-party formatter URL (P3303). For the above example, the P961 value is 561509-1 which gives a PotW URL of http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:561509-1 Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:20, 30 January 2018 (UTC)
@Pigsonthewing: Andy, please see my last comment at Wikidata:Property proposal/Plants of the World online. My only interest is to get Plants of the World online included in appropriate Wikidata items so that can be made to show up in articles when {{:en:Taxonbar}} is added. Please help to get this done in whatever way is appropriate. Peter coxhead (talk) 22:14, 2 February 2018 (UTC)
Before seeing your comment here, I had just written over on en.Wikipedia: "Taxonbar can be made to display a link to the PotW site, using values from Wikidata property P961". You don't need any change to Wikidata for that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:51, 3 February 2018 (UTC)
That is an approach that could be adopted, provided one does not mind that it works for only part of the cases. - Brya (talk) 03:40, 10 February 2018 (UTC)
Please provide an example of a case where it does not work. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:36, 10 February 2018 (UTC)
This was already done at Wikidata:Property proposal/Plants of the World online. --Succu (talk) 19:17, 10 February 2018 (UTC)
No such example is provided on that page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:59, 10 February 2018 (UTC)
It is. And tons of plants not covered by IPNI. --Succu (talk) 22:02, 10 February 2018 (UTC)
Please give an example - here - of some of the "tons" of plants not covered by IPNI, which use IDs matching the definitions in the property proposal. And, if I'm wrong, prove it: give the former example here, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:59, 11 February 2018 (UTC)
Succu provided urn:lsid:ipni.org:names:128853-1 as an example missing from POWO; Peter coxhead provided urn:lsid:ipni.org:names:503872-1. ArthurPSmith (talk) 16:00, 12 February 2018 (UTC)
Those are examples in the wrong direction; the model proposed (giving a valid POTW URL as a reference, to indicate that a page exists on POTW) would clearly not be used in such cases. They are not "plants not covered by IPNI". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:01, 12 February 2018 (UTC)
No such examples, then. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:16, 17 February 2018 (UTC)
Wrong direction: Your intention was to spare us a property because this property could be "remodeled" via third-party formatter URL (P3303). This is untrue. --Succu (talk) 21:58, 17 February 2018 (UTC)
False. And I asked you to "Please provide an example of a case where it does not work". Still no such examples. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:18, 18 February 2018 (UTC)
Probably only a faux pas (Q1398885) at your side, Mr. Mabbett? --Succu (talk) 22:21, 20 February 2018 (UTC)

Wider issue

I have held off from creating this property because of the heated debate (which is apparently still on). I think we should just have one general discussion about these cases. When an identifier is shared by multiple databases, but these databases do not have the same coverage of that identifier, what do we do?

  • Either we create two separate properties, holding the same values but with different formatter URLs (and different coverage obviously)
  • Or we find another way to indicate that an identifier is available in one of the databases (Andy suggested to use references like this).

This is a fairly general problem that was raised in other proposals (such as Wikidata:Property proposal/Google Arts & Culture entity ID), so it would be worth settling it once and for all… Should we have an RFC or something like that? Or is it overkill because the consensus for one solution or another is already clear somehow? − Pintoch (talk) 19:04, 24 February 2018 (UTC)

A similar issue arises at Wikidata:Properties for deletion#eFlora properties, where we currently have two properties, and potentially twenty or more, for a single set of identifier values. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:07, 24 February 2018 (UTC)
@Pintoch: We have properties for small datasets (some hundreds of usages). We have a lot of (nearly) unused properties. Creating another property shouldn't be that problematic. The usage recommendations for third-party formatter URL (P3303) are a bit fuzzy. If another database uses the same set of identifiers, that is fine with me. But what about sub-/supersets? Assuming that third-party formatter URL (P3303) is intended to be used as an alternative reference, this won't work. Supersets will return 404 errors (POWO). The same is true if we want a direct link to a describing site (eFloras) to make it available as a reference. Mr. Mabbett, as the proposer of third-party formatter URL (P3303), could probably help to sort this out. --Succu (talk) 23:41, 24 February 2018 (UTC)
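The subset/superset concern can be sketched with plain sets. The three identifier values are the ones quoted in this discussion; which of them have a POWO page is taken from what participants reported above, not checked against either database, and the function name `split_by_coverage` is hypothetical.

```python
# IPNI identifier values mentioned in the thread.
ipni_ids = {"561509-1", "128853-1", "503872-1"}

# Assumption for illustration: only the first value has a POWO page,
# as reported by participants above.
powo_covered = {"561509-1"}


def split_by_coverage(ids, covered):
    """Return (ids the second database covers, ids it does not)."""
    return sorted(ids & covered), sorted(ids - covered)


linkable, would_404 = split_by_coverage(ipni_ids, powo_covered)
print("safe to link:", linkable)
print("formatter would give a dead link:", would_404)
```

With one shared property and a third-party formatter, nothing on the item records which side of this split a value falls on. That information has to live somewhere, either in a second property or in a reference.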

Cleaning up our upper class tree

Starting point of this discussion : Reasonator upper classes of « series » as of 2018-01-31 14-12-59

Or in a textual form :

Classification

name | description
Entity | something that exists
Object | technical term in modern philosophy often used in contrast to the term subject
Abstract object | object with no physical referents
Concept | mental representation or an abstract object or an ability
Mathematical concept | abstract entity in mathematics
Mathematical object | abstract object in mathematics
Class | collection of sets in mathematics that can be defined based on a property of its members
Metaclass | in knowledge representation, a class of classes
Type | kind or variety of something
Group | summarizes entities with similar characteristics together

We learn that a group, say the group of people that will talk about this subject, is « an object with no physical referents ». Which is absurd. Assuming good faith, nobody wanted that in the first place. This is the result of a complex series of edits and merges by different people.

But it’s a big problem that we let that happen. Let’s try to explain and dig a little bit. First attempt to explain :

Items such as group (Q16887380) lack a proper external reference and are edited, merged and so on a lot, by a lot of contributors, some of whom have been blocked for toying around. Everybody seems to edit them as he or she wants, in good faith, or in some cases probably to highlight this very problem, without actually starting a discussion to settle the problem with everybody. This creates a big mess. Nobody can actually tell what this item is supposed to mean beyond its label.

I see several problems here :

  • this item is linked to a Wikipedia article. So far so good. But … this article may lack the precision in defining the concept of group needed to avoid the mess at the top of our class tree. In this case, it’s a « simple » article in English that relies on « common sense » to define a group as a physical object whose parts are also physical objects. There is no way this definition fits any kind of abstract object. There may exist groups of abstract objects, however; there is for example a mathematical theory named « group theory » (group (Q83478)), but the relationship between it and a physical group, if any, is rather abstract …
  • As a consequence, it may not be a good idea to use the Wikipedia article as the main reference for the definition of the « group » concept. But creating more precise items, or items with different definitions of « group », puts them at risk of being merged back into this one, or trolled. Back to step one and the mess. For example, Fractaler trolled the concept of group to assimilate it to the concept of « class ».
  • We could decide that the statements on an item give some sort of definition of that item. For example, we could decide that one of the items labelled « group » is about groups of physical objects, by putting on it statements suggesting that groups are physical objects themselves, such as
    < group > subclass of (P279) < physical objects >
    and others as we extend Wikidata’s language, like defining that a group’s parts are physical themselves. I tried this approach, but it did not work: it seems Wikidatians removed some statements, added others, or merged items as they felt right, with the result that nobody knows what the initial intention was, including myself :/

But we actually need different items for different kinds of group. For example, for some reason we currently have a statement

< series (Q20937557) > subclass of (P279) < group (Q16887380) >

A series is supposed to be a group. By the current definition this entails that a specific series, say the « Friends » TV show, is a group of physical objects. This is obviously wrong. If a « group » class is to be a superclass of « series », then it should not be the item for groups of physical objects but one for groups of something more abstract. There seems to be a literature on the nature of audiovisual artworks: a random Google search on « film ontology » finds bibliography on the topic which points to whole books.

To avoid these problems, I think we should study some top-level ontologies and see what we can do with them. That would give us a way to link the definitions of our top-level classes to external and precise definitions, so that those items cannot be merged with items that have a different definition. By Wikidata’s nature, however, I think we cannot commit to one specific upper ontology (Q3882785). There is a neutrality of point of view that must allow us to represent and use different ontologies in Wikidata. This entails that, in my opinion, we have to use higher-level classification tools like metaclass (Q19478619) to classify the different classes themselves and to keep track of which item uses definitions from which ontology.

Following on from the problem of the diversity of definitions, there is also a potential problem with sitelinks. On these concepts, Wikipedia has philosophical articles about the history of those concepts, the different ways we can define them and so on. This entails that the philosophical view of ontological realism, in very short the point of view that an ontology is the description of objects of the real world, is hard to apply. Instead, for some concepts in Wikidata, are we committed to taking a position of ontological idealism (Q33442)? For example, if we try to describe the concept of luminiferous aether (Q208702), must we take the view that what is real in this concept is the theory our ancestors had in their minds, and not reality itself? The « universality » (the sum of all knowledge) of our projects does not make the task easy. Maybe it’s possible to just deprecate the statements about out-of-fashion or disproved concepts?

One source of confusion in our upper class tree, I think, is the mixing of concepts coming from topics such as « type theory » (type theory (Q1056428)), a mathematical theory close to set theory (Q12482) that defines some mathematical concepts of « classes » or « types », with the class concept used in Wikidata. Not to mention type system (Q865760) in programming, which can be viewed as applied type theory. Wikidata has the power to give its description of « type theory » or « class » in those domains. We’re also using our own concept of « class », maybe in a more loosely defined way than in mathematical axiomatic theories like Von Neumann–Bernays–Gödel set theory (Q278770). The problem arises when we mix up our class concept with some mathematical concept of class or type. Our concept of class then happens to become a subclass of « mathematical object », which messes up our tree a little more … I think this is a big variation on the problem of use–mention distinction (Q2577553) (use/mention). We’re confusing our usage of the concept with our description of these concepts, which creates a kind of self-reference (Q1129622) loop. I don’t think we want that, as it creates a lot of mess and a lot of confusion. That’s why I created items like « Wikidata class » (before being aware that someone had conceptualized the use–mention problem, which is a big help in being taken seriously :) ). They seem to have disappeared (merged?) over time, as some contributors did not understand the problem and purpose.

For programmers: I think some concepts from programming-language type systems, like « Generics in Java (Q379273) » or related concepts of generic programming (Q1051282) in other programming languages, like type class (Q1375130) or template (Q1411845), which have proven useful in programming, are of interest to us for solving the problem of the « series » item above, if we find the right inspiration. They allow us to define classes of classes, some kind of metaclasses in a loose sense. This should give a hint that our abstract « series » item, which could hypothetically represent a series of abstract or concrete objects, may exist and be useful but should be taken out of the superclasses of « TV series ». « TV series » may be an instance of it, not a subclass. A hint that we’re on such a road is the use of the « of » qualifier.

I think we should not conflate the concepts used for type systems in computing (datatypes) with our own (informal) type system concept, to avoid this. I think we should find ways to reflect the complexity of the different views on the world while still describing the world efficiently, without worrying too much about those difficulties. I hope this text will be of some help for this purpose, and is clear enough. Please feel free to ask anything and share your thoughts.

--Micru (talk) 21:46, 24 August 2014 (UTC) Tobias1984 (talk) TomT0m (talk) Genewiki123 (talk) Emw (talk) 03:09, 9 September 2014 (UTC) —Ruud 16:15, 9 December 2014 (UTC) Emitraka (talk) 14:32, 14 October 2015 (UTC) Bovlb (talk) 19:10, 21 October 2015 (UTC) Peter F. Patel-Schneider (talk) 22:21, 23 October 2015 (UTC) ArthurPSmith (talk) 15:51, 5 November 2015 (UTC) --Daniel Mietchen (talk) 20:53, 3 January 2016 (UTC) --Harmonia Amanda (talk) 22:00, 27 February 2016 (UTC) --Lechatpito (talk) --Andrawaag (talk) 14:42, 13 April 2016 (UTC) --ChristianKl (talk) 16:22, 6 July 2016 (UTC) --Cmungall Cmungall (talk) 13:49, 8 July 2016 (UTC) Cord Wiljes (talk) 16:53, 28 September 2016 (UTC) DavRosen (talk) 23:07, 15 February 2017 (UTC) Vladimir Alexiev (talk) 07:01, 24 February 2017 (UTC) Pintoch (talk) 22:42, 5 March 2017 (UTC) Fuzheado (talk) 14:43, 15 May 2017 (UTC) YULdigitalpreservation (talk) 14:37, 14 June 2017 (UTC) PKM (talk) 00:24, 17 June 2017 (UTC) Fractaler (talk) 14:42, 17 June 2017 (UTC) Andreasmperu Diana de la Iglesia Jsamwrites (talk) Finn Årup Nielsen (fnielsen) (talk) 12:39, 24 August 2017 (UTC) Alessandro Piscopo (talk) 17:02, 4 September 2017 (UTC) Ptolusque (.-- .. -.- ..) 01:47, 14 September 2017 (UTC) Gamaliel (talk) --Horcrux92 (talk) 11:19, 12 November 2017 (UTC) MartinPoulter (talk)


Notified participants of WikiProject Ontology (could have written that there but I think this should be more visible.) author  TomT0m / talk page 15:32, 31 January 2018 (UTC)

I don't think it's as bad a situation as you are conveying here. It sounds like "group" may need some cleanup. But for example, the class list you provide above doesn't seem to match what I see directly in wikidata: "series" is a subclass of "group", which is a subclass of "type", which subclasses "entity". That seems relatively simple, and while I am not certain if the "group"/"type" relation is the best, it makes some sense. Everything with abstract entities (just look at the issues with books & editions etc.) is hard to think about, so I generally encourage people to work on the areas of our ontology that are closest to physical reality, where things are easier. What I think is more concerning is the overuse of instance of (P31) for abstract concepts, when subclass of (P279) is the better relation. ArthurPSmith (talk) 17:01, 31 January 2018 (UTC)
I’m sure « group » is not a subclass of type. Say
< sheep herd > subclass of (P279) < herd >
< herd > subclass of (P279) < group >
and
< Bob’s herd > instance of (P31) < sheep herd >
Then if we also have group subclass of type, « Bob’s herd » is a type. But it’s not, it’s a herd. The type here is « sheep herd », as there are many examples of sheep herds. If we take « http://dbpedia.org/ontology/Group » as the DBpedia concept (as in an informal group of people), we get that it is « An Entity of Type : Class », that is Group rdf:type owl:Class. If we loosely take « class » as a synonym of « type », this means that group is an instance of type, definitely not a subclass (the relationship would be https://www.infowebml.ws/rdf-owl/subClassOf.htm ). Sometimes the use of instance of (P31) for abstract concepts is legitimate, for metaclassification (see User:TomT0m/Classification or
On the other hand a query such as
select distinct ?class  { 
  [] wdt:P31/wdt:P31/wdt:P31+ ?class .
} limit 20
which searches for classes that have instances that themselves have instances, and so on for at least 3 levels, returns: concept (Q151885), class (Q5127848), metaclass (Q19478619), formal ontology concept (Q19868531), philosophical concept (Q33104279), designation for an administrative territorial entity (Q15617994), descriptive item used as unit (Q22302160), first-order metaclass (Q24017414), second-order metaclass (Q24017465), third-order metaclass (Q24027474), type of software (Q28530532), Wikidata metaclass (Q19361238), type of fruit (Q28149961), form of government (Q1307214), term (Q1969448), product lining (Q3084961), classification scheme (Q5962346), economical concept (Q29028649), triad (Q29430681), system (Q58778).
It takes a long time to compute. That may indicate that the query is hard to compute, or that there are not many results, maybe a bit of both (it times out if we ask for more results). Nothing really scary, even if there is some dubious stuff. author  TomT0m / talk page 19:00, 31 January 2018 (UTC)
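The herd argument above can be replayed on a toy graph. This is only a sketch of the standard inference (an instance of a class is also an instance of every superclass of that class); the dictionaries and function names are hypothetical and hold nothing but the statements from the comment.

```python
from collections import deque

# subclass of (P279) edges from the example; the last one is the disputed statement.
subclass_of = {
    "sheep herd": {"herd"},
    "herd": {"group"},
    "group": {"type"},
}
# instance of (P31) edges.
instance_of = {"Bob's herd": {"sheep herd"}}


def superclasses(cls, p279):
    """All classes reachable from cls by following P279 edges."""
    seen, queue = set(), deque([cls])
    while queue:
        for parent in p279.get(queue.popleft(), ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def inferred_classes(item, p31, p279):
    """Direct classes of item plus every superclass of those classes."""
    result = set()
    for cls in p31.get(item, ()):
        result |= {cls} | superclasses(cls, p279)
    return result


with_statement = inferred_classes("Bob's herd", instance_of, subclass_of)
trimmed = {k: v for k, v in subclass_of.items() if k != "group"}
without_statement = inferred_classes("Bob's herd", instance_of, trimmed)
print(sorted(with_statement))
print(sorted(without_statement))
```

With the disputed edge present, « type » shows up among the classes of Bob's herd; without it, only « sheep herd », « herd » and « group » remain.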

instances of instances of instances

Here's a version of User:TomT0m's classes query with some sampled chains included, to make them easier to assess. The query is very big, because P31 is one of the most used properties there is, and we're asking for its instances table to be intersected with its instances table, and the results then again with its instances table. Even though the engine efficiently pipelines that, starts working on the second and third stages while the first stage is still running, and quits once it's found enough instances, that's still a huge request. Any clever ideas as to how to streamline the query very welcome. Query seems to work well enough now. Slightly tweaked to include the count of distinct values of ?x for each ?class Jheald (talk) 02:27, 1 February 2018 (UTC)
SELECT ?n ?x ?xLabel ?c1 ?c1Label ?c2 ?c2Label ?class ?classLabel 
WITH  {
   SELECT ?x ?c1 ?c2 ?class WHERE { 
       ?x wdt:P31 ?c1 .
       ?c1 wdt:P31 ?c2 .
       ?c2 wdt:P31 ?class .
   } LIMIT 40000
} AS %classes
WHERE {
  {
    SELECT (COUNT(DISTINCT (?x)) AS ?n) (MIN(?x) AS ?x) ?class WHERE {
       INCLUDE %classes 
    } GROUP BY ?class 
  }
  INCLUDE %classes .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } .
} ORDER BY DESC(?n) ?x ?c1 ?c2 ?class
-- Jheald (talk) 20:01, 31 January 2018 (UTC)
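For readers who want to follow what the query computes without running it, here is a rough in-memory analogue on made-up data. It follows three instance-of hops, then reports for each top class the number of distinct starting items and one sample item. The item names are invented, the LIMIT is left out, and nothing here reflects how the query service actually evaluates the query.

```python
from collections import defaultdict

# Toy instance-of (P31) statements as (item, class) pairs; all names are made up.
p31 = [
    ("x1", "c1a"), ("x2", "c1a"), ("x3", "c1b"),
    ("c1a", "c2a"), ("c1b", "c2b"),
    ("c2a", "K"), ("c2b", "K"), ("c2b", "M"),
]


def three_level_chains(statements):
    """Yield (x, c1, c2, cls) wherever x P31 c1, c1 P31 c2 and c2 P31 cls."""
    classes_of = defaultdict(set)
    for item, cls in statements:
        classes_of[item].add(cls)
    for x, c1 in statements:
        for c2 in classes_of.get(c1, ()):
            for cls in classes_of.get(c2, ()):
                yield (x, c1, c2, cls)


def summarise(chains):
    """Per top class: (count of distinct x, smallest x as a sample, class)."""
    starts = defaultdict(set)
    for x, _c1, _c2, cls in chains:
        starts[cls].add(x)
    rows = [(len(xs), min(xs), cls) for cls, xs in starts.items()]
    return sorted(rows, key=lambda row: (-row[0], row[2]))


summary = summarise(three_level_chains(p31))
for n, sample, cls in summary:
    print(n, sample, cls)
```

The count column plays the role of ?n in the query: classes with many distinct starting items sort first, which makes the handful of odd cases at the bottom easy to spot.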
Note this is quite similar to the metaclass lists I have added in the Problems area of our Ontology project: Wikidata:WikiProject Ontology/Problems/3rd order metaclasses by subclass for example. While some of these are legitimate, many clearly should not be high-order metaclasses. ArthurPSmith (talk) 20:04, 31 January 2018 (UTC)
Clearly there are some oddities, and I think User:ArthurPSmith is quite right that, as a rule of thumb, one should ask whether subclass of (P279) works instead of instance of (P31) for relations between abstract entities, and prefer it if it does make sense. Abstract items as a rule are quite likely to be classes, because one can very often expand an abstraction into a group of narrower abstractions by adding some further distinguishing characteristic. Jheald (talk) 20:21, 31 January 2018 (UTC)
The example of living thing group (Q16334298) just below shows that this rule of thumb may be responsible for some oddities. I’d prefer to point people to material about the « type–token distinction » as a rule of thumb, because it’s clear which objects are eligible for this rule and which are not. A more robust rule of thumb is: « take an example of a concrete object or event that is an instance of one of the classes. Is it also an instance of the second one? ». If one can’t find a concrete token of a class, then one should ask the ontology project for help and do nothing: it is a rare case :) author  TomT0m / talk page 20:34, 31 January 2018 (UTC)
@TomT0m: The "A instance of (P31) B and B subclass of (P279) C requires that A instance of (P31) C" rule is of course fundamental. But what I was meaning, following up the comment of Arthur, is that if B is an abstract thing, then it very often will be a class (because narrower abstractions within it can be devised), and so probably ought to be a subclass of something. Therefore IMO it probably makes sense to start by seeing whether B subclass of (P279) C makes sense, and then consider different possible entities A (concrete if possible), and ask whether they disprove the proposition. I think trying that probably generally makes a better starting point than starting from B instance of (P31) C as an initial hypothesis, and considering what would be its consequences. Jheald (talk) 21:34, 31 January 2018 (UTC)
Amongst which we can find living thing group (Q16334298), probably because the « group » class is (was) a subclass of type or class. I solved this. We should think about the status of abstract groups like groups of films or TV series, by the way … My point of view is that an abstract artwork (not a painting, for example) is a class of experience. It’s virtual in the Deleuzian sense ( « This virtual is a kind of potentiality that becomes fulfilled in the actual. », see https://en.wikipedia.org/wiki/Virtuality_(philosophy) or http://www.oxfordreference.com/view/10.1093/oi/authority.20110803095349177 ). A movie is a potential actual Saturday night experience; a video game is instantiated every time someone plays it. That makes a video game a subclass of experience ( qualia (Q282250) ? « Conscious experience » ? I don’t know) author  TomT0m / talk page 20:25, 31 January 2018 (UTC)
@TomT0m: I don't think I agree. There are things which are facet of (P1269) a videogame that are a subclass of experience; but the totality of a videogame I do not think is well described that way. (On the other hand, we probably get there by saying that a videogame subclass of (P279) game and game subclass of (P279) experience, which (curiously) I think I would be happier about -- but that might be a slightly different sense of the word experience.)
That meaning of experience is probably not a subclass of qualia -- to me qualia is quite a narrow technical term, and should be reserved to identify more point-in-time sense-perceptions like taste, the quality of what is seen, and so forth. Jheald (talk) 21:49, 31 January 2018 (UTC)
@Jheald: facet of (P1269) is quite vague. I would not be happy to build something on such a vague definition, which as far as I know is not used in any external ontology; that would be sloppy. :) Take a 3D model of a character. I think it also defines a class of experiences: what you experience when you watch the model rendered on your screen. Then it definitely makes sense to say
< Gordon Freeman 3D model > part of (P361) < Half life >
, meaning the experience of seeing Gordon Freeman is part of the experience of Half Life. I like to read Functional Requirements for Bibliographic Records (Q16388) with that in mind :) (there is a relationship between what they call « item », the DVD of the game you bought, and the abstract work, but which one?) The view of « video game » as a subclass of game is also interesting: you can see that the logical rules that govern the game are analogous to the code of the video game (« Rule is Law », Lawrence Lessig). Also, if a video game is a class of concrete experiences, or if all soccer games are instances of « soccer », then the ability to subclass « soccer » with classes like « 1998 World Cup football games » encompasses the need for a metaclass to identify precisely all football games. I’m reading the book « Building Ontologies with Basic Formal Ontology » atm, and I see there that « football » seems to be what Barry Smith and his coauthors call a « universal », whereas « 1998 World Cup football games » is a « defined class »: a very enriching and practical criterion. Having a metaclass for games like « soccer » would be a very practical way to differentiate those kinds of classes. I now see that « work » is probably a subclass of universals. We could label each video game a work, but we could not label « 1998 World Cup football games » a « work » (there seem to be videos and slides about BFO on the web which explain universals in BFO, http://ncorwiki.buffalo.edu/index.php/Tutorial_on_Basic_Formal_Ontology video and slide 1).
OK for qualia, just an attempt. author  TomT0m / talk page 22:46, 31 January 2018 (UTC)
On Jheald’s query: the query times out every time here, unless we set a very low limit such as « 100 » on the subquery. Found something interesting however, see the « concept » subsection. I don’t really think it’s useful to get the full path: once we get the class it becomes easy to find a representative path, and breaking it in the right place potentially breaks many others with the same destination. I can’t find anything about the « with / include » construction in SPARQL; is this a Blazegraph idiom to ease subquery writing? A link? author  TomT0m / talk page 20:52, 31 January 2018 (UTC)
@TomT0m: Yes, the named subquery syntax is an extension to the approved SPARQL standard, a very useful one that has been implemented by multiple separate vendors. Blazegraph's page on it can be found here. Jheald (talk) 21:56, 31 January 2018 (UTC)
@TomT0m: Just re-ran it and it worked for me, finding 37 cases in 11.3 seconds. Make sure you're using the version with the named sub-query and the limit increased to 40,000 -- this is much more efficient than the one I originally posted. Jheald (talk) 21:19, 31 January 2018 (UTC)
Update I added a column to the query to add a count of the instances ?x for each main ?class at the other end of the tree, and it's really helpful - it puts the results into a much sharper focus. Almost all of the instances belong to the first 4 cases of ?class: first-order metaclass (Q24017414) (29,026); concept (Q151885) (5273); music term (Q20202269) (4589); and Wikidata metaclass (Q19361238) (700). There are oddities within these (is a fable (Q693) really an example of a figure of speech (Q182545) ?) but these are top-level classes that do make sense in a list of this kind.
Beyond this, each case accounts for only a handful of instances. Worryingly, a number of these seem to be due to vandalism, e.g. Thirteenth Amendment to the United States Constitution (Q175613) instance of (P31) Indiana (Q1415), Araceli Gilbert (Q4783519) instance of (P31) love (Q316), CPU-Z (Q1024234) instance of (P31) Azerbaijan (Q227)
Now done, for all cultivars of apple. Would be good if somebody knowledgeable could sort out the relationship between pome (Q41274) and fruit of Maloideae (Q145150) Jheald (talk) 10:46, 1 February 2018 (UTC)
hydrogenated water (Q11549076) and deuterated ethanol (Q1101193) both changed to be subclass of, per ArthurPSmith below. Jheald (talk) 12:15, 2 February 2018 (UTC)
Done. Jheald (talk) 12:15, 2 February 2018 (UTC)
Dealt with. Just The Hardbitten Heretic (Q19560964) remaining now, instance of (P31) anonymous (Q4233718) is probably not quite right, but not sure what it should be. Jheald (talk) 12:05, 2 February 2018 (UTC)
  • Similarly for country: tinyurl.com/y8t59m3b. 13 examples.
Dealt with. Mostly by User:Oravrattas (thanks!). But looks to be a favourite recurrent target for stupid edits. Jheald (talk) 12:05, 2 February 2018 (UTC)
(to be continued) Jheald (talk) 10:08, 1 February 2018 (UTC)
@Jheald: Nice work. I agree with all of your suggestions above except I would note that WikiProject Chemistry has been discussing "chemical compound" a bit and working on a new way to model chemical species, which is definitely not settled yet. But deuterated ethanol (Q1101193) definitely should still be a subclass of ethanol (Q153) (for example). ArthurPSmith (talk) 14:57, 1 February 2018 (UTC)

A simpler query to find instances of instances of physical objects, generalizing the idea a bit, gives a few hints: https://query.wikidata.org/#%0Aselect%20%3Fitem%20%7B%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ223557%20.%0A%7Dlimit%20100%20 For example, universities or colleges that are instances of other universities :) a mistake, a faulty import, or a misuse as « part of ». author  TomT0m / talk page 20:29, 1 February 2018 (UTC)
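The query in the link above is percent-encoded in the URL fragment. A small sketch using only the Python standard library recovers the readable text; the variable names are arbitrary.

```python
from urllib.parse import unquote, urlsplit

# The link from the comment above, split across two string literals for width.
url = (
    "https://query.wikidata.org/#%0Aselect%20%3Fitem%20%7B%0A%20%20%3Fitem%20"
    "wdt%3AP31%2Fwdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ223557%20.%0A%7Dlimit%20100%20"
)

# The query text is everything after "#", percent-encoded.
query = unquote(urlsplit(url).fragment).strip()
print(query)
```

The decoded text is a property path: two instance of (P31) hops followed by any number of subclass of (P279) hops ending at Q223557.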

@TomT0m: A bit less simple, but I think this variant gives quite useful perspective: tinyurl.com/y7t8zyrt. Some of these should definitely be cleaned up, and changed from P31s to P279s. Jheald (talk) 20:58, 1 February 2018 (UTC)
@TomT0m: Actually, more useful is probably this tinyurl.com/ybkpsoxb, counting on ?c1 instead of ?class. Jheald (talk) 21:06, 1 February 2018 (UTC)

Ontological status of « Concept » and « death »

Related to the text in the introduction, it seems that « death » is an instance of « concept », and that Death of Caylee Anthony (Q1056362) is an instance of « death ». It makes sense if we consider that the article is about the event of the death of someone; it makes less sense if this is about a case, as the frwiki article is entitled (« Casey Anthony’s case »). Quite a common problem, though no big deal. The statements about death (Q4) are way more interesting. It’s both

  • an instance of « concept » (this seems like an example of « ontological idealism »: we describe concepts that exist in our heads), but in the end this does not seem a really informative statement: if everything exists in our heads, basically anything we can imagine is a concept
  • an instance of property (Q937228) (this one puzzles me) and
  • … an instance of event (just a few moments ago: https://www.wikidata.org/w/index.php?title=Q4&diff=625305390&oldid=623107991 ); it’s the easiest to deal with: if it’s an event, it is clearly a subclass of event, as there are many concrete events of death.
  • a subclass of state, and a subclass of event, and a subclass of process
  • a subclass of end of existence (Q23956356), an item with unclear status where, looking at the history, it seems the usual suspects have been toying around

It’s not really surprising, as there seem to be many definitions of death ( http://www.europsy.org/ceemi/defmort.html in French, for example). Enwiki also introduces the topic: https://en.wikipedia.org/wiki/Death#Problems_of_definition I think that we should at least have items for the moment of death, which is more like the transition between the « living state » and the « dead state », and for the state of a dead body (cadaver (Q48422), for which we have an item). We even have article sections on the decomposition of bodies ( https://en.wikipedia.org/wiki/Decomposition#Animal_decomposition ); death is fascinating, it seems …

Thoughts ? author  TomT0m / talk page 21:23, 31 January 2018 (UTC)

instances of « Term »

advanced emotion (Q16748888), complex emotion, is an instance of term. It is also a part of « theory of emotions » (and « love » is an instance of it). I remember encountering such cases a lot in the beginning of Wikidata; it seems that there were some « idealists » back then. This makes sense if « complex emotion » is considered a conceptual entity that we choose arbitrarily, and if the goal of Wikidata is to describe how we think about it. From a « realist » perspective, « love » is something in the real world and this item is a description of it. The goal of science is to understand these real things by building the best possible descriptions of them. So I’d tend to think the item about love should embrace these descriptions and describe love, and not that our « love » item refers to a concept or term that science uses to model love … This would mean that the only instances of term are the items about words … for example about the lexical entity « love », a task for Wiktionary.

Also interesting, but misguided, is the statement « complex emotion » part of « theory of emotions ». This seems like an idealist perspective as well: this is a term that theorists use to describe emotions, amongst other terms in emotion theory … I’d tend to think the right property is something like « study of », to link the theory or field to the real-world objects it describes.

Thoughts ? author  TomT0m / talk page 21:57, 31 January 2018 (UTC)

There is a similar issue with Latin phrases such as sine loco (Q11254169). Should these be <instance of> term, <language> Latin, or rather the associated meaning? - PKM (talk) 00:07, 1 February 2018 (UTC)

Objects

I think in a number of cases items are set as <subclass of> object (Q488383): technical term in modern philosophy often used in contrast to the term subject when they should be <subclass of> no label (Q17553950): no description. I mostly see this down the hierarchy (clothing items should be no label (Q17553950), no?). But I admit that the higher up the hierarchy we get, the more uncertain I am of my grasp of ontological first principles. - PKM (talk) 00:41, 1 February 2018 (UTC)

Further: Here are comparative hierarchies for "clothing".

  • Wikidata: entity > object (philosophy) > abstract object > concept > result > goods > product > clothing
  • Getty AAT: object > furnishings & equipment (hierarchy name) > costume (hierarchy name) > costume (mode of fashion) > clothing

Frankly, the Wikidata hierarchy makes no sense to me whatsoever (aside from the fact that not all clothing items are products). AAT is useful but not definitive - their hierarchies do not always imply that an item is a subclass of its parent; often the relationship can be better modeled as <facet of>. My personal preference would be something like object > physical object > clothing <facet of> costume. Perhaps there's a class between physical object and clothing, to correspond with AAT's "furnishings & equipment", but I wouldn't know what to call it (furnishing (Q31807746) is different, parallel to clothing in the hierarchy). - PKM (talk) 01:02, 1 February 2018 (UTC)

@PKM: Which item exactly do you mean by « clothing »? There may be many aspects of this notion that are actually covered by items. The « AAT tree » seems topical to me, in the sense that it describes a way to classify topics of interest and their « subtopic » relationships, not real-world objects as instance of (P31) does. To compare with Wikipedia, it looks more like a « portal inclusion » representation, or a « parent category » relationship, than a class hierarchy. I’d personally hardly be interested in this, but if we have something like that one day, it definitely should not be represented with subclass of (P279), which is not intended to be an arbitrary hierarchical property. Unsurprisingly, looking at Getty AAT ( http://www.getty.edu/research/tools/vocabularies/aat/ ), it’s not an ontology but a thesaurus … those are, imho, in the domain of Wiktionary, not of Wikidata. The goal of a thesaurus is to represent the terminology used to describe a domain, not to represent the domain itself. I guess this is an example of « ontological idealism ». The approach of Wikidata is to describe the objects of a domain directly through our items’ statements (aka ontological realism), rather than to describe the terminology experts use to represent the domain. If, however, we one day have a structured Wiktionary with a structured thesaurus (and there are thesauri in Wiktionary), it will be possible to link the descriptions of the terms to the descriptions, held by Wikidata, of the real-world objects they are supposed to represent … author  TomT0m / talk page 15:42, 1 February 2018 (UTC)
We certainly have some items that are <instance of> some subclass of clothing - mostly museum objects - but in general we have a massive class tree of types of clothing (shirt, dress, trousers, kimono) in a structure informed by the AAT and the Europeana Fashion Vocabulary (Q29016777). If structured vocabularies are not appropriate sources for building hierarchies for the objects that make up material culture (and which are heavily represented in Wikipedias), then I can't imagine what is. We're not merely defining terms - clothing items are associated with ceremonial activities, cultures, and time periods; are made of materials using methods and processes; can be named after persons or places, invented by individual designers, and are depicted in works of art. Our current class tree and items set are far from perfect or finished, but we have a WikiProject and an approach. - - PKM (talk) 20:04, 1 February 2018 (UTC)
@PKM: This seems like a good start, but there seem to be problems with your class tree related to my points above. For example, while a lot of items are about types of clothes, the tree also contains items like wasp waist (Q1283782), which is not really about the clothes that allow one to build the style, but about the style itself. From an ontological rather than a nomenclatural perspective, this item should be classified as a « silhouette » type, in a subclass tree different from the clothing class tree. There may be a relationship between the silhouette, or the clothing style, and the types of clothes that someone wears to achieve this style, but this is probably a candidate for a property creation, something like « style allowed by garment », to paraphrase the Wikipedia article. It’s probably a good idea to ping WikiProject Ontology for a review of this approach before starting the work, to ensure the approach is consistent across the whole project … Starting from a thesaurus probably requires a bit of processing to convert it into an ontological perspective that is consistent across Wikidata. And there are whole ontologies dedicated to consistency between ontologies, see upper ontology (Q3882785), so ontologists take that seriously :) Actually this is part of the whole purpose of having ontologies in the first place. This is the reason we can know a class tree is incorrect and know what to do to clean it :) start from well-defined concepts, from what a real taxonomy is, … so I’m happy to have started this discussion and hope we can cooperate in a constructive manner :) author  TomT0m / talk page 20:53, 1 February 2018 (UTC)
@TomT0m:Yes, you are absolutely right about "silhouette" (and princess line (Q10638846) belongs there). There is also cut (Q11626671): style or shape of a garment, and the way its structure hangs on the body which may be the same concept. (And making "cut" a subclass of costume component was probably wrong, but at least it's findable.) I'd love help from the Ontology project on costume and fashion. - PKM (talk) 00:51, 2 February 2018 (UTC)

I've started a conversation about making changes to this ontology here - comments encouraged. - PKM (talk) 20:28, 17 February 2018 (UTC)

Index mineral

Top of the charts tinyurl.com/ybkpsoxb for TomT0m's latest query is index mineral (Q12409135). Now it definitely isn't instance of (P31) geographic region (Q82794). But it would be nice to be able to say that it characterises a geographic region (Q82794). I also think that saying it is part of (P361) mining (Q44497) is a bit questionable. To me the relationship ought to be something more like field of application mining (Q44497).

Any suggestions? Jheald (talk) 21:38, 1 February 2018 (UTC)

@Jheald: A Google Translate rendering of an article on one of the instances is https://translate.google.com/translate?sl=auto&tl=fr&js=y&prev=_t&hl=fr&ie=UTF-8&u=https%3A%2F%2Ffa.wikipedia.org%2Fwiki%2F%25D8%25A7%25D9%2586%25D8%25AF%25DB%258C%25D8%25B3_%25D8%25A8%25D8%25A7%25D8%25A8%25D8%25A7_%25D8%25B1%25D8%25A6%25DB%258C%25D8%25B3&edit-text= .
It’s the Farsi Wikipedia, and from what I can guess the article is about a mineral sample that was used to study the rocks of that area. In that sense, I’d say « index mineral » is a subclass of sample (Q485146). I’d say an « index mineral » instance « is used to study » a geological area. In turn, the mining industry uses geology to see which places are of interest to it. But the « index » might be different when specialized for the mining industry. For the mining industry, maybe we can create a property with domain « mine » and range « mineral ».
From what I understand from the enwiki article, I’d say there are several aspects to this: first, the types of mineral that can be used as index minerals. For example, « chlorite zone » is a subclass of « metamorphic zone » which has the index mineral type « chlorite ». I’d say, to be pedantic, that there is a process in geology called « metamorphic zone mapping ». « Metamorphic zone mapping » implies finding samples of index mineral types in an area.
If « geology » as a science is the study of the earth, then « metamorphic zone mapping » is a part of geology.
< « metamorphic zone mapping » > uses (P2283) < « index mineral » >.
A specific index mineral like no label (Q5760003) is probably an instance of some index mineral type that has been used in the process of mapping its metamorphic zone. The metamorphic zone is part of the Earth’s crust under some geographic area, I guess.
instance of (P31) geographic region (Q82794) is obviously wrong indeed. author  TomT0m / talk page 12:33, 2 February 2018 (UTC)
Interesting material on prospecting and sampling in mining: https://en.wikipedia.org/wiki/Mining_engineering#Mineral_exploration .
mentions :

User:Tobias1984 User:PePeEfe Your name ;) ... Notified participants of WikiProject Geology, to w:Wikipedia_talk:WikiProject_Geology and to w:Wikipedia_talk:WikiProject_Mining. author  TomT0m / talk page 12:55, 2 February 2018 (UTC)

@TomT0m: I would think that, more straightforwardly, index mineral (Q12409135) subclass of (P279) mineral (Q7946) -- the item is about the substance itself, not a portion of it.
More interesting is if/how we should use has quality (P1552), use (P366), used by (P1535), (others?) to express what it is about this class that distinguishes its items from more general minerals. Jheald (talk) 13:14, 2 February 2018 (UTC)
  • « the item is about the substance itself »: Sorry, I don’t understand what you mean. If you look at the Farsi article, the index even has a name, « Daddy boss », so clearly it’s about a rock sample. On the other hand, it’s clearly true that « Daddy boss » is an example of some chemical compound, so an instance of substance. This is consistent with the frwiki definition of chemical substance: « Une substance chimique, ou produit chimique (parfois appelée substance pure), est tout échantillon de matière » (« a chemical substance or […] is any sample of matter … »). But the use for the mine rocks may not be consistent with the definition in the enwiki article.
  • « what it is about this class that distinguished its items from more general minerals »: You mean its instances or its subclasses? This makes a big difference. If you find a sample of an index mineral in a rock, you learn something about this rock’s history, because you know this substance can only be created under certain conditions. For the mineral, this means that we may have some properties like « created from <other mineral> at pressure <pressure value> », but this is true for any mineral. I think there is nothing intrinsic in the quality of being an index mineral. They are interesting depending on the context. I’m inclined to think, after a little thought, that what we are interested in when we define index minerals is the types of minerals one could find in a rock sample, and not the instances of those types themselves. So « index mineral » may be a metaclass. We would then have the specific mineral types as instances of index mineral (Q12409135), rather than index mineral (Q12409135) subclass of (P279) mineral (Q7946). The definition would then be « mineral type whose instances are searched for to determine the degree of metamorphism a rock has experienced », which is definitely consistent with the enwiki article imho. This also does not mess with the subclass tree of minerals by mixing their classification by intrinsic quality with their usage in science.
  • I don’t think has quality (P1552) is relevant here. It’s more a property of « metamorphic zone (Q2690925) » to contain minerals of that type or not. use (P366): minerals definitely do not use anything; a process or a person makes use of something. used by (P1535): maybe. The problem is « to what end »? Then we find a process again, like « metamorphic zone characterization », a part of science. author  TomT0m / talk page 15:51, 2 February 2018 (UTC)

I think index mineral (Q12409135) must be subclass of (P279) mineral (Q7946), part of (P361) metamorphic zone (Q2690925) and studied by (P2579) petrology (Q163082) or something similar, never geographic region (Q82794) or mining (Q44497). --PePeEfe (talk) 16:47, 2 February 2018 (UTC)

User:PePeEfe and some others have this right, it is a subclass of minerals. Using index minerals is a geological technique, but that would be something more like "index mineralogy" and not the same as index mineral. It is not connected to geographic region, or mines or mining. Graeme Bartlett (talk) 01:59, 3 February 2018 (UTC)
@Graeme Bartlett: Correct me if I’m wrong, but being an index mineral is not an intrinsic property of the mineral, but rather a feature of « index mineralogy ». In that sense, it seems that if it’s a subclass of mineral, the actual instances of this class are of no interest by themselves. Imagine you are practicing index mineralogy. Knowing you found an instance of index mineral, as a subclass of mineral, does not give you any information. Rather, what you are interested in is « which index mineral did you find? », meaning « what is the class of the mineral you found, among those we call « index mineral »? ». In that sense, I think it’s more useful to consider « index mineral » as a metaclass, a class of classes (of mineral instances). « Which index mineral did you find? » can be reformulated as « which instance of (the class) index mineral did you find? ». The answer is then « I found biotite ». author  TomT0m / talk page 11:26, 3 February 2018 (UTC)
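
The metaclass reading argued for above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statements are a hypothetical in-memory data set, labels stand in for Wikidata items, and « sample #1 » is an invented rock sample.

```python
# Hypothetical statements; labels stand in for Wikidata items.
INSTANCE_OF = {
    "biotite": {"index mineral"},    # mineral types are instances of the metaclass
    "chlorite": {"index mineral"},
    "sample #1": {"biotite"},        # an invented concrete sample
}
SUBCLASS_OF = {
    "biotite": {"mineral"},
    "chlorite": {"mineral"},
}

def superclasses(item):
    """Return every class reachable from item through subclass-of links."""
    seen, todo = set(), [item]
    while todo:
        for parent in SUBCLASS_OF.get(todo.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return seen

def is_instance(item, cls):
    """True if item is an instance of cls, directly or via a superclass."""
    return any(c == cls or cls in superclasses(c)
               for c in INSTANCE_OF.get(item, ()))

def which_index_mineral(sample):
    """Answer « which index mineral did you find? » for a sample."""
    return sorted(c for c in INSTANCE_OF.get(sample, ())
                  if "index mineral" in INSTANCE_OF.get(c, ()))
```

In this sketch the sample is a mineral (through biotite) but is not itself an index mineral; only its type is, which is the distinction the metaclass model is meant to capture.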

Both instance and subclass of the same item

This one is really simple :

select ?item where {
  ?item wdt:P31 ?class;
        wdt:P279 ?class .
}


… and has a scary number of hits: 856741. It was inspired by looking at the results of the last one (thanks, Jheald).

A query to find the most problematic maybeclasses is more reassuring:

select (count(?item) as ?num) ?maybeclass where {
  ?item wdt:P31 ?maybeclass;
        wdt:P279 ?maybeclass .
 # SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} group by ?maybeclass 
  order by desc(?num)


Most of the problem concerns only a few concepts like protein, genes and color.

num     maybeclass
405628  protein (Q8054)
374042  gene (Q7187)
38422   non-coding RNA (Q427087)
35060   pseudogene (Q277338)
1067    transfer RNA (Q201448)
560     small nucleolar RNA (Q284416)
215     single-day race (Q2912397)
97      fish dish (Q18679149)
88      ribosomal RNA (Q215980)
83      small nuclear RNA (Q284578)

author  TomT0m / talk page 16:49, 2 February 2018 (UTC)
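
For readers without access to the query service, the logic of the two queries above can be mimicked on a small in-memory set of triples. This is a rough sketch: the item identifiers starting with `Q_` are hypothetical placeholders, and only Q7187 (gene) and Q8054 (protein) come from the table above.

```python
from collections import Counter

# Hypothetical (item, property, value) triples standing in for Wikidata statements.
TRIPLES = [
    ("Q_geneA", "P31", "Q7187"),
    ("Q_geneA", "P279", "Q7187"),
    ("Q_geneB", "P31", "Q7187"),
    ("Q_geneB", "P279", "Q7187"),
    ("Q_protA", "P31", "Q8054"),
    ("Q_protA", "P279", "Q8054"),
    ("Q_red", "P31", "Q_color"),   # instance only, so it is not reported
]

def both_instance_and_subclass(triples):
    """Return (item, class) pairs where item is both P31 and P279 of class."""
    p31 = {(s, o) for s, p, o in triples if p == "P31"}
    p279 = {(s, o) for s, p, o in triples if p == "P279"}
    return p31 & p279

def count_by_class(triples):
    """Count offending items per class, most frequent first (like GROUP BY / ORDER BY)."""
    counts = Counter(cls for _, cls in both_instance_and_subclass(triples))
    return counts.most_common()
```

The set intersection plays the role of the shared `?class` variable in the SPARQL pattern, and `Counter.most_common()` that of the grouping and descending sort.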

@TomT0m: Strangely enough, User:GoranSM asked about almost exactly this in the Wikidata group on Facebook just last night: [2].
I don't think anyone had any definitive reply, but for genes and proteins one possibility is that one might expect families of related similar genes and proteins across evolutionarily similar (or even not-so-similar) species, and maybe that is part of what is going on here. If one only had one example so far, one might be combining the item for the gene and for the gene family. Also there's the question of how one represents a gene that may have variants, even within the population of a single species. Jheald (talk) 17:17, 2 February 2018 (UTC)
It’s weird that the problem is not solved yet; there are concepts like allele (Q80726) which should be of some help, plus existing gene ontologies. Let’s assume an « allele » is the class of all DNA fragments with the same genetic sequence. A gene is then the superclass of all its alleles. Seems quite simple to me.
« Gene » is then the metaclass of all those superclasses. « Allele » is the metaclass of all the classes of DNA fragments defined by a single sequence. author  TomT0m / talk page 17:40, 2 February 2018 (UTC)
  • @TomT0m, Jheald: Note that Wikidata:WikiProject Ontology/Problems/instance and subclass of same class has been running via Listeria for quite a while - I've been working on some of the simpler items on there for a few months, but there are a lot of issues to sort out. The problem with genes, proteins etc. is that the ProteinBoxBot folks have been inserting both instance and subclass relations automatically for everything they add, I don't think there's been much discussion there about which is better, and there are probably some things that depend on one or the other relation that we wouldn't want to break without some community discussion of how to proceed. ArthurPSmith (talk) 17:52, 2 February 2018 (UTC)
    • @ArthurPSmith: Community coordination is the key to not making a mess. To be good, decisions should be based on ontological arguments, as I try to do. Otherwise the result may be like flipping a coin, which is a mess if we flip the coin several times. Consequently, any comment on my proposed model ;) ? author  TomT0m / talk page 18:06, 2 February 2018 (UTC)
      • Mmm, I forgot this discussion before starting my adventure below … I’d suggest replacing P31 in their queries with P279*, as the only statements I remove are those for which there is already a subclass path to « biological process ». Or using Module:PropertyPath to do the same substitution in infoboxes if needed; the function « path.match » can do the trick (but the module and its dependencies have to be copied to enwiki). author  TomT0m / talk page 21:18, 3 February 2018 (UTC)
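
The removal criterion described in the last comment (drop an instance-of claim only when a subclass path to the same target already exists) can be sketched as follows. The data and labels are hypothetical, and `reaches` follows one or more subclass-of links, so it behaves like a `P279+` path rather than the zero-or-more `P279*`.

```python
# Hypothetical subclass-of and instance-of statements; labels stand in for items.
SUBCLASS_OF = {
    "tissue development": {"anatomical structure development"},
    "anatomical structure development": {"developmental process"},
    "developmental process": {"biological process"},
}
INSTANCE_OF = {
    "tissue development": {"biological process"},
    "Einstein's digestion": {"biological process"},
}

def reaches(item, target):
    """True if target is reachable from item via one or more subclass-of links."""
    seen, todo = set(), [item]
    while todo:
        for parent in SUBCLASS_OF.get(todo.pop(), ()):
            if parent == target:
                return True
            if parent not in seen:   # guards against cycles in the class tree
                seen.add(parent)
                todo.append(parent)
    return False

def redundant_instance_claims(target):
    """Items claimed as instances of target that already subclass it."""
    return sorted(item for item, classes in INSTANCE_OF.items()
                  if target in classes and reaches(item, target))
```

Only the class that already has a subclass path is flagged; the invented concrete process keeps its instance-of claim.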

Colors

  • Let’s call « red » the subclass of light with a red color spectrum, or a family of spectra. Let’s do the same for blue, white and so on. « Red » is then a subclass of « light », as are blue, white and so on.
  • Let’s call « color » the metaclass for classes of light defined by a certain spectrum. « Red », « blue » and so on are all instances of « color ». Red may be subclassed; this means the family of spectra of the subclass is a restriction of the family of the more general « red ».

Let’s define the « color » property as « the property that takes its values from the instances of the « color » metaclass ». An object is of color « red » if, when lit, it reflects light whose spectrum conforms to the spectra defined for « red », or, if it is an active source, if it emits such light directly.

Any questions or problems with modelling stuff that way ? author  TomT0m / talk page 18:01, 2 February 2018 (UTC)

All three of us participated in Wikidata:Requests for comment/Are colors instance-of or subclass-of color but that ended up inconclusive. Treating "color" as a metaclass with individual colors as instances of "color", while being subclasses of one another sounds like a good solution. However, how do you handle the colors "white" and "black" with your approach? Or even "brown" or "grey"? ArthurPSmith (talk) 18:53, 2 February 2018 (UTC)
@ArthurPSmith: Since the RfC process did not reach a conclusion, I guess we have to use the best idea we have. I don’t think « white » is a problem. The scientific approach to light decomposition is spectroscopy (Q483666). Call the result a « spectrum ». While other colors will have just some rays in the spectrum, white will have rays everywhere or so (the « full » spectrum). Spectral analysis allows us to define a class of spectra that maps to the color white. The spectra for « gray » are probably similar to those for white, but with less intensity. Black maps to the empty spectrum. author  TomT0m / talk page 12:36, 3 February 2018 (UTC)
So I think you are saying that "black" would be a subclass of every color, and every color would subclass "white" - including "gray" (and in general darker shades would be a subclass of lighter shades?) ArthurPSmith (talk) 16:30, 5 February 2018 (UTC)
@ArthurPSmith: No, as the spectrum of a red does not fulfil the criteria for being a white: it lacks a lot of other frequencies (at least some primary color). Say we represent colors as RGB: « white » will be those colors whose 3 components are all above a high threshold. A red would not meet this criterion, as it would be low on at least one component. An instance of a subclass should meet all the criteria of the superclass, so a red is not a white. It would be more like a « part of » relationship between white and the other colors, as you can add a red (light) to some other color (light) to get white (additive color (Q353267)). author  TomT0m / talk page 16:44, 5 February 2018 (UTC)
@TomT0m: Ok I think I see what you're getting at. Each specific color (instance of "color", like "red", "white", etc) is associated with a collection of possible RGB values (or some reasonable other mechanism for specifying color). Subsets of one of those collections correspond to more refined color labels, which are subclasses of the more general ones. That is, a color item here does not generally correspond to a single specific RGB value (r, g, b) but some kind of region in rgb-space that "looks like" the color; say for "red" maybe something like {r,g,b|r > 2(g+b)}? But perhaps not so precisely specified, with somewhat fuzzy edges...? Is there anything that you think would qualify as an "instance of" red then, though? A specific RGB value? Or the color of a particular pixel in an image, would that be an instance of red? ArthurPSmith (talk) 19:00, 5 February 2018 (UTC)
@ArthurPSmith: Color instances would be, as said previously, actual light rays. The actual instance of a color is the light emitted by the red pixel on my screen. This pixel has the property of being red, as it emits light of that kind. So there should be only a few instances of a color in Wikidata; something close would be cosmic microwave background radiation (Q15605) (I don’t know if there was anything visible in it :) ) or, closer, solar radiation (Q17996169). Note that as the RGB space is not the only space in which we can describe colors, there is not just one possible characterization but several, more or less equivalent. And we don’t actually have to provide a precise characterization; we should just make sure that one exists. author  TomT0m / talk page 21:12, 5 February 2018 (UTC)
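
The « region in RGB space » reading discussed in this thread can be made concrete with a small sketch. Everything numeric here is an assumption for illustration: the red criterion r > 2(g+b) is the one floated above, while the white threshold (230), the dark-red cutoff (160) and the sampling grid are invented.

```python
from itertools import product

def is_red(rgb):
    """Criterion suggested in the thread: r > 2 * (g + b)."""
    r, g, b = rgb
    return r > 2 * (g + b)

def is_white(rgb, threshold=230):
    """All three components above a high (assumed) threshold."""
    return all(c >= threshold for c in rgb)

def is_dark_red(rgb):
    """A hypothetical refinement of red: red with a low red component."""
    return is_red(rgb) and rgb[0] < 160

def is_subclass(narrow, broad, samples):
    """narrow is a subclass of broad if every sample in narrow is also in broad."""
    return all(broad(s) for s in samples if narrow(s))

# Coarse grid over RGB space; includes (255, 0, 0) and (255, 255, 255).
SAMPLES = list(product(range(0, 256, 15), repeat=3))
```

On this grid, dark red comes out as a subclass of red, while red is not a subclass of white and black, (0, 0, 0), is not a red, matching the answer given above about white and black.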

ProteinBoxBot and biological processes

@ProteinBoxBot, Andrawaag, Sebotic, Gstupp: Correct me if I’m wrong, but edits of this kind ( https://www.wikidata.org/w/index.php?title=Q2355306&diff=615506970&oldid=610494690 ), especially the one with instance of (P31), are incorrect. They are sourced to « the gene ontology », but a page such as https://www.ebi.ac.uk/ols/ontologies/go/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FGO_0009888 does not show any « type » statement, only « is a » statements, which are equivalent to « subclass of ». The namespace « biological process » is not an instance-of statement either. I’m reverting those statements for misuse of the source and because they cause a lot of classes to be instances of one of their own superclasses. This is incorrect. 20:13, 3 February 2018 (UTC)  – The preceding unsigned comment was added by TomT0m (talk • contribs).

Hi @TomT0m:, first off, sorry for not responding to this earlier, but I don't think the Mention template is working, as none of us received any notifications. Please use Ping instead. In regards to using "instance of" to indicate the semantic type, or classes being instance of one of their superclasses: This has been discussed many times and as far as I can tell (?), no consensus has been reached. We use the instance of statements to help with queries as they make them simpler and faster. We also use them in applications built on top of Wikidata such as Wikigenomes.org and Wikidatascape. As these are referenced statements, we'd appreciate if you didn't delete them while ontology discussions are still ongoing.
Regarding namespace vs. type: The namespace in this case is the ontology that these terms are defined in. The GO documentation ( http://geneontology.org/page/ontology-structure ) explains how there are three GO ontologies that are "is a"-disjoint, meaning that no is_a relations operate between terms from the different ontologies. And so all terms that are subclass of biological process are a biological process, and we use instance of to capture this.
On using instance of a first-order metaclass (Q24017414) instead of the root node to indicate type (mentioned here, and probably elsewhere), we are definitely open to doing this if this is what the community decides is the best practice. Gstupp (talk) 02:12, 7 February 2018 (UTC)
@Gstupp: This is not entirely a question of community consensus, as instance of (P31) and subclass of (P279) have analogous properties in external ontologies and come with definitions of their intended use; Help:BMP, for example, is one of the oldest help pages. True, our projects work by consensus, but in the case of Wikidata I think we have a commitment to definitions, so that items are not ambiguous for example, and the community should be careful with them so as not to make a mess. In the case of proteins, I think it’s not harmful at all to use a newly created biological process type (Q47989961).
Maybe, if we take the metaclass path, to make it useful we could try to restrict its use to some kind of « universal » notion (Universal (Q6497530), see the Wikipedia article(s)). For example, « Einstein’s digestion processes » is a subclass of « digestion process » but is not a universal, as it refers to a real-world instance. « Digestion process » as a whole, on the other hand, is a subclass of process that may be composed of many subprocesses, but that qualifies as a « universal class ». But the class multicellular organismal process (Q22299433), higher in the class tree, may not really be useful, as it is subclassed by many very different processes that qualify as universals themselves. Maybe only the « universal leaves » of the « biological process » class should be classified as instances of biological process type (Q47989961)?
My personal impression is that there is nothing to lose in making things clearer and relying on practical definitions. Community consensus has its limits here, as the « local consensus » of one wikiproject may contradict the consensus of a different one (here the ontology wikiproject), and the real consensus is reached when both wikiprojects agree. author  TomT0m / talk page 11:54, 7 February 2018 (UTC)
@TomT0m: There are two issues here. First, there is this mass deletion of well-referenced and well thought-through statements without ample notice. Our processes are well thought through also because we build projects and use cases on top of Wikidata. To give an example, http://www.wikigenomes.org/ fully relies on similar structured and referenced data in Wikidata. We also maintain a series of example queries actually being used by the community. So even if that data is "a mess" as you call it, cleaning it up indiscriminately can actually break working applications that rely on Wikidata. So I would kindly request you to carefully consider this when you are changing underlying well-referenced data models in Wikidata.
The second issue is indeed the ambiguity in the ontological space. It is amusing to see that, for example, a Wikidata item can have both an rdf:type property and a P31 property.
wd:Q42 a wikibase:Item ;
wdt:P31 wd:Q5 .
Similarly, subclass of (P279) has an alias "is also a", which suggests some level of synonymy with instance of (P31). This is a simple example that demonstrates that there remain some systemic issues in how ontological relationships are modelled in Wikidata. With this in mind, it is a bit unfair to be so strict on well-intended efforts like ours. To move forward, I would like to suggest that we make/keep it a community process. This means that we reinstate the statements that have been deleted. We need to figure out if, and if so where, changing the data model breaks an application or use case, while we start a remodelling discussion/effort to address the ontological issues they allegedly create.
BTW I also didn't receive a notice when you mentioned me. Please use ping if I need to look at some issue. --Andrawaag (talk) 17:24, 7 February 2018 (UTC)
@Andrawaag: Per Template:Reply to, both ping and mention are redirects to it, so I don’t think this is the issue. I also had a notification problem with this thread earlier with the ping template: I was pinged but was not notified. It may be a notification bug. That said, I had second thoughts after removing the first claims, and did not continue the job. I’ve seen that the bot has been rerun and is re-adding the statements, so it’s just a matter of time. This is in no way a satisfying solution. I don’t really understand how the whole model came to be based on the assumption that instance of (P31) is equivalent to subclass of (P279) when we have had Help:BMP for such a long time, or why people don’t wonder why there are two properties if they are essentially the same. Note that I still think that the sources, although you did not realize it and acted in good faith, are not correctly used, and that the statements are therefore not correct, per my previous argument. See for example Help:BMP or User:TomT0m/Classification (this is an essay of mine, but it is solid: in the process of writing it, I wrote the documented enwiki article on the notion of metaclass in the semantic web and knowledge representation). I hope my arguments about the sources not supporting the statements have convinced you, however, so we can agree on a better solution (one has been proposed just above). What do you think about this?
On
wd:Q42 a wikibase:Item ;
wdt:P31 wd:Q5 .
That’s because the Wikibase data model is represented in RDF, but the Wikibase data model essentially represents items, which are collections of statements about the subject of the item. If a statement (say, a P31 statement) has a meaning at the level of the Wikibase data model, it is « according to [the sources of the statement], the subject of this item is an instance of [the subject of the value item] ». That way, a statement can for example be deprecated: the statement is not believed to be true anymore, but the source still says it (see Denny’s explanation on this: https://blog.wikimedia.de/2013/06/04/on-truths-and-lies/ ). It also makes inconsistencies easy to deal with: if you have two dates of birth for a person, this is a problem for logical reasoning in RDF (Principle of explosion (Q60190)); if you just have statements saying that someone says each is true, no inconsistency follows. It implies, however, that if we want to do reasoning with the data, we must not forget the references: instead of the deduction « the person died at 30 » from their date of death and date of birth, the deduction becomes « according to [the source for birthdate 1 and the source for the date of death], they died at thirty ». This may be complemented with other hypotheses according to other sources, or combinations of non-deprecated sources. This does not imply that we as a community can’t try to make sense of the content of the collection of statements that Wikidata is. In that sense, P31 only means what we as a community decide; it was defined as an analog of « rdf:type » only because someone thought we needed one, but there is no formal link between rdf:type and P31. And there won’t be any; this ensures that we don’t mix up the « collection of statements » level that the Wikibase data model defines and the level of the meaning of the statements. Of course the Eiffel Tower is not a collection of statements, and the subject of the statements we have on the Eiffel Tower is the Eiffel Tower, not its item.
We decided we’d use P31 in Wikidata the same way rdf:type is used in RDF, the same way we decided that our « date of birth » property would carry statements about the date of birth of a person. This is a problem, however, if the reference does not actually support the statement. author  TomT0m / talk page 18:45, 7 February 2018 (UTC)
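
To make the "deduction with references" idea concrete, here is a minimal sketch. It is an illustration only: the `Claim` type, the `age_hypotheses` function and the sample data are all hypothetical and are not part of Wikibase. Each derived conclusion keeps the sources it depends on, so two conflicting birth dates give two hypotheses rather than a contradiction.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Claim:
    """A sourced statement: 'according to `source`, the year is `year`'."""
    year: int
    source: str
    deprecated: bool = False


def age_hypotheses(births, deaths):
    """Derive one age hypothesis per pair of non-deprecated claims.

    Every result carries the two sources it was derived from, so the
    conclusion is always 'according to X and Y', never an absolute fact.
    """
    results = []
    for birth, death in product(births, deaths):
        if birth.deprecated or death.deprecated:
            continue  # deprecated claims are kept in the data but not reasoned on
        results.append((death.year - birth.year, (birth.source, death.source)))
    return results


# Hypothetical sample data: two conflicting birth years and one deprecated claim.
births = [
    Claim(1900, "source A"),
    Claim(1902, "source B"),
    Claim(1899, "source D", deprecated=True),
]
deaths = [Claim(1930, "source C")]

hypotheses = age_hypotheses(births, deaths)
for age, (birth_source, death_source) in hypotheses:
    print(f"according to {birth_source} and {death_source}, "
          f"the person died at about {age}")
```

The year difference is only an approximate age; a real implementation would work with full dates and their precision.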

TomT0m: Our use case here is only to simplify queries and uses of items in external applications by having a statement on each item indicating the item's type, without having to perform a query through P279*. We are happy to do this any way the community decides, whether it is with instance of the root class, or a first-order metaclass, or any other way, as long as it is consistent and widely agreed-upon. Please let us know if this is the case. Thanks Gstupp (talk) 20:35, 7 February 2018 (UTC)

@Andrawaag, Gstupp: my message on WikiProject Molecular biology does not seem to attract attention. I don’t know what that means: is this project actually inactive? That would mean the consensus can be decided by the three of us, but I don’t think that’s the case. Where do you think is the right place to start a discussion? And if you have no idea, can we consider this to be a tacit consensus? author  TomT0m / talk page 13:53, 10 February 2018 (UTC)
@Gstupp, TomT0m: I support the first-order metaclass solution, as I mentioned here. Wikidata doesn't have explicit semantic types, as in Unified Medical Language System (Q455338), but we can represent that relation by using instance of (P31) with some class or metaclass. This is very useful not only to simplify queries but also for humans to quickly understand what an item is. I don't like to trace the subclass of (P279)* chain one by one. --Okkn (talk) 06:33, 12 February 2018 (UTC)
@Okkn, TomT0m: I just want to reiterate that this decision affects many (all?) aspects of how items are represented in Wikidata and it should be decided on a system-wide basis instead of being applied to specific types of items. We want to avoid flipping things back and forth and having inconsistencies. As far as I can tell, this issue has not been resolved and is handled differently across Wikidata. Is red an instance of or subclass of color? Is British Sign Language (Q33000) a language or "language class"? Is atheism (Q7066) a religion or religion class? Is human p53 a protein or a protein class? Right now, we manage a large quantity of many types of items (diseases, genes, drugs, chemicals, RNAs, sequence variants, etc.) and we want to keep them structured in a consistent manner. I don't think that this decision should be made with only "biological processes" in mind or only within WikiProject Molecular biology; it should take into account everything that would be affected. Gstupp (talk) 22:54, 12 February 2018 (UTC)
@Gstupp, TomT0m: The problem is that the usages of some root items (eg. disease, color, language, etc...) are still ambiguous and controversial ("instance of (P31) X" vs "subclass of (P279) X"). However, "what can be a value of instance of (P31) in one item" (class?) and "what can be a value of subclass of (P279) in the same item" (metaclass?) are clearly and logically different, so once we separate these two distinct concepts in the root concepts, most of the troubles may not happen. I know ProteinBoxBot team is maintaining a huge number of items, and especially because of that, I think it is important to show a model. If you adopt metaclass solution, is there any inconvenient point? --Okkn (talk) 23:35, 12 February 2018 (UTC)
@Gstupp, Okkn: Some cases are more difficult than others, so I can hardly support the idea that we must only make project-wide steps that settle every case, or we’ll stay stuck in the current situation forever. However, in this case I’m pretty confident that we’re on a topic in which the type–token distinction (Q175928) (please read the enwiki article if you have not done so yet) is pretty easy to apply. We clearly have a real-world object level and a type of real-world object (/events/processes) level. « Proteins » are a type of real-world objects. Such types can be subclasses of each other if relevant, and we have a clear criterion to decide: if every real-world object of some « real world object type » is also a real-world object of the other one, we’re in such a case. However, making sense of such a « real world object type » being an instance of another « real world object type » is a problem, as according to the type–token distinction (Q175928) only real-world objects are instances of a « real world object type ». The confusion can only increase if we have chains of « instance of » with « real world object type ». Note that the databases that ProteinBoxBot imports from do not do that. The metaclass solution solves this by adding a « real world object type type » level. It removes a problem by adding clarity, and nobody has been able to raise a problem with it. We’ll be confident that if we find an instance of the class « protein » or « biological process », it will actually refer to a real-world molecule or a real-world process, and never to a type of protein or process. However unlikely it is that we have an article about a molecule instance, we’ll appreciate never having to ask ourselves whether that is the case; we have a general principle to decide independently of this question. While there is a case for doing things this way, nobody seems to have raised a significant argument for keeping things the way they are. author  TomT0m / talk page 11:06, 13 February 2018 (UTC)
@TomT0m, Gstupp: Not only protein (Q8054), but also other chemical substances may have some problems. What do you think of , , , , and . What's the difference between and them? --Okkn (talk) 14:52, 13 February 2018 (UTC)
@Okkn: See Wikidata:WikiProject Chemistry/Proposal:Models and the talk page for some discussion of these issues. Many of the current relationships of this sort in chemistry are wrong, as you suggest. However note chemical element (Q11344) is defined in wikidata as a first-order metaclass (Q24017414) so it is an example of the metaclass approach. ArthurPSmith (talk) 15:43, 13 February 2018 (UTC)
@TomT0m, Okkn: Just to answer two questions above (what is the inconvenient point of changing, and what's the argument for keeping things the way they are), I think Greg is raising a few points here. First, we previously discussed this with the community and the current solution represented the consensus at the time. Second, we've invested time and effort into implementing that model, both in terms of our bots and our downstream applications. Third (and most importantly), we are happy to make all the changes necessary, but only if there is reasonable Wikidata-wide consensus that there is a better solution. Without evidence of that consensus, we run the risk of someone else arguing for yet another solution in six months, which would result in us spending even more time modifying our bots and applications instead of pursuing our team's broader mission (loading high quality biomedical datasets, demonstrating the integrative queries that Wikidata enables, promoting its use in the biomedical community, attracting contributions from domain experts through applications built on Wikidata, building bot automation infrastructure, exploring data modeling and detection of constraint violations using ShEx, etc.). I hope that clarifies our perspective here... Best, Andrew Su (talk) 17:16, 20 February 2018 (UTC)
@Andrew Su, Okkn, Gstupp: Requiring a whole-community consensus to fix a bug in the model seems way out of proportion to the way the initial consensus was achieved. Do you have a link to the initial discussions? Actually, a real problem in Wikidata is the risk of fossilization of bad initial design decisions for data that becomes widely used by external tools, which makes bugs in the model really hard to fix. But I don’t think this should stop us from fixing design bugs, or lead us to make sources lie by affirming they support claims they actually don’t! This is a serious bug. Several users say they are OK with making the change we suggest here, which is quite simple. I guess you are in a much better position than I am to attract the attention of the data users, because I just don’t know who they are, nor whether they would really be inconvenienced by the change (substituting a couple of QIDs in a couple of places does not seem like a revolution). RfCs in Wikidata usually fail to attract attention, and discussions on the molecular biology project did not exactly attract a crowd. I fail to see a way out here when the only blocker to making the change seems to be your answer, and it’s not even clear what you mean. I don’t really know what question could be put to the whole community for you to be satisfied. author  TomT0m / talk page 09:43, 21 February 2018 (UTC)

Series ordinal - P1545

When you read what the "series ordinal" is about, it is "position of an item in its parent series, generally to be used as a qualifier". When you consider its use in combination with USA governors, you will find that there is no obvious position, and that it is often arbitrary. Particularly when you consider the number range for the governor of South Carolina, it is not only about governors and it is not only about USA governors. Consequently this property is abused.

The reason why I object is that I have been harassed because I do not value this property as significant. So there are a few scenarios: the first is to be more relaxed and talk/be a lot less aggressive. The second is that another property is considered, one that acknowledges the arbitrariness of what the number is used for. Thanks, GerardM (talk) 18:48, 6 February 2018 (UTC)

Maybe you should talk with others instead of talking about others. Sjoerd de Bruin (talk) 18:53, 6 February 2018 (UTC)
Really? I do not mind to talk with people when there is a reasonable tone and a reasonable request. This has been absent in this latest altercation. When the facts are considered it is about a property that is obviously abused. All the more reason to consider an alternative. Thanks, GerardM (talk) 19:39, 6 February 2018 (UTC)
I'm not sure I understand what point you're trying to make. The property seems pretty clear to me. The first person to hold the position is P1545: 1, the second 2, etc. Can you clarify which situations you find arbitrary, or abuse of the qualifier? --Yair rand (talk) 19:59, 6 February 2018 (UTC)
I don't know if this is what Gerard is talking about, but I notice P1545 is used as a qualifier on positions in two different ways, one as you note is if the position is a unique position, to indicate the order of this person in the sequence of holders of the position. But the other meaning is when the position is part of a numbered list (such as members of a state's delegation to the US house of representatives) - so the district 3 representative would have a P1545 = 3 qualifier. Those two uses should probably be separated somehow in future, maybe another property is needed. ArthurPSmith (talk) 20:27, 6 February 2018 (UTC)
This appears to be a continuation of the discussion at Wikidata:Administrators' noticeboard#GerardM (talk • contribs • logs). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:01, 6 February 2018 (UTC)
  • Maybe you could point out a better method to record that Q5725737 is presented as the 117th governor [3]?
    --- Jura 20:25, 6 February 2018 (UTC)
Why should I? The property is abused; it is not a straightforward sequence. I do not care for anyone to be a specific number when there is no method to that madness. Thanks, GerardM (talk) 11:53, 11 February 2018 (UTC)
  • The use of the qualifier is consistent with the use in the field and relevant constraints. If there were different ways to use such ordinals, we could store them as separate statements, but apparently this isn't even an issue here.
    --- Jura 04:51, 16 February 2018 (UTC)
I do not know the specifics of this argument, but I do know that assigning a number is contentious, unless there is a canonical numbering system by that political office at their official website. Some historians do not count a second non-contiguous term as a new holder of the office, and some do. So we have John Smith as the 40th holder of the office, and another historian counting John Smith as the 40th and the 45th holder of the office. Some numbering systems count interim holders of the office, and some systems do not count them. Is this an argument like that? I create lists of mayors and run into this problem all the time. I am currently working with the research librarian for Long Branch, New Jersey to create a canonical list for them. --RAN (talk) 02:58, 19 February 2018 (UTC)
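
The two counting conventions described above can be sketched as follows. This is only an illustration with made-up names; `ordinals` is a hypothetical helper, and the list of terms is assumed to contain one entry per continuous tenure in chronological order.

```python
def ordinals(terms, count_returning_holder_again):
    """Return (name, ordinal) for each term under one counting convention.

    terms: holder names in chronological order, one entry per continuous tenure.
    count_returning_holder_again: if True, a second non-contiguous term gets
    a new number; if False, the holder keeps the number of the first term.
    """
    first_number = {}
    result = []
    counter = 0
    for name in terms:
        if name in first_number and not count_returning_holder_again:
            result.append((name, first_number[name]))
            continue
        counter += 1
        first_number.setdefault(name, counter)
        result.append((name, counter))
    return result


# Hypothetical office with one holder serving two non-contiguous terms.
terms = ["Adams", "Smith", "Baker", "Smith", "Clark"]

by_term = ordinals(terms, count_returning_holder_again=True)
by_person = ordinals(terms, count_returning_holder_again=False)
print(by_term)
print(by_person)
```

Under the first convention the last holder is number 5, under the second number 4, which is why a series ordinal (P1545) value only makes sense together with a reference stating which convention the source uses.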

Please can we enable FormWizard on Wikidata?

Hi all

I would really like to use FormWizard on Wikidata to create a very easy to use standardised form for creating entries on the Wikidata:Data Import Hub. Please can it be enabled? Unsure if I make a request here or on Phabricator.

Thanks

--John Cummings (talk) 20:04, 6 February 2018 (UTC)

  • Support. Seems useful; and harmless. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:15, 6 February 2018 (UTC)
  • Question: I wonder if this gadget is still getting maintained. The Phabricator project isn't showing much activity and the project page hasn't been updated over almost a year. Sjoerd de Bruin (talk) 21:26, 6 February 2018 (UTC)
    • @Sjoerddebruin: A WMF staff member is the maintainer and Wikimedia Foundation are using FormWizard on the new Wikimedia Resource Center (when you want to add a new resource) along with several pages on the WMF grants space, it seems likely that it will be fixed if it breaks. --John Cummings (talk) 22:21, 6 February 2018 (UTC)
  • Support Sounds useful. Richard Nevell (talk) 22:48, 6 February 2018 (UTC)
  • Support - NavinoEvans (talk) 14:35, 7 February 2018 (UTC)
  • Support - makes sense to do this. Battleofalma (talk) 15:05, 7 February 2018 (UTC)
  • Support - seems preferable over your phab proposal for the same. --- Jura 19:55, 7 February 2018 (UTC)

✓ Done, configuration files can be created as subpages of Wikidata:FormWizard/Config. Sjoerd de Bruin (talk) 09:59, 14 February 2018 (UTC)

Thanks so much @Sjoerddebruin: :), John Cummings (talk) 11:25, 19 February 2018 (UTC)

Ships: Copy data from Commons to Wikidata

Images of 17 534 ships are indexed in Wikimedia Commons by the unique IMO number, but only 3976 IMO numbers are set in Wikidata. To manually create an entry for each ship not existing in Wikidata would take several months, but would be easy to do with QuickStatements. However, I don't want to create entries for ships that already have Wikidata entries (without the IMO parameter already set). The questions are then:

  • How can I find out which of these Commons categories contain images that are used in Wikipedia articles (and then probably will have a Wikidata entry without the IMO parameter set)?
  • Is there any other way to figure this out? --Cavernia (talk) 14:43, 7 February 2018 (UTC)
From your 13 558 candidates for new items, a first set to exclude would be items that already have a Commons category (P373) mapping to such a ship. From a second-level category scan on Petscan (or an SQL query) you should be able to get the names of the relevant categories on Commons. This could then be compared to the Commons categories for all of the ships on Wikidata with no IMO number, to see whether you can match some more. (Of course, somebody may already have done this -- this may be where the existing IMO numbers come from). Jheald (talk) 15:11, 7 February 2018 (UTC)
The chain Commons category -> image -> Wikipedia article -> Wikidata item may also all be possible to extract all from SQL, but that would need somebody with better knowledge of the SQL tables and their contents than I've got. The first and last links are definitely possible from SQL, I don't know about the middle one. Jheald (talk) 15:14, 7 February 2018 (UTC)
I hope this query will help you. Interestingly, the amount of such categories is very close to the amount of already set IMO numbers. Matěj Suchánek (talk) 16:31, 7 February 2018 (UTC)
Very nice!
Although seeing a query like that always reminds me just why I like SPARQL so much :-) Jheald (talk) 16:51, 7 February 2018 (UTC)
@Jheald: Thanks, I've already done your first tip, but the opposite way, by extracting all items with Commons Category starting with "IMO" and using QuickStatements to set IMO number for the items without IMO number set. The thing is that several items are not using the IMO category, but the ship name subcategory, i.e. c:Category:Canberra (ship, 1961). It would still be possible to determine by using CatScan to extract the subcategories and then compare to an extract of Commons category (P373) mappings. Still, many items have images in Commons, but the image is not set, nor the Commons Category.
@Matěj Suchánek: Thanks for the query, but it doesn't seem to work as I intended. The query detects some articles containing images from the category, but apparently not the ship articles. Example: For IMO 5059953 the query finds Q48249 (Falklands War) and Q7394859 (STUFT), but not Q1032840, which is the main item of the ship Canberra. --Cavernia (talk) 17:44, 7 February 2018 (UTC)
I see the problem. Some database columns have titles with underscores, some with spaces. I updated the query, and the number doubled. Matěj Suchánek (talk) 19:31, 7 February 2018 (UTC)
Thanks, it works better now. --Cavernia (talk) 20:43, 7 February 2018 (UTC)
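
For anyone doing the comparison offline, the matching step (including the underscore/space problem) could look roughly like the sketch below. The function names and the sample lists are made up for illustration; the real inputs would come from a Petscan export and a list of Commons category (P373) values.

```python
def normalize(title):
    """Treat underscores and spaces as equivalent, as MediaWiki page titles do."""
    return title.replace("_", " ").strip()


def unmatched_categories(commons_categories, wikidata_p373_values):
    """Return the Commons categories that no Wikidata item points to via P373."""
    known = {normalize(value) for value in wikidata_p373_values}
    return [cat for cat in commons_categories if normalize(cat) not in known]


# Illustrative sample data only: one export uses underscores, the other spaces.
commons_categories = ["IMO_5059953", "IMO_1001192", "Canberra_(ship,_1961)"]
wikidata_p373_values = ["IMO 5059953", "Canberra (ship, 1961)"]

candidates = unmatched_categories(commons_categories, wikidata_p373_values)
print(candidates)
```

Only the categories left in `candidates` would then need new items; everything else already has an item that can receive the IMO number.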

Now that I've found a way to determine which entries not to import, it must be decided how to import the entries:

  1. Many ships have several names, should the old names be entered into name (P2561) or be added as aliases?
  2. The year the ship was built (finished) is mostly entered into service entry (P729), but in some cases start time (P580) is also used. Before creating more than 10 000 items, it should be decided which property to use.
  3. The flag state is possible to extract from one of the online ship databases, but I can't find a way to import more than the current or last registered flag state. Should we import the current flag state or leave it open until we find a way to import the historical flag states in the correct order?
  4. It is possible to import gross tonnage, length, width and draft when using QuickStatements, but then a ±0 will be added to indicate the tolerance. I don't like it, but I can't figure out how to get rid of it when using QuickStatements.
  5. Should gross tonnage (P1093) be imported with the unit gross tonnage or without unit?
  6. It would be practical to add labels, descriptions and aliases in the most common languages for shipping like English, French, German, Dutch, Spanish, Portuguese, Italian, Danish, Swedish, Finnish and Norwegian. There are different standards for how to arrange the name. I.e. SS France is SS «France» in Norwegian and France in German. Are these standards listed anywhere? --Cavernia (talk) 12:15, 8 February 2018 (UTC)

1) Put all the names in as separate statements. The most recent one should be given preferred rank, the older ones normal rank with, if known, the start and end dates as statement qualifiers. author  TomT0m / talk page 12:28, 8 February 2018 (UTC)
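
As a rough sketch of that rule (not a QuickStatements or API call, just the data shaping, with a hypothetical `rank_names` helper and made-up names and years): the last name in chronological order gets preferred rank, earlier ones normal rank, and start time (P580) / end time (P582) qualifiers are added only when known.

```python
def rank_names(names):
    """Build name statements from (name, start_year, end_year) tuples.

    names must be in chronological order, oldest first; unknown years are None.
    """
    statements = []
    for index, (name, start, end) in enumerate(names):
        is_latest = index == len(names) - 1
        qualifiers = {}
        if start is not None:
            qualifiers["P580"] = start  # start time
        if end is not None:
            qualifiers["P582"] = end  # end time
        statements.append({
            "value": name,
            "rank": "preferred" if is_latest else "normal",
            "qualifiers": qualifiers,
        })
    return statements


# Hypothetical ship with three successive names; the middle period is unknown.
name_statements = rank_names([
    ("First Name", 1987, 1995),
    ("Second Name", None, None),
    ("Current Name", 2004, None),
])
for statement in name_statements:
    print(statement)
```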

I'm resolving ship names from Commons by extracting the subfolders which include the ship names (of course not all of them) and the date the ship was built. More names (and years) might be included in the category description, but there is no common standard for this (compare 7229502 and 9238404), it is difficult to harvest with a script, and mostly the names are in all upper-case letters. Getting the data into Wikidata is actually the easy part here; harvesting and securing the quality of the data to feed into it is the hard part. However, I don't know if it's possible to define rank when importing data with QuickStatements. --Cavernia (talk) 12:56, 8 February 2018 (UTC)

The script has now been tested: Q48336558 Q48336574 Q48336589 Q48336606 Q48336621 Q48336635 Q48336649 Q48336664 Q48336681 Q48336697. Comments are appreciated before running the rest of it. --Cavernia (talk) 20:26, 10 February 2018 (UTC)

@Cavernia: That's looking really nice. A couple of tweaks might be whether to make the description slightly more detailed, eg "Yacht, built in 1987" or even "61-metre yacht, built in 1987" rather than just "ship", and to add the alternative names to the aliases field, ie "Majestic", "Il Vagabondo Again" and "IMO 1001984"; also perhaps to make it instance of (P31) a more specific class than ship (Q11446). But maybe you were planning to add all of that. The most important thing is to identify the different ships and add items for them, and it looks like your script is doing that really well.
Once it has completed, it may be worth trying to identify what other items for ships we may have on WD, that look as if they could have IMO identifiers, but do not as yet. Do you have a rough idea of how many items we currently have in that state? Jheald (talk) 20:33, 11 February 2018 (UTC)
Regarding the aliases, it would be easy to include "IMO 1001984", as I have this information collected. I'm working on a solution to include alternative names as aliases. When it comes to the description, I have so far not succeeded resolving the ship type, but still trying, then it would be possible to include it in the description. Adding "built in 1987" is a good idea and easy to implement in the script.
When it comes to instance of (P31), for me it seems as natural to use ship (Q11446) for all ships as to use human (Q5) for all people. Instead I miss a property called Ship type. I don't know if this has already been discussed.
My guess is that there are about 1000 ships that have WD items and have an IMO number which is not identified in WD. I think I've added 1000-1500 IMO numbers by template harvesting, scripting and manual entry the last month, and more will be added before running the script to generate as few duplicates as possible. IMO is probably the only unique parameter for ships, and by importing IMO numbers into Wikidata it will make it a lot easier to detect duplicates. --Cavernia (talk) 21:22, 11 February 2018 (UTC)
  • I'd add (all) names of the ships with official name (P1448). If you are unsure about the language, use "und" as code. Not sure if it's standard that Norwegian labels include brackets («»). [4]. Both points could also be addressed after an initial upload.
    --- Jura 21:33, 11 February 2018 (UTC)
Very few ships have official name (P1448) set, and in the cases where it is set it is mostly with from and to years. This is information I have not retrieved (yet); my opinion is that this property should be entered manually, or automatically imported if it is possible to extract complete data from an external database. In Norway we include the brackets for ship names, mostly also a prefix describing the ship type, like MS «Granvin» where MS means motorskip. --Cavernia (talk) 22:49, 11 February 2018 (UTC)
Regarding a Ship type property: this property was proposed at Wikidata:Property proposal/Ship type but marked not done, as consensus is against having any specific type property. Breg Pmt (talk) 21:44, 11 February 2018 (UTC)
It seems like this proposal should be reopened since the decision is based on a misunderstanding. There is a major difference between ship type and ship class (like the difference between occupation (P106) and family name (P734) for people). --Cavernia (talk) 22:49, 11 February 2018 (UTC)
Yes, but surely all ships in a particular ship class (Q559026) will have the same ship type (Q2235308), so just make
<ship> instance of (P31) <ship class> subclass of (P279) <ship type>
So one has, eg:
HMS Cumberland (Q1558665) instance of (P31) Type 22 frigate (Q922727), Type 22 frigate (Q922727) subclass of (P279) frigate (Q161705), frigate (Q161705) subclass of (P279) warship (Q3114762)
-- the same as we do for railway engines, aeroplanes, individual cars, etc. etc. Jheald (talk) 23:08, 11 February 2018 (UTC)
But contra this, see note below regarding vessel class (P289). Jheald (talk) 00:26, 12 February 2018 (UTC)
  • For extracting ship type, it looked like some information (eg "Yacht") was available on the page the IMO number is linked to. Also, it looked like there might often be a Commons category set to indicate it. Jheald (talk) 21:50, 11 February 2018 (UTC)
Yes, ship type is included in MarineTraffic, but the site doesn't allow me to harvest data by using a script. However, I found another site that allows me to do that, so now the script just has to run for some hours to complete this. --Cavernia (talk) 22:49, 11 February 2018 (UTC)
I was able to extract ship type for about 20 % of the ships with images in Commons. Better than nothing.... --Cavernia (talk) 10:05, 13 February 2018 (UTC)
Found another database which contains most of the ships, it also includes former names, home port, shipyard and class society. Do we need the latter as a WD property or does it already exist? --Cavernia (talk) 13:38, 13 February 2018 (UTC)
  • To comment on name question: names should be available in a way other than as aliases. Aliases are helpful as they make it searchable, but the names should be available in a structured way as well. I think official name (P1448) is preferable over name (P2561). Start and end dates can be added later, when/if known.
    --- Jura 23:59, 11 February 2018 (UTC)
  • See also Wikidata:WikiProject_Ships/Properties, in particular Wikidata:WikiProject_Ships/Properties#Individual_ships. If there are ambiguities there, (eg your questions above), it may be worth checking in with the talk page there, then updating the page to document what you think is the best way forward.
The page recommends using vessel class (P289) rather than instance of (P31) to indicate 'vessel class'. I don't really understand why this is considered necessary or useful, but it has survived two deletion discussions, in 2013, and again in 2015. Jheald (talk) 00:24, 12 February 2018 (UTC)
At that time, before 'arbitrary access' was available in Lua, there may have been a case for keeping P289 to allow infoboxes to give a ship class (where available), and also a ship type. I don't think that should be a problem any more (pinging @Mike Peel: ?), so it may well now make sense to nominate P289 again. Jheald (talk) 00:35, 12 February 2018 (UTC)
If this was a military register the described solution would be great, but many civil ships don't belong to a specific class. Another challenge is the low number of ships in each class. For the Norwegian monitors there are 2 classes, the first contains three ships, the other only one ship. At the moment we have 2856 ship classes containing 9309 ships, mostly military vessels. Practically, ship classes means little for most users or readers, but they will understand the difference between an oil tanker, a tugboat and a passenger ship. --Cavernia (talk) 09:58, 13 February 2018 (UTC)
Agreeing with Cavernia: there is quite a difference between military vessels, which have vessel class (P289), and civil ships, which do not have classes. Breg Pmt (talk) 10:43, 13 February 2018 (UTC)
@Jheald: Since you pinged me, I thought I should reply, but personally I don't have a strong opinion here at the moment. There are points against it being in P31 - you end up having to navigate a whole tree to figure out what the Wikidata item is fundamentally about (i.e., going from knowing it's a class of something to finding out that it's a ship), and if you try to say 'type of ship: <P31 values here>' then you can sometimes end up with odd results (e.g., "type of ship: ship"). However, it is easier to include that in more general infoboxes such as the Wikimedia Commons one. Mostly I'd say that it's best to be consistent in the approach across all types of things if possible. (and BTW, migrating ship info from Commons to Wikidata is a great idea that should definitely be done!) Thanks. Mike Peel (talk) 22:55, 16 February 2018 (UTC)
Also, looking at @Cavernia's first example above, Bad Girl (Q48336558) links to commons:Category:IMO 1001192, but that also has a subcategory of commons:Category:Bad Girl (ship, 1992). In general, there can be multiple ship-name categories for each IMO (if a ship has been renamed), and that ideally needs to be reflected here in a way that means that each commons category has a sitelink. Any ideas on how to do that? Thanks. Mike Peel (talk) 23:01, 16 February 2018 (UTC)
Yes, I'm aware of this; some of the existing items link to the IMO category, some items to the ship subcategory. My preference is to link to the IMO category, as there will always be only one IMO category reflecting one Wikidata item for each ship, independent of how many different names the ship has had. --Cavernia (talk) 09:02, 17 February 2018 (UTC)
Please avoid using the term « category » here, as in the Wikiverse it refers to a wiki category. Wikidata classes are not at all categories in this sense. Please read User:TomT0m/Classification for an introduction on how classes can be (and are) used in ontology projects. Imho we should just avoid creating a « ship class » property, because such classification systems just work. In fact, there is prior art in Wikidata: a few years ago there was an effort to delete such specialized properties. This avoids having to take a community decision like « should we create a property for US military ship class » all the time, when the need to classify things is present in every field of knowledge. It’s enough in a lot of cases to allow queries to find the instances of the subclasses of « ship », and to use a more specialized class as the value of the instance of (P31) statement. Finding all the ships (objects) in Wikidata is just as simple as querying select * { ?ship wdt:P31/wdt:P279* wdt:Q11446 } . Simple enough for me, and applicable way outside of the ship field; it generalizes easily to any vehicle without worrying too much about how the automobile guys have organized their field. If they followed the same principles, of course. author  TomT0m / talk page 11:30, 17 February 2018 (UTC)
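
For readers less familiar with SPARQL property paths, the pattern wdt:P31/wdt:P279* can be mimicked on a small in-memory set of statements. This is a sketch only: the function names are made up, and the edge "warship (Q3114762) subclass of ship (Q11446)" is assumed for the example rather than taken from this discussion.

```python
def subclasses_of(root, p279_edges):
    """Return root plus every class reachable downward via subclass of (P279)."""
    children = {}
    for subclass, superclass in p279_edges:
        children.setdefault(superclass, set()).add(subclass)
    found, stack = {root}, [root]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def instances_of(root, p31_edges, p279_edges):
    """Items whose instance of (P31) value is root or any subclass of root."""
    classes = subclasses_of(root, p279_edges)
    return sorted(item for item, cls in p31_edges if cls in classes)


# (subclass, superclass) pairs; the last one is an assumption for this example.
p279_edges = [
    ("Q922727", "Q161705"),   # Type 22 frigate -> frigate
    ("Q161705", "Q3114762"),  # frigate -> warship
    ("Q3114762", "Q11446"),   # warship -> ship (assumed)
]
# (item, class) pairs
p31_edges = [
    ("Q1558665", "Q922727"),  # HMS Cumberland -> Type 22 frigate
    ("Q5", "Q55983715"),      # unrelated made-up statement, should not match
]

ships = instances_of("Q11446", p31_edges, p279_edges)
print(ships)
```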
@TomT0m:: I'm talking about categories in Commons, not in Wikidata. In your comment and the discussion you are referring to it seems that you don't understand the difference between ship type and ship class. This confusion is why I think we need a new property for ship type. --Cavernia (talk) 09:26, 19 February 2018 (UTC)
@Cavernia: No I don’t. Ship classes are types of ships (the converse is not necessarily true). The plan laid out in my classification page is to use a metaclass ( « ship class » ) to discriminate the ship types that are ship classes by explicitly classifying them as ship classes ( ), while also acknowledging that they are simply a type of ship - , just as there are types of everything else. This seems to me much like the difference between a car model and a more generic car type (SUV for example; there are many SUV models). author  TomT0m / talk page 10:14, 19 February 2018 (UTC)
No, ship classes are not types of ship. I have explained this earlier in the thread. --Cavernia (talk) 11:34, 19 February 2018 (UTC)
But they are both groupings of ships. A ship-type is a sub-group of ship; a ship-class is a narrower sub-group of ship. The way this is expressed, or would be with any other sorts of thing, is to have
<a ship> instance of (P31) <a ship class> subclass of (P279) <a ship type> subclass of (P279) ship (Q11446)
<a ship class> instance of (P31) ship class (Q559026)
<a ship type> instance of (P31) ship type (Q2235308)
ship class (Q559026) metasubclass of (P2445) ship type (Q2235308).
If the ship is not a member of a ship class, one simply has
<a ship> instance of (P31) <a ship type> subclass of (P279) ship (Q11446)
It's easy enough for a query (or an infobox) to go up the chain, and if there is an item in the chain that is instance of (P31) ship class (Q559026) return it as the "ship class", and if there is an item in the chain that is instance of (P31) ship type (Q2235308) return it as the "ship type". Jheald (talk) 12:34, 19 February 2018 (UTC)
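The chain-walking described above can be sketched in Python over a toy in-memory graph. This is only an illustration under stated assumptions: the lowercase item names, the `P31`/`P279` dictionaries and the helper `find_in_chain` are invented for the example; only Q11446, Q559026 and Q2235308 come from the discussion.

```python
# Toy statements; the lowercase item names are made up for illustration.
P31 = {  # instance of
    "uss_example": ["example_class"],
    "example_class": ["Q559026"],   # ship class
    "example_type": ["Q2235308"],   # ship type
}
P279 = {  # subclass of
    "example_class": ["example_type"],
    "example_type": ["Q11446"],     # ship
}


def find_in_chain(item, metaclass):
    """Return the first item reachable from `item` via P31 then P279*
    that is itself an instance of `metaclass`, or None if there is none."""
    queue = list(P31.get(item, []))
    seen = set()
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        if metaclass in P31.get(current, []):
            return current
        # Not a match: keep climbing the subclass chain.
        queue.extend(P279.get(current, []))
    return None


ship_class = find_in_chain("uss_example", "Q559026")
ship_type = find_in_chain("uss_example", "Q2235308")
```

For a ship that is not a member of any ship class, the first call would simply return `None`, while the second would still find the ship type.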

How to qualify Olympians who've represented more than one country?

Looking at Barbara Jezeršek (Q746844), she competed on behalf of Slovenia (Q215) in 2010 Winter Olympics (Q9674) and 2014 Winter Olympics (Q9678), but then for Australia (Q408) in women's 15 km skiathlon at the 2018 Winter Olympics (Q47035493). I would expect to be able to tag them with an "on behalf of" or "participating for" or something, but I couldn't find an appropriate Property to use. Thoughts, anyone? — OwenBlacker (talk) 23:32, 13 February 2018 (UTC)

country for sport (P1532).--Jklamo (talk) 09:41, 14 February 2018 (UTC)
As a qualifier of the participant of (P1344) claim, please.
However, in spite of this solution being widely accepted, it is not ideal either. Technically, Olympic participants are members of a team, and this team may or may not represent a country. Typically one team represents one country, but there are many exceptions. Examples are the Russian competitors at the 2018 Winter Olympics, or the refugee team at the 2016 Summer Olympics. It would be much cleaner if we had a qualifier member of sports team (P54) with a value item such as "Slovenian team" or "Australian team" instead of country for sport (P1532) with values Slovenia (Q215) or Australia (Q408), and the value item of that P54 qualifier was linked to the country the team actually represented, if there was any. —MisterSynergy (talk) 10:26, 14 February 2018 (UTC)
Yes, I should have been clearer, I did mean for the participant of (P1344) claims. The country for sport (P1532) solution is probably the best we have right now, though I'm inclined to agree with MisterSynergy. That said, arguably Olympic Athlete from Russia (Q28155263) is the country for sport (P1532) value for the Russian athletes competing in Korea right now… — OwenBlacker (talk; please {{Ping}} me in replies) 12:50, 14 February 2018 (UTC)

How do you set up a medal winner in the Olympics in Wikidata?

Example Carl Hellström (Q1805994) he took

see sports-reference.com - Salgo60 (talk) 05:58, 18 February 2018 (UTC)

You can see Sven Kramer (Q111320) how they did it with a Dutch skater, but that method does violate constraints. Mbch331 (talk) 08:41, 18 February 2018 (UTC)
Strictly speaking: the medal itself is not the award, thus it should not be used with award received (P166). I’d just replace that qualifier with a ranking (P1352) qualifier. —MisterSynergy (talk) 09:37, 18 February 2018 (UTC)

You can participate in Google Summer of Code and help improve Wikidata

Hello all,

As every year, Google Summer of Code supports student developers all around the world to work on projects. Wikimedia is taking part as a mentoring organization.

A list of projects is already available, you can also add your own. One of these projects is signed statements for Wikidata: developing the technical feature that will allow institutions to donate verified and sourced information and increase the quality of the data.

If you want to participate or become a mentor, feel free to check the information on the pages linked above.

Thanks, Lea Lacroix (WMDE) (talk) 11:48, 14 February 2018 (UTC)

  • Is "signed statements" the only one for Wikidata? Wonder how that would work with ranks.
    --- Jura 09:00, 15 February 2018 (UTC)
    • There is phab:T138708 with details. —MisterSynergy (talk) 09:34, 15 February 2018 (UTC)
    • I read that, but I don't think it addresses it. statement disputed by (P1310) on a signed statement would be even stranger.
      It would be good if I/we came up with some other Wikidata related proposals. Maybe something that offers a basic step that isn't too complicated to do and some potential to go beyond that.
      --- Jura 09:50, 15 February 2018 (UTC)
      • To my understanding of the phab, "signed statements" as they are presented there are actually "signed claims" (i.e. the signature verifies mainsnak and qualifiers including some related information about the subject and the object item, but not the references and ranks of the statement). This means you should be able to change ranks of a "signed statement" without breaking the signature. —MisterSynergy (talk) 09:57, 15 February 2018 (UTC)
        • So ranking would be disconnected from the snak. I suppose it could work, but I'm not sure if ranks are easily understood.
          --- Jura 10:01, 15 February 2018 (UTC)
          • AFAIR there was some research lately which found that concepts like ranks and snaktypes are indeed poorly understood. Another problem with the "signed statements" approach could be that we use qualifiers to qualify the mainsnak value (this is how they were supposed to be used; they are something that is intrinsic to the entity described), but also to qualify the statement as a whole (think of statement disputed by (P1310), or reason for deprecation (P2241)). The latter doesn’t really fit into the "signed statements" concept, but it is somewhat disputed anyway. —MisterSynergy (talk) 10:10, 15 February 2018 (UTC)
            • The proposal does mention that changing the qualifiers would break the signed snak. So this wouldn't be a problem.
              reason for deprecation (P2241) could be moved to the reference section if it would be visible by default. statement disputed by (P1310) seems in need of a different approach.
              --- Jura 10:32, 15 February 2018 (UTC)
    • Yes signed statements is the only idea I added because mentoring takes significant time and effort. Ranks indeed do need to be improved but this is independent of signed statements and requires more design research than coding so is unfortunately not suitable for GSoC. --Lydia Pintscher (WMDE) (talk) 17:34, 15 February 2018 (UTC)
      • I was a bit worried about the impact on ranking, but if it's distinct it might not matter. Most people eventually figure it out. It's just not usual in Wikipedia infoboxes.
        As an additional one: maybe a (more) Wikibase specific diff view to compare two items or two versions of the same item, for use in https://www.wikidata.org/w/index.php?diff= . Obviously the coding effort on that might be significant.
        --- Jura 17:49, 15 February 2018 (UTC)
        • Are you thinking about something like http://tools.dicare.org/wikidata-diff/ or something else? --Lydia Pintscher (WMDE) (talk) 19:43, 15 February 2018 (UTC)
          • A feature that compares the two by showing a compact version of differences only and identical content + differences. The current diff function seems closer to MediaWiki text than a comparison of statements (+labels/etc.).
            --- Jura 04:45, 16 February 2018 (UTC)
    • I added it to the list. Two others that might be interesting are https://phabricator.wikimedia.org/T149905 and https://phabricator.wikimedia.org/T139912 . All three would need to have the technical steps presented in further detail.
      --- Jura 05:53, 21 February 2018 (UTC)
      • @Lydia Pintscher (WMDE): the addition got reverted. Would you have a look if one or the other of these three could be added? For WMDE development, I suppose an interesting factor could also be that it provides a check if the codebase is sufficiently accessible to new contributors.
        --- Jura 10:59, 22 February 2018 (UTC)
        • I fear the first one is not large enough for a GSoC project. With the second one my fear is that it is not well defined and a newcomer would have a hard time figuring out a solution that'd work :/ --Lydia Pintscher (WMDE) (talk) 07:27, 23 February 2018 (UTC)

Surname

Can we add "surname" to the list of default fields that show up for "instance of=human"? "Given name" is one of our default fields to fill in, but oddly not family_name (surname). This is why the field seems to go unused in so many entries. --RAN (talk) 04:40, 16 February 2018 (UTC)

There is no "default", it happens autonomously. If there are enough people with a surname (at least 6.9%), it will show up. The more people have the property, the higher you will see it. At the moment, it's the 17th property in the list (you can test on Apor (Q773907)). Matěj Suchánek (talk) 09:06, 16 February 2018 (UTC)
At least 6.9% of what? Breg Pmt (talk) 10:06, 16 February 2018 (UTC)
There are 4 million people in Wikidata. If at least 6.9% of them have a surname, it will be suggested for the rest of them (unless there's a known interference with external identifiers). Matěj Suchánek (talk) 10:20, 16 February 2018 (UTC)
How many persons then have a surname? And what are the 10 most popular? Is it worth running a query :) Breg Pmt (talk) 11:47, 16 February 2018 (UTC)
Smith, Li, Jones, Williams... tinyurl.com/yan6quny Jheald (talk) 11:57, 16 February 2018 (UTC)
@RAN: You may want to add this suggestion to Wikidata:Suggester ranking input. Deryck Chan (talk) 11:28, 16 February 2018 (UTC)
How much could safely be added by bot, eg from the DEFAULTSORT on en-wiki, to push that number of uses up? (Currently 531,563 out of 4,136,741 humans = 12.85%) ? Jheald (talk) 11:51, 16 February 2018 (UTC)
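As a quick check of the figures quoted above, the share can be recomputed directly. This is only a worked-arithmetic sketch: the function name `usage_percent` is invented here, and the 6.9% threshold is taken from the comment above rather than from the suggester's code, whose exact behaviour is not modelled.

```python
SUGGESTER_THRESHOLD = 6.9  # percent, as stated in the discussion above


def usage_percent(items_with_property, total_items):
    """Percentage of items that carry the property."""
    return 100 * items_with_property / total_items


# Counts quoted in the discussion: humans with a family name / all humans.
share = usage_percent(531_563, 4_136_741)
print(f"{share:.2f}% of humans have a family name statement")
print("above threshold:", share >= SUGGESTER_THRESHOLD)
```

This reproduces the 12.85% figure, which is already well above the quoted threshold.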

@Jheald: Running the query ?surname wdt:P31 wd:Q101352. I find 235932 surnames (using limit 400000). Is that correct? Breg Pmt (talk) 12:13, 16 February 2018 (UTC)

@Pmt: That would be the number of different surname items we have.
To find the number of times the property is used, try
SELECT (COUNT(*) AS ?count) WHERE {
  [] wdt:P734 [].
}
-- or look it up on the property's talk page.
(The number quoted above is slightly smaller, because there I also required ?item wdt:P31 wd:Q5 -- i.e. no surnames of fictional people). Jheald (talk) 12:28, 16 February 2018 (UTC)
Also of interest is the number of family names as yet unused on any person: about 166,000: tinyurl.com/yclf7omz Jheald (talk) 12:32, 16 February 2018 (UTC)
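The difference between the two counts (all uses of the property versus uses on humans only) can be made explicit with a small query builder. This is a sketch, not an official client: the helper names `count_query` and `request_url` are invented, and nothing is sent over the network; the code only builds the query text and a URL for the public query service endpoint.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"


def count_query(prop, humans_only=False):
    """Build a SPARQL query counting uses of a truthy property."""
    if humans_only:
        # Restrict subjects to instances of human (Q5).
        pattern = f"?item wdt:P31 wd:Q5 ; wdt:{prop} [] ."
    else:
        pattern = f"[] wdt:{prop} [] ."
    return f"SELECT (COUNT(*) AS ?count) WHERE {{ {pattern} }}"


def request_url(query):
    """URL that would return the result as JSON when fetched."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


all_uses = count_query("P734")
human_uses = count_query("P734", humans_only=True)
```

Fetching `request_url(human_uses)` should give the smaller of the two numbers, since fictional people and other non-humans are excluded.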

For given name (P735) it took quite some time till it got suggested for items lacking it. When it eventually did, the downside was that people filled it with random items as values, as appropriate values hadn't been made for some names. For surnames, the problem might even be larger. It's likely that frequent surnames (in general) are even more frequent in Wikidata as some additions started out with these. On the opposite end, I think someone made items for all surnames of football players ..
--- Jura 12:31, 16 February 2018 (UTC)

It takes less than 10 seconds to create a new given_name entry for a missing one, it only needs instance_of=family_name. I probably added a dozen last month. --RAN (talk) 03:01, 19 February 2018 (UTC)
The suggestion data will soon be updated, then family name will get a good boost (above alma mater, I think). Sjoerd de Bruin (talk) 14:47, 16 February 2018 (UTC)

With reference to the discussion here, I would like to point to the proposal patronym or matronym at Wikidata:Property proposal/Person#patronym or matronym, i.e. not all persons have surnames. Breg Pmt (talk) 16:09, 16 February 2018 (UTC)

Cordillera Azul Antbird, Myrmoderus eowilsoni

Q46624807 has the English common name "Cordillera Azul Antbird" and the scientific name "Myrmoderus eowilsoni". The former commemorates Cordillera Azul National Park, Q264948; the latter E.O. Wilson, Q211029.

When I created the item, I added statements indicating the etymology of each of these names, citing sources and giving quotes ("We select the English name to draw attention to the little known but biogeographically important and biodiverse mountain range that contains the type locality of the species." and "We name Myrmoderus eowilsoni in honor of Dr. Edward Osborne Wilson to recognize his tremendous devotion to conservation and his patronage of the Rainforest Trust, which strives to protect the most imperiled species and habitats in the Neotropics and across the globe.", respectively).

For some reason, User:Succu has twice ([5], [6]) removed the cited etymology from the latter of these names. (I say "for some reason", as the only explanation given was the edit summary "per chat disk".)

I have, naturally, restored it. Repeatedly removing cited data with no cogent explanation is clearly unhelpful to the project, and to our users. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:57, 6 February 2018 (UTC)

And while I was writing the above, he did so a third time ([7]), with the edit summary "please do not remove a valid source, thx" - despite removing data cited to a valid source in the same edit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:59, 6 February 2018 (UTC)
And now a fourth time ([8]), with edit summary "??!!". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:06, 6 February 2018 (UTC)

Info: Wikidata:Project_chat/Archive/2017/07#Editwar_at_Desmopachria_barackobamai_(Q30434384). --Succu (talk) 20:10, 6 February 2018 (UTC)

Thanks for the reminder. That's another example of you edit warring to remove cited data on the origin of a specific (both senses) name. I've duly restored it there, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:13, 6 February 2018 (UTC)
Sigh: „duly restored“. OMG. --Succu (talk) 20:17, 6 February 2018 (UTC)
...and I have been reverted there also ([9]), again with the loss of cited metadata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:19, 6 February 2018 (UTC)
Beyond „I have“ and „I was“ do you have some additional arguments to the discussion I reminded you above? --Succu (talk) 21:23, 6 February 2018 (UTC)

Redux

I've restored the above, unresolved, topic from this month's archive, because we have a similar issue to the one originally raised (i.e. not the sp. nov. matter which side-tracked it; hence now collapsed) at Draba kananaskis (Q47507633), where User:Succu persists in removing a cited qualifier of taxon name (P225) which describes the etymology of the specific name. I raised the same issue last year, but that too petered out without resolution. It is simply not tenable to store the etymology of such names at item level, because that fails when an item can have different names/ labels in different languages, or where the scientific and vernacular names have different roots (see the 'Kentish Plover' example in last year's discussion). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:27, 16 February 2018 (UTC)

What is a „cited qualifier“? BTW: The edit history of Draba kananaskis (Q47507633) is revealing. --Succu (talk) 22:03, 17 February 2018 (UTC)
Wikidata:Project_chat/Archive/2017/07#Summary?, was the résumé. --Succu (talk) 22:12, 17 February 2018 (UTC)
Do you have a credible data model, that caters for the use-cases given above, other than the one which you keep undoing in your reverts? If so, what is it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:14, 17 February 2018 (UTC)
Could you please refine (or rephrase) your Use case (Q613417)? What information do you want to extract say with a SPARQL query? BTW: What is a „cited qualifier“? --Succu (talk) 20:45, 20 February 2018 (UTC)
As you can see, my arguments are laid out above. As a courtesy to our fellow editors, I see no need to repeat them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:19, 22 February 2018 (UTC)
You offered your opinion but not a use case. Just another discussion where you are either unwilling or unable to argue in a comprehensible way. What a pity for our project. --Succu (talk) 21:07, 22 February 2018 (UTC)
So, you offer no credible data model, then; just ad hominem. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:29, 23 February 2018 (UTC)
A data model about what? Without use cases it's not possible to develop suggestions. --Succu (talk) 19:22, 23 February 2018 (UTC)

Only one bank (Q2897058) for no label (Q17553950)

from Wikidata talk:WikiProject Rivers#Only one

How do I state that a no label (Q17553950) is located on the right bank (Q27834918)/left bank (Q27834806) of a watercourse (Q355304)?

Thanks in advance. - Kareyac (talk) 08:35, 17 February 2018 (UTC)

@Kareyac: I see a dozen of examples like Lawrence County (Q502737) :
Is it what you need?
Cheers, VIGNERON (talk) 18:44, 18 February 2018 (UTC)
@VIGNERON: Thanks, afraid not sure, direction relative to location (P654) shows position according to the compass. In my case I want to say "the Musée d'Orsay (Q23402) is located on the left bank (Q27834806) of Seine (Q1471)". - Kareyac (talk) 19:51, 18 February 2018 (UTC)
@Kareyac: then maybe something like Cheers, VIGNERON (talk) 20:30, 18 February 2018 (UTC)
OK, I‘ll follow your advice. - Kareyac (talk) 20:49, 18 February 2018 (UTC)

New users

Noting that there are an unusual number of new users who have edited Guntur (Q3120966) today and are making other questionable and/or unsourced edits on other items. I haven't sent any of them welcome messages or stuff like that. Jc86035 (talk) 10:33, 17 February 2018 (UTC)

I'm guessing this is an editathon of some sort? Mixture of helpful and unusable edits (contributions links: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20). I found them with Special:RecentChanges since they all seem to have created user pages with Babel tags. Jc86035 (talk) 10:43, 17 February 2018 (UTC)

@Krishna Chaitanya Velaga: What happened here? Mahir256 (talk) 02:55, 18 February 2018 (UTC)
@Mahir256, Jc86035: Yes, they're part of edit-a-thon conducted yesterday. I'll warn them about this. Thanks for the ping. Krishna Chaitanya Velaga (talk) 05:41, 18 February 2018 (UTC)
@Krishna Chaitanya Velaga: Is there a page about this editathon somewhere (even if not in English)? Mahir256 (talk) 05:46, 18 February 2018 (UTC)
@Mahir256: There isn't any specific event page. We conducted it as a part of the training sessions for m:IMLD-ODD 2018 Wikidata India Edit-a-thon. But the activity can be seen at https://outreachdashboard.wmflabs.org/courses/WikiProject_India_on_Wikidata/Wikidata_Workshop_at_VVIT_--_Feb_2018. Krishna Chaitanya Velaga (talk) 05:49, 18 February 2018 (UTC)
@Krishna Chaitanya Velaga: Noting that some of the issues I found were: test edits on live items; removal of correct data points; possibly incorrect changes in data structure (e.g. 1); removal of "imported from Wikipedia" references and other sources; addition of sources without title/date/access date/author; creation of duplicate items; linking to incorrect items (like Paper (Q1402686) instead of paper (Q11472)); lack of awareness of the notability policy; addition of textbooks and various other things without any identifiers like ISBN. Jc86035 (talk) 08:29, 18 February 2018 (UTC)
@Jc86035: Sure, I'll note that. Thank You. Krishna Chaitanya Velaga (talk) 08:32, 18 February 2018 (UTC)
@Krishna Chaitanya Velaga: I hope you will be able to review all of the edits that have provoked @Jc86035:'s concern and revert those that are obviously wrong or have caused the issues he described. I will see if I can do the same at some point, and I wonder if Jc86035 can do the same himself at some point as well. Mahir256 (talk) 09:38, 18 February 2018 (UTC)
@Mahir256: Of course, I'll check the edits, correct the wrong ones. Krishna Chaitanya Velaga (talk) 09:47, 18 February 2018 (UTC)
@Mahir256: I've already reviewed some of the edits (and nominated a couple of items for deletion), and sent two of the editors welcome messages (1 2). Jc86035 (talk) 11:31, 18 February 2018 (UTC)

Looking for examples of alternative business models for organisations considering open licensing

Hi all

I'm compiling a guide for UN agencies on the steps to implement open licensing.

The piece I'm really missing is alternative business models to those which require traditional copyright. These include publishing books, licensing images and other multimedia, and also data.

I'm finding data the hardest of these to find information for.

If you know of any existing compilations of information and/or any examples of organisations which have working business models please brain dump below and I will organise it.

Thanks very much

John Cummings (talk) 16:15, 17 February 2018 (UTC)

Many books get written to create a reputation for the author. A good reputation can lead to consulting gigs and speaking engagements. Or selling flamethrowers ;)
Uber's decision to release data under an open license is noteworthy: https://www.engadget.com/2017/08/31/uber-movement-traffic-data-website-launch/ . It's Creative Commons Attribution Non-Commercial but even that's still open data. ChristianKl❫ 11:59, 18 February 2018 (UTC)
Theoretically the cost of software programs used by corporations to translate text would drop if the text translations were free. I find it scandalous that the WHO charges royalties for translations of their medical terms. Jane023 (talk) 09:16, 21 February 2018 (UTC)

Help with subspecies

My sources indicate that eri silk (Q5385913) is produced by Samia cynthia ricini, a subspecies of Samia cynthia (Q1462214). How do I indicate this on the this taxon is source of (P1672) statement? Do I need a separate item for the subspecies? I haven't done much work with biology items. Thanks! - PKM (talk) 22:01, 17 February 2018 (UTC)

I created Samia cynthia ricini (Q49125191). --Succu (talk) 11:47, 18 February 2018 (UTC)
Thank you! I’ll hook everything together. - PKM (talk) 20:22, 18 February 2018 (UTC)
...And it looks like Samia cynthia ricini is a synonym for Samia ricini (Q1995352). In Commons, UniProt, Encyclopedia of life. I'll merge them. - PKM (talk) 21:06, 18 February 2018 (UTC)
I reverted your changes at Samia cynthia ricini (Q49125191) and added the original combination. --Succu (talk) 21:11, 22 February 2018 (UTC)
Oh, okay, now I see how that works. :-) - PKM (talk) 22:09, 24 February 2018 (UTC)
Good to read that. ;) --Succu (talk) 22:10, 24 February 2018 (UTC)

'blank weapons'

Does anyone know what's going on with Q222405 and Q15407649? I thought 'arme blanche' in French was synonymous with 'bladed weapon' and so I am not sure we're getting it right here. arwiki and ruwiki have both, but I don't read either so I don't know what to think about that. svwiki and fywiki are on the latter, but just eyeballing it they could also fit on the former. 82.31.82.76 11:44, 18 February 2018 (UTC)

In ru.wiki articles Холодное оружие = cold weapons = weapons without explosive/pyrotechnical, pressurised gas or electrical mode of action; mainly, but not necessarily a melee weapon (also includes e.g. throwing weapons). Клинковое оружие = blade weapons = cold weapons (as described earlier) with a blade inseparable from the handle. Wostr (talk) 14:57, 18 February 2018 (UTC)
Not all 'arme blanche' are blades, a baseball bat (Q809910) is an 'arme blanche' but not a blade. Cdlt, VIGNERON (talk) 18:30, 18 February 2018 (UTC)

Usage of "member state of ..." items

We have a bunch of "member state of ..." items. Some of them are used in instance of (P31) as a duplication of the same info in member of (P463) (I think this is a relic of the times before member of (P463)), some of them are not. I think that it is better to keep that info just in member of (P463), and that instance of (P31) with a "member state of ..." item is unnecessary duplication; it is also good to keep the number of values in instance of (P31) as low as possible.--Jklamo (talk) 14:48, 18 February 2018 (UTC)

We don't have to get rid of those items. The (sketch of an) approach proposed in Template:Implied instances allows us to keep them while not using them in instance of (P31) statements.
Another point : keeping the number of instance statements low is easily achievable by keeping the most specific class in the hierarchy and deleting values that are its parent classes. author  TomT0m / talk page 15:17, 18 February 2018 (UTC)
As an example, for member state of Mercosur (Q6814224) this gives a query (after I added a "has quality" statement to it) that finds Mexico (Q96) amongst others. I will work in the near future on including all the explicit instances of such classes in the query results (and explicit instances of the parent classes which have the statements defined in their child classes parent to our class of interest), and I think we could get something quite flexible. author  TomT0m / talk page 15:44, 18 February 2018 (UTC)
@Jklamo: I agree with you. If a member of (P463) statement exists, the corresponding instance of (P31) "member state of ..." statement is redundant and useless (though of course it is a valid statement). Is there any good way to prevent "member state of ..." from being used in instance of (P31) and to induce editors to use member of (P463) instead? --Okkn (talk) 15:37, 18 February 2018 (UTC)
This query might help to find redundant statements. --Pasleim (talk) 10:44, 20 February 2018 (UTC)

Copyright start date

Would there be any interest in having the date that copyright status begins as a field for newspapers and magazines? Currently you have to search here for the publication and it tells you the start date for copyrights (as best can be discerned to date). That information could be pulled into WikiCommons and WikiSource by a template with standardized wording. See for example: here for a hand-written example for the Asbury Park Press which did not file for renewal and The Jersey Journal which did. Some publications had a more extensive copyright clearance search performed and gaps in renewals were found for individual issues, see Time magazine as an example. We would not be able to have a single date for Time magazine. --RAN (talk) 19:28, 18 February 2018 (UTC)

The template reading the Wikidata date would add this statement to the category for the articles in Commons and in Source:--RAN (talk) 02:46, 19 February 2018 (UTC)

Articles published in the Jersey Journal are in the public domain prior to February 9, 1929. All articles starting on that date are currently under active copyright.
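The kind of standardized wording such a template would produce from a stored date could be sketched as follows. This is a Python illustration only (a real template on Commons or Wikisource would be written in wikitext or Lua), and the function name `copyright_notice` and its parameters are invented for the example.

```python
from datetime import date

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def copyright_notice(publication, copyright_start):
    """Render the standardized wording for a publication whose
    copyright coverage begins on `copyright_start`."""
    when = f"{MONTHS[copyright_start.month - 1]} {copyright_start.day}, {copyright_start.year}"
    return (
        f"Articles published in the {publication} are in the public domain "
        f"prior to {when}. All articles starting on that date are currently "
        "under active copyright."
    )


notice = copyright_notice("Jersey Journal", date(1929, 2, 9))
```

A single date per publication is assumed here; as noted above, that would not work for a title like Time magazine with gaps in renewals for individual issues.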

Merge problems with German wiki

Can someone merge en:Category:Gynaecology (Q7028220) with German de:Kategorie:Gynäkologie und Geburtshilfe (Q8970781)?

Can someone merge en:Category:Sexual ethics (Q30674430) with German de:Kategorie:Sexualethik (Q17303953)?  – The preceding unsigned comment was added by 178.11.14.47 (talk • contribs).

I'm not sure the first pair of categories cover exactly the same range of topics. I don't know much about medicine, but the German category seems to cover not only gynaecology, but also de:Geburtshilfe - which links to en:Midwifery but seems to also cover en:Obstetrics. --Kam Solusar (talk) 23:44, 18 February 2018 (UTC)
Comment: The German Wikipedia (and the Czech also, IIRC) uses a dual category system, with one set of categories using the technical / scientific term, and one set using the vernacular German. So it will not always be possible to merge category data items for the German Wikipedia. --EncycloPetey (talk) 02:53, 19 February 2018 (UTC)

Modeling (textile) of (place)

Here's a situation I'd like feedback on: the Wikipedia articles associated with Rajshahi silk (Q7286431) and Thai silk (Q6580701) seem to logically combine three topics:

My first thought is that these articles are mostly about the textiles, and I have tentatively modeled them as <subclass of> no label (Q47469120): no description.

I am wondering if there is a better way to model these - perhaps as <subclass of> sericulture (Q864650), or even using the rarely used Wikipedia article covering multiple topics (Q21484471)?

Does anyone have thoughts on this? - PKM (talk) 22:32, 18 February 2018 (UTC)

  • This is a somewhat general comment as my knowledge about silk (and notably) Thai silk is rather limited. Looking at w:Thai silk, sericulture (Q864650) or even a more general "silk production in (place)" might be a good fit. That said, ideally Wikidata would have a separate item for all concepts mentioned in that article, notably "Thai silk" and its types mentioned at w:Thai_silk#Types_of_Thai_silk. Depending on the (place), the actual article might just cover these and not the entire process. So the sitelinks on these items might not necessarily be on a single item.
    --- Jura 08:45, 20 February 2018 (UTC)

Adding property proposal to list of proposals

Hello all, not sure if I did something wrong. I used the "create request page" link to create my property proposal. It's showing up at Wikidata:Property proposal/court but not in the actual list of proposals, and I can't seem to find any guidance on what to do. ohmyerica (talk) 02:36, 19 February 2018 (UTC)

@Ohmyerica: There is usually a large red notice to the effect of "You have not transcluded your proposal...Please do it." when you initially create your proposal--I don't know why this is not showing up. I have added it to Wikidata:Property proposal/Generic. Pasleim's bot will pick up on it around 11:15am EST and add it to Wikidata:Property proposal/Overview. Mahir256 (talk) 02:45, 19 February 2018 (UTC)
It didn't show up due to the selective deletion of several fields of the template by the creator. Sjoerd de Bruin (talk) 11:10, 19 February 2018 (UTC)

Tour not working

The first tour seems not to work. The statements tour looks fine, with a pop-up appearing.

All the best: Rich Farmbrough 11:21, 19 February 2018 (UTC).

Thanks for noticing. I just tried the first tour, it works for me, the pop-up appears (after at least 5 seconds though). Lea Lacroix (WMDE) (talk) 16:17, 19 February 2018 (UTC)

API to find all properties used for a page/list of pages?

For example, for lion (Q140), I'm trying to make the api output all of the properties under "Identifiers", i.e. "Encyclopedia of Life ID" and below. I wasn't able to find this easily in the documentation. — Tom.Reding (talk) 15:02, 19 February 2018 (UTC)

With the MediaWiki API you cannot query for only identifier statements. But you can query for all statements (action=wbgetclaims&entity=Q140) and then filter the output. --Pasleim (talk) 10:34, 20 February 2018 (UTC)
Don't know if it works for your use-case, but you could do a SPARQL query to simply get all identifier properties and then use that data as an array for filtering. --Edgars2007 (talk) 10:42, 20 February 2018 (UTC)
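The "fetch everything, then filter" approach could look roughly like this in Python. It is a sketch under assumptions: the response would come from a call such as https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q140&format=json, each main snak is assumed to carry a `datatype` field with the value `external-id` for identifiers, and the helper name `identifier_claims` as well as the sample data are invented for the example.

```python
def identifier_claims(response):
    """Keep only statements whose main snak has datatype 'external-id'.

    `response` is the parsed JSON of a wbgetclaims call.
    """
    result = {}
    for prop, statements in response.get("claims", {}).items():
        kept = [
            s for s in statements
            if s.get("mainsnak", {}).get("datatype") == "external-id"
        ]
        if kept:
            result[prop] = kept
    return result


# Hand-written sample in the shape of a wbgetclaims response; the
# property IDs and values are placeholders, not real data for Q140.
sample = {
    "claims": {
        "P_ITEM": [
            {"mainsnak": {"snaktype": "value", "datatype": "wikibase-item"}}
        ],
        "P_EXTID": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "datatype": "external-id",
                    "datavalue": {"value": "example-id", "type": "string"},
                }
            }
        ],
    }
}

identifiers = identifier_claims(sample)
```

The keys of `identifiers` are then the identifier properties used on the item, which is the list asked for in the question.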

Wikidata weekly summary #300

A couple of queries

Hi, I have been editing Wikidata for the past three to four months. I have a couple of queries at this point:

--Krishna Chaitanya Velaga (talk) 07:43, 20 February 2018 (UTC)
For second - I would say it is common knowledge (in most cases). For first - hmm, it depends :) In most cases - the most precise one. As you can see, battalion (Q6382533) has subclass of (P279)=military unit (Q176799). But this doesn't work for humans. We don't put celebrity (Q211236) as P31 value, although it has P279=human. --Edgars2007 (talk) 08:16, 20 February 2018 (UTC)

Epidemic

Hoi, is there a good example of an item for an epidemic outbreak? The impact they have can be major for developments (think the Black Death in the Middle Ages, or Zika, or AIDS). What I am seeking is not only that it is an outbreak (of what), but also a start and end date and where, the number of casualties, and the percentage that survived. Thanks, GerardM (talk) 09:58, 20 February 2018 (UTC)

Page vs page number

There has been a split of page number (Q1069725) into 'page number' ('pagina') and 'page' (as one side of a piece of paper; Page (paper) (Q49138218)). I'm not sure that all sitelinks are correctly linked to the proper items, so I would appreciate some help, especially with non-Latin languages. Wostr (talk) 13:55, 20 February 2018 (UTC)

Working in English, I see that Wikidata pagination (Q783209) is linked to Wikipedia en:Pagination. Wikidata page number (Q11325816) is linked to Wikipedia en:Page numbering. Wikidata page number (Q1069725) is not linked to any English Wikipedia article. The English Wikipedia does not have a separate article for "Page number", instead it has a redirect from "Page number" to "Page numbering". I'm not sure what the best way is to reflect this situation in Wikidata.
As for shades of meaning, page numbering could refer to the physical process of applying page numbers, or a system of page numbering (for example, starting at 1 and going to the last page in the book, vs giving the first page of chapter 1 the number 1-1, the first page of chapter 2 the number 2-1, etc.). "Page number" in contrast means the number that has been placed on a particular page. Jc3s5h (talk) 15:25, 20 February 2018 (UTC)

how the Wikidata works

Hi, I am new here and don't know what Wikidata is, but I have an interest in it. Can anyone explain how it works?  – The preceding unsigned comment was added by Mnish pal (talk • contribs) at 03:42, 21 February 2018 (UTC).

Welcome. You may want to read Wikidata:Introduction first. For any questions not found there or on pages linked from there, feel free to ask again here. --Anvilaquarius (talk) 10:16, 21 February 2018 (UTC)

Help:Pages without elements

Does anyone know any good automated way to connect pages without items to already existing Wikidata items? I have about 1000 categories in uk.wiki, which can be added to items like Category:2018 in Switzerland (Q27992176) and Category:Foreign relations of Cameroon (Q7304141) ---Andrew J.Kurbiko (talk) 03:47, 21 February 2018 (UTC).

If you give me the list and show me some patterns (like that category type can be matched to that category type for enwiki/ruwiki: "X in Switzerland" and "[the same in Ukrainian]") I can do it when I have some free time (in evening or in holidays). --Edgars2007 (talk) 06:57, 21 February 2018 (UTC)
They are all fragmented, for example 20 "X in Switzerland" and 10 "X in Asia"; that's why I need a tool. It would be impractical to edit them manually, and it would take a really long time ---Andrew J.Kurbiko (talk) 13:47, 21 February 2018 (UTC).
Quick Statements version 1 can do a list of merges, see Help:QuickStatements#Item_merging -- but be very careful that you are indeed giving it the right pairs of items, and that these are indeed appropriate to merge; unlike QS2, QS1 does not give you a preview. Jheald (talk) 10:19, 21 February 2018 (UTC)
Thanks! ---Andrew J.Kurbiko (talk) 13:47, 21 February 2018 (UTC).
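
For a list of pairs, the QuickStatements 1 input mentioned above could be generated with a small script along these lines. This is only a sketch: it assumes the tab-separated MERGE command described at Help:QuickStatements#Item_merging, and the function name and the Q-IDs in the usage note are placeholders. Since QS1 gives no preview, the script at least rejects malformed IDs and self-merges, but it cannot tell whether a pair is actually appropriate to merge.

```python
import re

# A Q-ID is "Q" followed by a positive integer without leading zeros.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def merge_commands(pairs):
    """Turn (source, target) item pairs into tab-separated MERGE lines."""
    lines = []
    for source, target in pairs:
        if not (QID_PATTERN.match(source) and QID_PATTERN.match(target)):
            raise ValueError(f"not a valid item ID pair: {source!r}, {target!r}")
        if source == target:
            raise ValueError(f"cannot merge {source} into itself")
        lines.append(f"MERGE\t{source}\t{target}")
    return "\n".join(lines)
```

For example, merge_commands([("Q111", "Q222")]) produces a single line that can be pasted into the QS1 input box; check every pair by hand before running it.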

Q49000000

another cebwiki ..
--- Jura 05:33, 21 February 2018 (UTC)

Wasn't there a plan to stop the import of these bot-created articles until we find a way to deal with this low-quality, high-volume data? They pop up faster than we can merge them.  – The preceding unsigned comment was added by Ahoerstemeier (talk • contribs) at 13:13, 21 February 2018 (UTC).
  • Maybe @GZWDer: has a plan. Personally, I somehow figured it might be easier to ignore duplicates. Most people don't seem to care that they are created, so why should I?
    --- Jura 06:39, 22 February 2018 (UTC)
    • It would be great if we could stop automatically creating items for cebwiki pages. This is a real issue for many OpenRefine users, as it degrades the quality of reconciliation results for geographical subjects. − Pintoch (talk) 12:33, 22 February 2018 (UTC)
However, there is not much of a community in cebwiki. Of the four admins besides the bot operator himself, only one has made more than four edits in 2018 so far. ceb:Espesyal:ActiveUsers lists 133 users currently (incl. bots), but the vast majority just came for a couple of edits, mostly triggered from other projects (like replacing deleted Commons images). Maybe we do want to think about an exclusion rule in WD:N, where items with only a cebwiki sitelink are to be excluded? --YMS (talk) 15:26, 23 February 2018 (UTC)
I think excluding cebwiki is the wrong approach: 1. the pages are being maintained; 2. even if we choose to exclude the sitelink, most of them still refer to an instance of a clearly identifiable entity described by reliable sources (GeoNames may not be one, but GEONET is), and are thus notable.--GZWDer (talk) 16:22, 23 February 2018 (UTC)
I don't think we should ban cebwiki sitelinks - I would just stop mass-creating items for cebwiki pages without a corresponding item. Not because the corresponding items would not be notable - just because we don't have the technical and human means to ensure that no duplicates are created. People can still create items and add cebwiki sitelinks on an individual basis. − Pintoch (talk) 10:38, 24 February 2018 (UTC)
There are two kinds of duplicates - the duplicates with already existing items which can be relatively easily found (especially since it's all geographical items), then there are the duplicates between administrative units and populated places imported from geonames, which are annoying but one can argue them as being correct. But the issue which bugs me most are the WRONG statements added. Trying to get an elevation for a hill by taking only the inaccurate geographical location using the radar data which also has some inaccuracies is problematic, importing these without any error margin and often even without the imported from (P143) severely harms our data quality. See e.g. Osterloh (Q31294773) where the difference in elevation to a real map was just 15m, I had cases where it was >200m! Ahoerstemeier (talk) 23:25, 24 February 2018 (UTC)

Q50000000

And now El Refugio (El Burro) (Q50000000), also by User:GZWDer. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:20, 23 February 2018 (UTC)

User:GZWDer has a long history of indiscriminately mass-importing articles from projects, without precautions, without any statements, which caused a lot of problems for wikisources (fr and others), see User_talk:GZWDer/2015#wikisource_mass_import and others. This user claims to be "semi-retired", but User:GZWDer (flood) is certainly not :( --Hsarrazin (talk) 16:29, 23 February 2018 (UTC)
For the ceb and sr wiki cases, I plan to import many statements to them (like this). However, for personal reasons I can only edit Wikidata in certain months each year, and I will continue this in June.--GZWDer (talk) 16:34, 23 February 2018 (UTC)

Remove Obsolete Getty Vocabularies IDs

Wikidata:WikiProject_Authority_control#Remove_Obsolete_Getty_Vocabularies_IDs: Any takers to replace old values of AAT ID (P1014), TGN ID (P1667), ULAN ID (P245) with the new values from the respective files cited there?

If you'd like to help, please comment there and ping me. Thanks! --Vladimir Alexiev (talk) 15:41, 21 February 2018 (UTC)

Bit confusing to have discussions on the normal page of the WikiProject instead of the talk page. Sjoerd de Bruin (talk) 15:44, 21 February 2018 (UTC)

python script works from the linux command prompt not from a browser

Hi,

I am new to wikidata and a novice with python.

I am running apache on a digital ocean droplet using lampp

The following script works fine when I run it from a linux terminal but when I try to run it on a browser I get an error

"Error occurred: <urlopen error unknown url type: https>"

Any help would be great

Regards

Paul

This is not an answer to your error, but to the question in general. Take a look at this thread. But do you really want to do this with Python? For running in a web browser, PHP would be more suitable, imho. --Edgars2007 (talk) 06:30, 22 February 2018 (UTC)
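
On the error itself: the original script is not shown here, so the following is only a guess. In Python's urllib, "unknown url type: https" usually means that the interpreter running the script could not load its ssl module, and a script started by the web server may well run under a different Python or library path than the one in your terminal. A small diagnostic like this (the function name is made up) can be run both from the terminal and through the web server to compare the two environments:

```python
import sys


def https_diagnostics():
    """Report which interpreter is running and whether it has SSL support."""
    info = {
        "executable": sys.executable,
        "version": sys.version.split()[0],
    }
    try:
        import ssl  # urllib can only open https:// URLs if this import works
        info["ssl"] = ssl.OPENSSL_VERSION
    except ImportError as error:
        info["ssl"] = None
        info["ssl_error"] = str(error)
    return info


if __name__ == "__main__":
    # When run as CGI, print a header line and a blank line first.
    print("Content-Type: text/plain")
    print()
    for key, value in https_diagnostics().items():
        print(f"{key}: {value}")
```

If the "ssl" entry is empty only when the script runs through the web server, the problem lies in the server's environment rather than in the Wikidata part of the script.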

cancelled films

I found Larrikins (Q22906279) which is a cancelled film from DreamWorks Animation, see [10]. instance of (P31) film project (Q18011172) seems to be suitable for this, but what about publication date (P577), maybe <no value>? Maybe we can say somehow that it was set for a February 16, 2018 release? And what about the other statements, especially series (P179)? Queryzo (talk) 08:53, 22 February 2018 (UTC)

Maybe <no value> with preferred rank, and February 16, 2018 with deprecated rank and an optional qualifier "reason":"cancelled film". --Edgars2007 (talk) 09:49, 23 February 2018 (UTC)
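
To make the suggestion above concrete, here is a rough sketch of how the two publication date (P577) statements could look in the Wikibase JSON format. It assumes that P2241 is the "reason" qualifier for deprecated statements, and it leaves the item for "cancelled film" as a parameter because no such item is named in this thread; the function name is made up.

```python
def publication_date_statements(planned_date, reason_item_id=None):
    """Build a preferred <no value> and a deprecated planned-date statement."""
    no_value = {
        "type": "statement",
        "rank": "preferred",
        "mainsnak": {"snaktype": "novalue", "property": "P577"},
    }
    planned = {
        "type": "statement",
        "rank": "deprecated",
        "mainsnak": {
            "snaktype": "value",
            "property": "P577",
            "datavalue": {
                "type": "time",
                "value": {
                    "time": f"+{planned_date}T00:00:00Z",
                    "timezone": 0,
                    "before": 0,
                    "after": 0,
                    "precision": 11,  # 11 = day precision
                    "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
                },
            },
        },
    }
    if reason_item_id is not None:
        # Assumption: P2241 is the qualifier used to explain a deprecated rank.
        planned["qualifiers"] = {
            "P2241": [
                {
                    "snaktype": "value",
                    "property": "P2241",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": reason_item_id},
                    },
                }
            ]
        }
    return [no_value, planned]
```

publication_date_statements("2018-02-16") returns both statements without the optional qualifier; whether this modelling is wanted for cancelled films is of course still up for discussion here.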

Property:P2291, and other questions about music

I asked several of these questions at Property talk:P2291, but since the only person who replied was Mbch331 and he ignored my last two pings, I'll assume he doesn't know the answers (or they don't exist) and ask the questions again here. The property currently has just 73 uses, almost or more than half of which violate its constraint values, so I assume it would be easy to change its constraints and the qualifiers' meanings in the context.

  • The only qualifiers presently allowed for this property, charted in (P2291), are series ordinal (P1545), end time (P582), and start time (P580).
    • (a) Why mandate series ordinal (P1545) (34 uses) instead of ranking (P1352) (7)?
    • (b) Many charts have a "chart date", often at a different time to the chart's actual publication date. How should this be indicated? Is point in time (P585) (7) sufficient for this, given that a chart is technically current on its chart date regardless of when the data was collected? Should a new property be created to specify this? It would make more sense to me personally to have just the chart date on data points, particularly for the Billboard Hot 100 (Q180072) since its data collection rules are unnecessarily complicated (see next point), with the chart's rules on the item for the chart. This would also make referencing for the qualifiers much more reasonable.
    • (c) What do end time (P582) (23) and start time (P580) (18) actually mean in the context? Should they define the period that the data is collected (two different periods for the Billboard Hot 100, since airplay data is Monday–Sunday and rest is Friday–Thursday – no existing schema for entering this information at present), or the period that the chart is current (i.e. from its publication time to the time of the succeeding chart's publication), or the period that the chart was the current chart according to its "chart date" (i.e. from its "chart date" to the next chart's "chart date")?
    • (d) If a song is on a chart for multiple weeks at the same position, should each week be a separate data point, or should they be combined?
  • Many charts, like the Hot 100, combine the data for the original version of a song with the remixed or covered version of a song, usually when the original artist has participated in the recording of the other version.
    • (e) How should this be indicated? Should all the data just be put on the "main" item for the song, or should it be placed on the remix version's item (one of which probably doesn't exist; the data model is largely incomplete here because Wikipedia articles usually don't exist for both concepts) if it was the one released as a single? Or should the data be placed on whichever version the chart position appears to be credited to? (The credits for Perfect (Q29051557) changed in its last week on the Hot 100, being credited to just Ed Sheeran instead of Sheeran and Beyoncé.)
    • (f) Should different versions of a song, even if released by the same artist(s) as the original version, have different items? Is the live performance of a song on a TV show or an officially-sanctioned podcast notable? What if it wasn't released officially (including fan recordings)?
  • Before mid-1998, the Hot 100 was a singles chart rather than a songs chart. I was informed by Mbch331 that a single release should have its own item, and each song on it should also have its own item.
    • (g) If a single was released in a different format/packaging/whatever in different regions, does each of those releases qualify for its own item? (On the UK Singles Chart, at least one song charted twice in the same week because it was released in four formats instead of the maximum mandated three. This suggests to me that they would.)
    • (h) Do digital singles and CD singles also qualify for their own item [if they have an entry in a database]? Does the version of a song sent to radio get its own item? The clean version? The lyric video(s) and music video(s)? The (official) version uploaded to Spotify/YouTube/SoundCloud/Dropbox/Facebook/Wikimedia Commons/the Wikimedia bug tracker (the line's somewhere there)? Is everything released on iTunes/YouTube/etc. Wikidata-notable?
    • (i) Should the data be added (in addition, or instead) to the items of the individual songs? What about the B-sides if the B-side(s) is/are the less important song(s)?
    • (j) If an album is released with e.g. a different album cover in two countries, do both of them qualify for individual items if they have different identifiers? What if the versions are identical aside from their identifiers? Are they classified as the same thing? If a random assortment of some of a set of objects is included in the release of an album, does every permutation get its own item?
    • (k) Which release of the single or album – if there are multiple items – gets the chart position data? All of them? The ones from which the chart position was calculated?
    • (l) Which item gets the sitelinks to Wikipedia?

Jc86035 (talk) 09:06, 22 February 2018 (UTC)

Since no one replied, I changed the qualifier from series ordinal (P1545) to ranking (P1352), and started a property proposal for "chart date". Jc86035 (talk) 09:36, 25 February 2018 (UTC)

Adding the Lexeme namespace to the licensing footer text

Hi everyone,

As you might know we’ve been working on adding support for lexicographical data over the past year. We are now getting close to a first version and I am tidying up the last pieces before we can get started collecting lexicographical data here on Wikidata and remix, query and reuse that data to learn more about the languages of this world. You can check out the demo system with the current state.

One of the remaining tasks is around licensing. Since the beginning of Wikidata all our structured data is released under CC-0. This has helped significantly with spreading our data widely and quickly and thereby helping us give more people more access to more knowledge. Our current licensing footer text however explicitly mentions the main and property namespaces as the places holding data under CC-0. Since lexicographical data is in a new namespace we need to adjust this text.

I am convinced it is in the best interest of Wikidata to extend CC-0 to all structured data namespaces. The reasons (in addition to my reasons for CC-0 in general):

  • We have fared very well with CC-0 so far and many partners use it as one of the main reasons they are attracted to Wikidata - both for re-use and contribution of data.
  • Having a mix of licenses is a potential legal minefield that can be exploited by some actors, threatening not only re-users but also our own contributors. It is a huge hassle for re-users, in particular small re-users like individual contributors, hobby developers, and small organizations, and will lead to less usage by these, and thus to less spreading of our knowledge.
  • It is the sound thing for data - much better explained by Luis in his blog posts (1, 2, 3, 4).
  • It will mean that we can not import some data from Wiktionary and other sources that is incompatible with CC-0 but that is already the case now. We have always leaned towards making re-use easier at the expense of easy importing. (See input from the legal team for more details on what kind of lexicographical data can be protected.)

So I would like to adjust the license text to say “All structured data (e.g. main, Property and Lexeme namespace) is available under the Creative Commons CC0 License; text in the other namespaces is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.”.

Cheers --Lydia Pintscher (WMDE) (talk) 09:34, 22 February 2018 (UTC)

  • Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:23, 22 February 2018 (UTC)
  • Of course, the adjustment has my full and Strong support, since I never doubted for one moment that this is the correct approach. Sannita - not just another it.wiki sysop 10:35, 22 February 2018 (UTC)
  • Strong support Linking data with a mix of licenses is just a Gordian knot. --Andrawaag (talk) 11:50, 22 February 2018 (UTC)
  • Question What's the plan for textual definitions?
    https://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L15#Senses doesn't show any but the substance of https://de.wiktionary.org/wiki/Leiter seems to be these.
    Similarly https://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L13 compared to https://en.wiktionary.org/wiki/hard#English
    Would they have to remain at Wiktionary, be reduced to statements as on the prototype, added to another namespace or need to be re-written?
    --- Jura 11:53, 22 February 2018 (UTC)
    • You can see some of the definitions on the demo system (where it says "Führungsperson" for example). They would be CC-0 too. Just like with Wikipedia now we will build the infrastructure and collect data here that the Wiktionary editors are free to use as they deem useful for their work. Hope that helps. --Lydia Pintscher (WMDE) (talk) 07:19, 23 February 2018 (UTC)
      • Somehow I doubt there is much room for Wiktionary.org to make use of Wikidata's Wiktionary namespace to annotate its content further. Are there any plans for a structured way to enable this? Maybe a separate Wikibase(Wikidata) instance as for Commons?
        --- Jura 14:26, 24 February 2018 (UTC)
  • The choice of a permissive license is unfortunate but not entirely surprising, given the big corporate players that funded the creation of Wikidata. It's a departure from the early ideals of Wikimedia projects, of creating content that will be always free down the line. Now there's no point debating this, since it would make no sense to make the namespaces have incompatible licenses. The real discussion with the community should've been carried out much, much earlier. NMaia (talk) 12:13, 22 February 2018 (UTC)
    This is fundamentally inappropriate, and most of all repetitive: we've been through this time and time again, of course there's no way to convince that no "big corporate players" were involved in the discussion and that it was a community decision, if you're convinced otherwise. Source: I was there when we discussed it. --Sannita - not just another it.wiki sysop 23:53, 22 February 2018 (UTC)
    Interesting, can you provide details when and where this "community decision" was made "offline"?
    --- Jura 07:00, 23 February 2018 (UTC)
    It wasn't made "offline" - and this is a final notice: please, do NOT put in my mouth words I've never spoken - it was made in the mailing list of Wikidata, while the project was still in beta. The first discussion was made in April 2012, then another in August 2012, and these are the first two discussions I can find just by casually browsing the ML archives. Check them out yourself if you don't believe me, I've got work to do, and frankly I'm tired of repeating the same things all over again. --Sannita - not just another it.wiki sysop 09:05, 23 February 2018 (UTC)
    I thought this was somehow related to Wiktionary, but it's about Wikidata in general. I took your "I was there" literally.
    --- Jura 20:53, 23 February 2018 (UTC)
  • Strong support CC-0 has been a key of Wikidata's success. Mixing it with less-free licenses will create significant hurdles for on- and off-project users. --Magnus Manske (talk) 13:25, 22 February 2018 (UTC)
  • Support ArthurPSmith (talk) 15:59, 22 February 2018 (UTC)
  • Neutral I agree that using CC-0 for all data makes sense. On the other hand, I believe that using such a licence will not allow importing a lot of interesting stuff from the Wiktionaries. Pamputt (talk) 18:13, 22 February 2018 (UTC)
    • Yeah but in the long run I believe that is the better trade-off to make. We've made the same trade-off for the data in items. --Lydia Pintscher (WMDE) (talk) 07:23, 23 February 2018 (UTC)
  • Support obviously. VIGNERON (talk) 19:59, 22 February 2018 (UTC)
  • I also have the same question as Jura, since senses seem like they'd be derived from existing Wiktionary definitions. Mahir256 (talk) 21:30, 22 February 2018 (UTC)
  • Support This is indeed a major development. John Samuel 23:19, 22 February 2018 (UTC)
  • Support It would be crazy to start mixing licenses now, good to clear this up right away. I9606 (talk) 03:14, 23 February 2018 (UTC)
  • Support It might make it impossible to import data from Wiktionary, but in the long term it is better for reuse. I too am very attached to keeping data open, but Wikimedia has reached a stage where the embrace, extend and extinguish (Q1335089) strategy would not work against us anymore, so better make the data as open as possible, which means CC0. Syced (talk) 06:21, 23 February 2018 (UTC)
  • Oppose I don't think of any lexicographer or linguist who may accept to publish under CC0 a work they spent five to twenty years on. CC0 does not respect the time spent on collecting words and meanings, structuring the language for a dictionary, and editing. CC0 favors companies that will just use the data without intending to spread the knowledge; it will not reinforce free reuse but only the stealing of data. Finally, I think this decision concerns Wiktionarians and deserves a better explanation of the problem, one that includes the pros and the cons. At this point, I still consider that you are making a fork of Wiktionary in Wikidata with your own agenda. -- Noé (talk) 07:54, 23 February 2018 (UTC)
    • Who is being asked to "publish under CC0 a work they spent five to twenty years on"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:31, 23 February 2018 (UTC)
    • How does CC0 not respect the time spent? I think we are confusing the legal license with the provenance. If the data provider also provides the proper references and qualifiers, it is respecting the work spent, don't you think? Yes, people can use Wikidata content without pointing to this provenance, but what is its value if they can't support those claims with original sources? Also, a lot of work is made possible with public funds, so not sharing those results under a public license is quite unfair. I agree with Andy, nobody is pushing people to share their knowledge under CC0; if you don't like it you don't need to. But for those who would like to share knowledge publicly, CC0 provides the means. Different scientific resources did make the change to share the knowledge with the general public eg: example --Andrawaag (talk) 12:04, 23 February 2018 (UTC)
      Andy: Well, you're right, lexicographical data in Wiktionary could be written only by individuals and never by big imports from published sources. Good luck to start again from scratch.
      CC0 does not respect the time spent because it does not force reusers to mention the source of information. If references are provided, distributing it under CC BY or under CC0 amounts to the same thing. Public funds = sharing with a public licence, I agree, and CC BY-SA is also a public licence, lucky us. I pointed out that I am quite sure a CC BY-SA licence may create a better environment for including recent works in full, directly given by their authors. You may not agree, but no study was provided for or against this, and I think a proper analysis and outlook have to be made before such a vote. Noé (talk) 12:32, 23 February 2018 (UTC)
      Instead of rhetoric, please answer my question; "Who is being asked to 'publish under CC0 a work they spent five to twenty years on'"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:50, 23 February 2018 (UTC)
      It was not rhetoric, maybe vague broken English by a non-native speaker, but not a manipulation through figures of style. I am not that mean. So, you cut half of my sentence. I was assuming a lexicographical project would like to have lexicographers participate (people who have already made dictionaries, or linguists who have done some lexicographical work and have already thought about dictionary issues), and I wrote that CC0 will not convince this kind of profile to share data. But I was probably wrong in assuming such a goal for this project. The more I read about this project, the more I realize it is grounded not on lexicographers' needs and knowledge, nor on Wiktionarians' needs and knowledge, but on Wikidatians' needs and a vague idea of linguistic and lexicographic practices and difficulties. Noé (talk) 14:19, 23 February 2018 (UTC)
      @Noé: So in your opinion, we should re-license Wikidata as WTFPL (Q152481), which is a little better than CC0 for Public Domain software usages, but that option is not recommended by the Free Software Foundation (Q48413) (cf. https://www.gnu.org/licenses/license-list.en.html#WTFPL). --Liuxinyu970226 (talk) 15:18, 23 February 2018 (UTC)
      I was not postulating anything for Wikidata in general, my messages were about the namespace for lexicographical data. As I understand it, WTFPL is made for software, not for data, so I don't get your point here. Noé (talk) 15:35, 23 February 2018 (UTC)
      Your English is clear. You said that you: "don't think of any lexicographer or linguist who may accept to publish under CC0 a work they spent five to twenty years on"; and I was asking you for evidence that anyone is being asked to do that. It now seems that you concede that no-one is. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:45, 23 February 2018 (UTC)
  • Question If Wikidata data are in CC0 and a Wiktionary wishes to include some, is this copyfraud? Noé (talk) 08:04, 23 February 2018 (UTC)
    • No that is fine. --Lydia Pintscher (WMDE) (talk) 08:09, 23 February 2018 (UTC)
      • No, it's not fine. What Wikidata community is already doing is massive copyfraud. I found a jurist specialized in free licenses and so far she confirmed that this doesn't seem legal at all. --Psychoslave (talk) 09:24, 23 February 2018 (UTC)
        • Psychoslave, you misunderstood the question (or maybe Noé didn't ask what he meant to ask). The question here is the reuse of Wikidata data outside Wikidata, so in this case the re-user is responsible; there is no way the Wikidata community can do copyfraud in this scenario. I'm guessing you are thinking of the import of data from an external source into Wikidata (here copyfraud by the Wikidata community is technically possible), but this is a different subject and one that has been raised multiple times already and even answered with some professional legal advice. Cdlt, VIGNERON (talk) 18:56, 23 February 2018 (UTC)
    • Lydia, can you provide evidence for your assumption? Noé (talk) 12:32, 23 February 2018 (UTC)
      • CC0 license states You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. So using CC0 data on Wiktionary is fine. --Jarekt (talk) 17:00, 23 February 2018 (UTC)
  • Strong oppose To no surprise for those who followed my research on the topic, I strongly oppose this. I will give more feedback below as soon as I find time for this. --Psychoslave (talk) 09:24, 23 February 2018 (UTC)
  • Neutral A database of lexical information lacking definitions will be quite bland. So what will happen? Data will be imported from CC0-compatible sources or reworded, probably starting with definitions from WordNet and out-of-copyright dictionaries. Sounds like a reboot of Wiktionary in a way, this time with a more permissive license. – Jberkel (talk) 09:42, 23 February 2018 (UTC)
  • Support - CC0 FTW! Wittylama (talk)
  • Support --Jarekt (talk) 17:00, 23 February 2018 (UTC)
  • Comment Why is this discussion happening here? Surely the lexemes aren't going to be managed by the Wikidata community -- they are for the Wiktionary community to administer, and will be subject to that community's policies on content and every other aspect -- just as the upcoming CommonsData wikibase will be administered by Commons, not by us. That's one of the points for them being federated wikibases. This is not a decision for us to make. It is up to that community to choose how they wish to licence their work. My view therefore is we have no standing here; this is not our choice to make. This discussion is therefore not appropriate and should be closed, and/or re-started in a more appropriate forum. Jheald (talk) 18:07, 23 February 2018 (UTC)
    • The data is going to be here on Wikidata in a new namespace. It is the license of the content on Wikidata. --Lydia Pintscher (WMDE) (talk) 18:12, 23 February 2018 (UTC)
      • But do we think we, the Wikidata community, are going to be the ones administering it, making day-to-day rules and guidelines for its content and organisation? Or the Wiktionary community? Far better, it seems to me, if the Wiktionary community felt that they were the owners of these items. Jheald (talk) 18:45, 23 February 2018 (UTC)
  • Support --Pymouss (talk) 20:27, 23 February 2018 (UTC)
  • Comment It would be good to know why the alternative approach (with the same model) was rejected. Please see my question/comment at: Wikidata_talk:Lexicographical_data#Separate_installation_for_Wiktionary_?.
    --- Jura 20:53, 23 February 2018 (UTC)
  • Oppose. This is exactly what we feared all along over at Wiktionary: that Wikidata would start handling lexicographical data without even bothering to consult the people who already create and manage lexicographical data on Wikimedia every day. Given the licensing situation and the glaring lack of communication, two parallel projects are going to work on the same problems, but separately. It should concern everyone here that out of the only usernames I recognise as active in any of the Wiktionaries, none of them have voted Support. Metaknowledge (talk) 22:07, 23 February 2018 (UTC)
    • The question here is about Wikidata's licence terms, and only that. Other than that a more liberal licence gives Wiktionary greater freedom to reuse material from Wikidata, that decision has no bearing on Wiktionary. There appears to have been plenty of prior consultation on the wider issues; not least by Léa (in English), (and in French) over the last year and a half. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:50, 23 February 2018 (UTC)
  • Weak oppose. I would prefer that structured data be CC-0 and free-text data be CC-By-SA in general. The Lexeme namespace is likely to contain a lot of free text (definitions and example sentences) which will fit better with CC-By-SA than CC-0, though I agree that the linkages between different Lexemes and Items should stay in CC-0 to avoid database rights disputes. Deryck Chan (talk) 11:51, 24 February 2018 (UTC)
  • Oppose Because it's not interesting for people. The thing you are creating doesn't reflect the reality of languages, of linguistic studies and of community needs... Lyokoï (talk) 18:38, 24 February 2018 (UTC)
    • What are you opposing? Like others above, you seem to be answering a different question to the one asked. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 24 February 2018 (UTC)
  • Support using CC-0 for lexicographical Wikidata data (and only importing CC-0-eligible data). Jc86035 (talk) 10:16, 25 February 2018 (UTC)

Discussion elsewhere

Please be aware of wikt:Wiktionary:Beer parlour#Wikidata and CC0 licence for lexicographical data. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:28, 23 February 2018 (UTC)

API: action=compare

With action=compare, one can retrieve diffs via the API. Does anyone know how to create the diff of an item creation, i.e. when there is no parent revision id to hand over to the fromrev parameter? A value of fromrev=0 does not work. For Wikipedia, an empty fromtext= parameter instead of fromrev does the job (example), but that does not work for Wikidata items (example with error). I suspect that I have to provide the parameters fromcontentformat=, fromcontentmodel=wikibase-item, and maybe frompst as well (example), but that does not work either and I am running out of ideas. Unfortunately, action=compare has seen little use at Wikidata so far, so I can't read other people's code… Thanks for help and input, MisterSynergy (talk) 09:35, 22 February 2018 (UTC)
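For anyone who wants to reproduce the variants described above, here is a minimal Python sketch that only builds the request URLs; it does not solve the problem. The parameter names are the ones mentioned in the question. The helper name compare_url and the revision id in the usage note are made up for illustration, and using application/json as the content format for items is an assumption.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def compare_url(torev, fromrev=None, fromtext="", contentmodel=None, contentformat=None):
    """Build an action=compare URL (hypothetical helper, no request is sent).

    If fromrev is given, two revisions are compared. Otherwise fromtext is
    used as the "from" side, optionally with an explicit content model and
    content format, as tried in the question above.
    """
    params = {"action": "compare", "format": "json", "torev": torev}
    if fromrev is not None:
        params["fromrev"] = fromrev
    else:
        params["fromtext"] = fromtext
        if contentmodel is not None:
            params["fromcontentmodel"] = contentmodel
        if contentformat is not None:
            params["fromcontentformat"] = contentformat
    return API + "?" + urlencode(params)
```

For example, compare_url(123456, contentmodel="wikibase-item", contentformat="application/json") gives the "empty fromtext plus content model" variant for a placeholder revision id, which makes it easy to paste the exact failing request into a bug report.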

You can give feedback about the Weekly Summary

Hello all,

If you read the weekly Wikidata newsletter, written by volunteers and the Wikidata team, we would be very interested in your feedback. The newsletter has reached its 300th edition and we are wondering how, together, we could improve it and make it even more useful for the community.

You can find a page with a few questions. You don't have to answer all of them.

If you're not reading the Weekly Summary, or used to but stopped reading it, there is a section just for you!

And as a reminder, the Weekly Summary is collaborative and you can add information for the next edition on this page. Your participation would be much appreciated :)

Thanks in advance for your help, Lea Lacroix (WMDE) (talk) 12:46, 22 February 2018 (UTC)

Is it possible to see the complete history of an individual statement from an Item?

You can get a full record of all the changes made to a Wikidata item by simply clicking "View history". From that history, I guess it is possible to flesh out the full history of an individual statement on that item. However, doing this for a batch of many statements quickly becomes impractical as the number of statements grows. I have looked through the API calls (https://www.wikidata.org/w/api.php), but couldn't find a usable call for this. My guess is that it should be possible, given that it is tracked in the full history of an item. Long question short: does anyone have an idea how to track changes to an individual statement of a Wikidata item? When was it created, how often did it change, by whom and how?

--Andrawaag (talk) 13:43, 22 February 2018 (UTC)

No way other than querying change by change and comparing them. Matěj Suchánek (talk) 09:58, 24 February 2018 (UTC)
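The "change by change" comparison can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the revisions are assumed to have been fetched already and passed in oldest first, each as a dict with the made-up keys revid, user, timestamp and entity (the item's JSON at that revision). The function names are hypothetical. The statement is followed by its GUID, i.e. the "id" field of the statement in the entity JSON.

```python
def find_statement(entity, guid):
    """Return the statement with the given GUID, or None if it is absent."""
    # Guard against "claims" being missing or empty.
    claims = entity.get("claims") or {}
    for statements in claims.values():
        for statement in statements:
            if statement.get("id") == guid:
                return statement
    return None


def statement_history(revisions, guid):
    """List the revisions in which one statement was created, changed or removed.

    `revisions` must be ordered oldest first. A statement that is removed
    and later restored shows up as "created" a second time.
    """
    events = []
    previous = None
    for rev in revisions:
        current = find_statement(rev["entity"], guid)
        if current != previous:
            if previous is None:
                kind = "created"
            elif current is None:
                kind = "removed"
            else:
                kind = "changed"
            events.append({
                "kind": kind,
                "revid": rev["revid"],
                "user": rev["user"],
                "timestamp": rev["timestamp"],
            })
        previous = current
    return events
```

The result answers "when was it created, how often did it change, and by whom", but not "how"; for that one would still have to diff the two versions of the statement at each "changed" event. For a batch of statements on the same item, the same revision list can be reused for every GUID, so the item history only needs to be downloaded once.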

town clerk

Are municipal notary (Q1047879) and municipal clerk (Q883211) the same thing for a merge? It seems that it is the same local government official in different languages. Where I live the town clerk is licensed to be a notary public. --RAN (talk) 16:23, 22 February 2018 (UTC)

What about notary (Q189010)? I'd merge this one with municipal notary (Q1047879), at least on first sight. Grüße vom Sänger ♫ (talk) 17:47, 22 February 2018 (UTC)
In the Netherlands, a notary is an actual academic degree and a salaried position. In the USA any secretarial position can become a notary through passing a test (like for a driver's license). I don't think you can compare these across country borders and the items should probably be per jurisdiction. Jane023 (talk) 17:54, 22 February 2018 (UTC)
For notary (Q189010), the English Wikipedia article it's linked to is an umbrella term that covers both the US-style notaries public with limited training and authority, as well as notaries in other countries with training comparable to attorneys. The item municipal notary (Q1047879) has no English Wikipedia article linked to it, and the English description, "clerk of the competent local government", and the also-known-as "town clerk" should not be applied to "notary". Jc3s5h (talk) 18:04, 22 February 2018 (UTC)
Certainly some merging seems to be in order; there are many items found when searching for "notary" or "notary public", and some of these seem to be suitable for merging. But municipal clerk (Q883211) should not be merged with any of the notary or notary public items. In some jurisdictions, notaries have vastly different education and qualifications than a municipal clerk, and it is unlikely the same person would fill both roles. In some places, like where I live, all town or city clerks are notaries public, but the vast majority of notaries public are not town or city clerks. Jc3s5h (talk) 17:56, 22 February 2018 (UTC)
Well I see the Dutch wiki article for "Notaris" now links to notary (Q189010) and I can see at a glance that the interwiki to the English article is incorrect. This probably is true for a multitude of professions that have been carried over in different ways by different countries over time. Sorry I have no time to look into this and help clean up though. Jane023 (talk) 18:05, 22 February 2018 (UTC)
  • It seems that notary public (Q15479268) (a licensed position) and notary (Q189010) (an historical position) are very similar and perhaps the Wikipedia articles should be merged. It seems that municipal notary (Q1047879) and municipal clerk (Q883211) are the same. The links to Wikipedia articles need to be sorted out, the problem is the wording "public". We use it to mean "civil position that serves the public" as in "notary public" and we use it as "public office" meaning an "appointed or elected political position". I changed "clerk" and "notary" to "municipal clerk" and "municipal notary" to distinguish the political offices. I think they can be merged, there is little overlap in the languages links to the Wikipedias and the ones that overlap are meant for "notary public", the civil position. --RAN (talk) 19:22, 22 February 2018 (UTC)
  • This is the wrong place to discuss merging Wikipedia articles, no matter which of the several Wikipedias for the various languages is being referred to. The Wikipedias write whatever articles they want to, and Wikidata links to them as best it can.
In English, there are three notary-related articles that cover large parts of the world: notary (Q189010) is any kind of notary who deals with legal papers. notary (Q189010) is an umbrella term for two kinds of notaries, notary public (Q15479268), the type of notary prevalent in most of the US and much of Canada, and civil law notary (Q23838068) who are prevalent in continental Europe and countries that derive their legal traditions from continental Europe. None of these terms are historical terms; they all apply to notaries active today.
All three of these articles discuss notaries who are installed and recognized by the government. The term "public" refers to the fact that all these notaries are awarded their positions by the government and the government accords extra recognition to their acts, beyond the acts of an ordinary private person. Sometimes the presence or absence of the word "public" is used as a shorthand to distinguish American-style notaries from continental-style notaries, but they are all recognized by their governments. A good contrast in the US would be a notary of the Roman Catholic Church (en:Notary (canon law) in English Wikipedia; there are no language links so apparently there is no corresponding Wikidata item). Such a notary's acts would only be recognized by the Church and the government would not give any special recognition to the acts of a Church notary.
"Municipal clerk" is a pretty good term, but "municipal notary" is not. In most of the US notaries are appointed by the state (e.g. California), not by a city or town.
By the way, I am a notary public in US State of Vermont, and was appointed by the assistant judges of my county. Jc3s5h (talk) 21:18, 22 February 2018 (UTC)

Merger request

Can anyone please merge falling and rising factorial (Q2339261) with Pochhammer symbol (Q132335)? Thanks. --Mhhossein (talk) 20:09, 23 February 2018 (UTC)

There are conflicting sitelinks. Sjoerd de Bruin (talk) 20:35, 23 February 2018 (UTC)
Pochhammer symbol (Q132335) is, as far as I understand it, a special version of falling and rising factorial (Q2339261), see here: de:Fallende_und_steigende_Faktorielle#Verallgemeinerung Grüße vom Sänger ♫ (talk) 20:52, 23 February 2018 (UTC)

Old RFBOTs

I was looking at WD:RFBOT, and noticed that there's a large number of requests that have been inactive for months or even years. There was one three-month-old RFBOT that didn't even have any contents. I used common sense and closed that one on my own, but I'm hesitant to deal with the other ones before bringing it up here. Since these old RFBOTs just clutter things up, I propose the following:

  • An RFBOT may be procedurally closed by any user if any of the following conditions occurs:
    1. It has no meaningful contents and is more than 48 hours old.
    2. No user has edited the page in more than six months.
    3. The bot's operator (or one of its operators if there are multiple) has not edited the page in more than a year.
  • Any RFBOT thus closed may be reopened by the operator at any point.

Or we could be more casual about it and just close any RFBOT that seems "too old". Thoughts? — PinkAmpers&(Je vous invite à me parler) 23:13, 23 February 2018 (UTC)

Your proposal looks sensible. I am doing the same for property proposals and I have the feeling that common sense works reasonably well - I do not feel the need for particular new policies. As you said requests can be reopened afterwards so it's not like you are doing something irreversible. My own goal is just to make sure property proposals look a bit more welcoming: newcomers should not be daunted by a backlog of half-supported proposals gathering dust. I think we just need more people volunteering for these bureaucratic tasks rather than more policies about them. I really like the fact that wiki-lawyering is basically non-existent in Wikidata: let's keep it like that! − Pintoch (talk) 10:11, 24 February 2018 (UTC)
I cleaned up these kinds of pages in the past. I usually leave a note about the lack of activity, saying that the request will be closed in a week unless it becomes active (including a ping for the proposer). The next weekend you can close all the requests that didn't get any replies. No need for more bureaucracy. Be nice, treat the proposers the way you would like to be treated. Multichill (talk) 13:58, 24 February 2018 (UTC)

Import from China Biographical Database (Q13407958)

I'm doing some bulk addition of basic biographical data from this database, to enrich our entries about historical figures of China. There's a project page at Wikidata:WikiProject East Asia/China Biographical Database import. I'm just notifying the community here in case anyone else is using this database (which is already mostly reconciled via Mix'n'Match).

Ontological issue: we often use the same item for a "Chinese dynasty" in the sense of a state within the Imperial era of China and in the sense of the family of rulers of those states. See for example Yuan dynasty (Q7313) or any item which is instance of (P31) both Chinese dynasty (Q12857432) and historical Chinese state (Q50068795). For an encyclopedia article, it makes sense to combine the history of the country with that of the people who ruled that country, and for my purposes I'm not encountering any problems. Still, maybe it would make more sense to separately represent the family and the state? Just raising the question in case someone more ontologically-inclined wants to examine it. MartinPoulter (talk) 14:03, 24 February 2018 (UTC)

I think Yuan dynasty (Q7313) is just an empire (state) and cannot refer to the imperial family in that era. --Okkn (talk) 15:11, 24 February 2018 (UTC)
Thanks User:Okkn. If that's the case, then please go ahead and edit the item to remove the incorrect statements. Cheers, MartinPoulter (talk) 17:18, 24 February 2018 (UTC)
@MartinPoulter: Do you need items about "family"?  --Okkn (talk) 19:53, 24 February 2018 (UTC)
@Fantasticfears:, as the original importer of the CBDB data. Mahir256 (talk) 01:36, 25 February 2018 (UTC)

Help with enabling wikidata support for a widely used enwiki template

Hey there. I've been adding wikidata support to en:Template:Infobox power station at snail's pace over the past years. And as I move further, the wikidata coding is becoming a little too complicated for me. Is anyone willing to work with me so as to help convert the template to support wikidata?

The template supports all types of power stations. So wind farm articles only transclude parameters pertaining to wind farms, whereas nuclear plant articles transclude parameters pertaining to nuclear power stations, and so on. Certain parameters are obviously easy to add (e.g. name, country). The issue comes up with more complicated parameters, such as nameplate capacity in megawatts, and parameters that may need the creation of new wikidata properties (e.g. nuclear plant cooling towers).

Looking forward to getting this done once and for all. :-) Rehman 15:15, 24 February 2018 (UTC)

@Mike Peel, RexxS: who are good at this. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:30, 24 February 2018 (UTC)
I would actually wait for the time being, since it is likely (or at least not unthinkable) that an RfC would result in a total prohibition of Wikidata in the English Wikipedia.--Ymblanter (talk) 20:09, 24 February 2018 (UTC)
I wouldn’t. If the data quality is good enough for Wikipedia, advance it to actual use. From the “short description” episode of the past months I wouldn’t say that there is robust opposition to Wikidata use in English Wikipedia. Btw. WMDE is currently working on a solution that enables Wikidata editing directly from Wikipedia infoboxes in the Visual Editor, see File:Client editing prototype walkthrough.webm; this might be worth considering when Wikidata is included in an infobox, although it will probably take a couple of months until this functionality arrives… —MisterSynergy (talk) 20:25, 24 February 2018 (UTC)
There are already a number of infoboxes in en.Wikipedia taking some, or all, of their data from Wikidata. Indeed, these seem to be most successful for technical subjects, such as the one under discussion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:58, 24 February 2018 (UTC)
Well, the outcome of the Wikidata citation RfC was near-unanimous, with most users not actually listening to any arguments. As long as the vandalism problem has not been solved, vandal edits on Wikidata will be reflected in the template, and it will be pretty easy for the anti-Wikidata brigade to argue that all edits must be rolled back. This is what happened with the World Heritage infobox, and I cannot say that the concerns are completely unjustified.--Ymblanter (talk) 21:01, 24 February 2018 (UTC)

TED profile

How do I correctly add the TED profile for Ayana Elizabeth Johnson? Thanks, GerardM (talk) 15:32, 24 February 2018 (UTC)

described at URL (P973). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:28, 24 February 2018 (UTC)

Wikinews categories

Our current practice is to connect article sitelinks to one item (for example Senegal (Q1041)) and category sitelinks to another item (for example Category:Senegal (Q6975863)). The Wikinews people keep moving the sitelinks to Wikinews categories from the category item to the article item. This messes up our data structure here on Wikidata and I don't think we have consensus on this project that we want this. We only tolerated it in the past because without arbitrary access they had no other way to show links to Wikipedia articles on Wikinews categories. With a template the Wikinews people can show whatever links they want on their categories without needing to move links around here. It's just a matter of copying Commons:Template:Interwiki from Wikidata and Commons:Module:Interwiki over to Wikinews and updating them to suit their needs. Let's get this sorted out. Multichill (talk) 15:55, 24 February 2018 (UTC)

@Multichill: What's the position if there otherwise is no category-item here? Are Wikinews people forced to create one to match their category, or (like Commons) are they fine to link to the article-item in such a circumstance?
Also, what is the harm in systematically linking from article-item here to a category there? What is the benefit in preventing such links? Wikinews articles are all designated instance of (P31) Wikinews article (Q17633526), so a regular item here is not going to be linked to both a category there and to a news article. If there is a story, the news article will have an item of its own. There is no chance of a collision. Why is there therefore any advantage in their not linking a subject to a regular item here?
We have to use Commons:Template:Interwiki from Wikidata and Commons:Module:Interwiki on Commons because there are sometimes gallery pages there. But there are (I think?) no equivalents of gallery pages on Wikinews. So why add this clunky indirection, when a regular sitelink would do the job just as well?
There is also a difference with Commons, in that if there is a Commons category (P373) statement on an item, then most connected Wikipedias will directly show a sitelink from their article to the Commons category. But, as far as I am aware, no equivalent mechanism is in place for Wikinews, so if there is no sitelink from the article-item, then there will be no sitelink to Wikinews at all shown on the Wikipedia item.
That to me makes it entirely understandable that Wikinews editors would seek to link from article-items to their subject categories. I don't see any particular good reason to stop them. Jheald (talk) 16:41, 24 February 2018 (UTC)
Create a category item just like for Wikipedia. What I'm saying is not something new, I'm just getting rid of an exception that has grown. Exceptions are an indication the data modeling is wrong. Wikinews makes our data inconsistent. Part of the categories are like Category:Royal Air Force (Q7404780) and links keep getting moved around. If Wikipedias want to link to Wikinews they can still do it. Multichill (talk) 18:38, 24 February 2018 (UTC)
Or we could just say: if article-items are systematically a better sitelink for these pages, then go for it. For all of them. Site-wide. What is the downside?
And you didn't answer my first question: What is the position if there otherwise is no category-item here? Are Wikinews people forced to create one, or (like Commons) are they fine to link to the article-item in such a circumstance? What does that serve, other than create a redundant item that links to nothing and has no meaningful statements on it? Jheald (talk) 19:46, 24 February 2018 (UTC)

Wikidata:Requests for comment/Privacy and Living People

There has been a request for comment about developing a policy about privacy and items about living people. Your participation is encouraged above. --MediaWiki message delivery (talk) 20:48, 24 February 2018 (UTC) (from Rschen7754)