Wikidata:Project chat

Wikidata project chat
A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/10.

Noobie Questions

Now I'm curious about this Wikidata thing. I found something needing merging, then came back about "The Dog" ( https://www.wikidata.org/wiki/Q28136699 ). Seeing that no one had touched it, I looked into it. The information does not overlap. Then I wondered why I'm unable to add more details to that article.

For example, The Dog has two co-directors and I can't even add one, much less worry about duality.

I find it strange that I can add "terrorism" as one of the main subject qualifiers but I can't add "theft" "robbery" "bank" etc. (I pulled up Iron Man (2008 film) to use as a template/reference for adding details.)

Turns out I can still add a lot in "depicts" and "main subject". Is there a difference? I added much of the same to both.

I stumbled upon a glitch: Adding the film premiere info (URL, title, date, retrieved date, author), all was fine until I tried to add unknown authors, then it would freeze and no longer "publish" - even when I removed the 'author' part. The first time I'd filled out all the things without publish-saving. The second time I'd filled out each item and published one at a time - until I had to refresh Firefox. Similarly, I discovered you have to add the co-directors names and some details before you can even reference them. Empty non-links are not allowed yet people stubs are. Weird.

I also wanted to add the link to my original article https://infogalactic.com/info/The_Dog_(2013_film) (almost identical because I wrote them at the same time - but IG has an image that was removed on WP (who censor everything)).

I have no dog in maintaining "The Dog" or its unimportant article, but I thought it'd be a good enough place to learn about Wikidata.

I kinda see the potential but there's sooooooo far to go - and then there's the faaaaaaar bigger problem of the corporatocracy censorship, distortion of events, evasive truth, and intolerance of anti-establishment ideas and authentic freedom. Any A.I. that may use this is already crippled by all the corrupted systems, including legacy media, the limited and limiting so-called source of "facts" allowed on Wikipedia, etc.

Suggestions for good tutorial videos or a short overview would be greatly appreciated. ~ JasonCarswell (talk)

Bad birthdays

Google's Knowledge Graph has informed me that en:Laura Vaccaro Seeger, Q17147769, was born at 00:00:00 on 1 January 1900, which is obviously an error as the world's oldest person was born in 1903. (User:GreenMeansGo has deleted her birth date from the record here). It turns out that this statement was added by Reinheitsgebot four months ago, drawing on her VIAF information, although her VIAF page doesn't have a birth date. I suspect that the bot misinterpreted the complete lack of data as an indication that she was born at 00:00:00 on 00-01-01.

Could someone query the database to find everyone who was allegedly born on 1900-01-01, and then analyse the results to see how many of these dates were added by this bot? I'm just guessing that if there are many of them, most of them will be errors. Nyttend backup (talk) 19:29, 3 October 2018 (UTC)
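A query along these lines could be a starting point. This is only a sketch: the helper name `build_birthdate_query` and its parameters are made up for illustration, and the query assumes the prefixes that the Wikidata Query Service predefines (`p:`, `psv:`, `wikibase:`, `bd:`). It lists items with a date of birth (P569) equal to a given day; it cannot tell who added the statement, which would still have to be checked in each item's history.

```python
def build_birthdate_query(date_iso="1900-01-01", precision=None, limit=1000):
    """Return a SPARQL query for items whose date of birth (P569) equals date_iso.

    precision is optional; in the Wikibase RDF model 9 means year and 11 means day.
    """
    precision_clause = (
        f"  ?node wikibase:timePrecision {int(precision)} .\n"
        if precision is not None
        else ""
    )
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        "  ?item p:P569/psv:P569 ?node .\n"
        f'  ?node wikibase:timeValue "{date_iso}T00:00:00Z"^^xsd:dateTime .\n'
        f"{precision_clause}"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Paste the printed text into the query service to run it.
    print(build_birthdate_query(precision=11))
```

Leaving `precision` unset also matches statements that only claim the year 1900, since those are normalised to 1 January in the RDF export; restricting to day precision narrows the list to dates that pretend to be exact.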

<ns1:birthDate>1950</ns1:birthDate>
<ns1:deathDate>0</ns1:deathDate>
<ns1:dateType>flourished</ns1:dateType>

Date types can be 'lived', 'circa', or 'flourished'. To quote https://github.com/OCLC-Developer-Network/viaf-dates/blob/master/README.md:

'lived' says the dates are birth and death dates and should be accurate += 3 years. 'circa' says the dates are a guess and should be given a 10 year error of margin. 'flourished' are more likely to be dates the person worked or a century date and are given 100 years leeway.

In this case work period (start) (P2031) is a more appropriate property than date of birth (P569) (diff). But even when VIAF states dateType=lived, it can still be incorrect or indicate something else – e.g. https://viaf.org/viaf/72927896/viaf.xml for Ebenezer Hewlett (Q18671012):

<ns1:birthDate>1737</ns1:birthDate>
<ns1:deathDate>1747</ns1:deathDate>
<ns1:dateType>lived</ns1:dateType>

One more example – Mahākāśyapa (Q335304) did not die at age 1 (diff) https://viaf.org/viaf/62449721/viaf.xml:

<ns1:birthDate>-550</ns1:birthDate>
<ns1:deathDate>-549</ns1:deathDate>
<ns1:dateType>flourished</ns1:dateType>

Please see this article as well: Parsing and Matching Dates in VIAF. Santer (talk) 14:11, 11 October 2018 (UTC)
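The three fragments above could be interpreted roughly as follows. This is a sketch under assumptions, not what any bot actually does: the namespace URI bound to `ns1`, the reading of `0` as "no data", and the 15-year lifespan threshold are all assumptions made for illustration, as are the function names. The leeway values are the ones quoted from the README, and the property choice follows the suggestion that work period (start) (P2031) fits 'flourished' better than date of birth (P569).

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the ns1 prefix; verify against a real VIAF record.
VIAF_NS = "http://viaf.org/viaf/terms#"

# Leeway in years per dateType, as quoted from the viaf-dates README.
LEEWAY_YEARS = {"lived": 3, "circa": 10, "flourished": 100}

# Property suggested in the discussion; 'circa' is left undecided.
SUGGESTED_PROPERTY = {"lived": "P569", "flourished": "P2031"}

# Hypothetical threshold: shorter 'lived' spans are flagged for human review.
MIN_PLAUSIBLE_LIFESPAN = 15


def parse_year(text):
    """Return the year as an int, or None when the field is empty or '0'."""
    text = (text or "").strip()
    if text in ("", "0"):  # assumption: 0 means "unknown", not the year 0
        return None
    negative = text.startswith("-")
    year = int(text.lstrip("-").split("-")[0])
    return -year if negative else year


def interpret_viaf_dates(fragment):
    """Interpret a birthDate/deathDate/dateType fragment like the ones above."""
    root = ET.fromstring(f'<root xmlns:ns1="{VIAF_NS}">{fragment}</root>')
    ns = {"ns1": VIAF_NS}
    start = parse_year(root.findtext("ns1:birthDate", namespaces=ns))
    end = parse_year(root.findtext("ns1:deathDate", namespaces=ns))
    date_type = (root.findtext("ns1:dateType", default="", namespaces=ns) or "").strip()
    too_short = (
        date_type == "lived"
        and start is not None
        and end is not None
        and end - start < MIN_PLAUSIBLE_LIFESPAN
    )
    return {
        "start_year": start,
        "end_year": end,
        "date_type": date_type,
        "leeway_years": LEEWAY_YEARS.get(date_type),
        "suggested_property": SUGGESTED_PROPERTY.get(date_type),
        "needs_review": too_short,
    }
```

With the Ebenezer Hewlett fragment (1737 to 1747, 'lived') the sketch still suggests P569 but sets `needs_review`, which matches the point that 'lived' alone is not proof of real birth and death dates.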

Merging list of monuments (P1456) with appears in the heritage monument list (P2817)

I cannot see any difference between these two properties; both of them are Wikidata property related to geography (Q52511956) or a subclass of (P279) of that. At present, list of monuments (P1456) has 7 805 items and appears in the heritage monument list (P2817) has 62 689 items. Pmt (talk) 07:16, 6 October 2018 (UTC)

  • I think we could probably delete the latter. It was created for a country that used WLM lists that don't follow an administrative structure. --- Jura 08:07, 6 October 2018 (UTC)

is a Wikinews article

Wikinews article (Q17633526) 2011 Norway attacks (Q79967) is both instance of (P31) terrorist attack (Q5710433) and instance of (P31) Wikinews article (Q17633526) (see also the descriptions in various languages). This is an issue IMHO. Visite fortuitement prolongée (talk) 18:29, 6 October 2018 (UTC); 19:26, 6 October 2018 (UTC)

@Visite fortuitement prolongée: There are two items which match your description; which one are you talking about? Mahir256 (talk) 18:53, 6 October 2018 (UTC)
@Superchilum, Laddo: regarding 2011 Norway attacks (Q79967); @Laddo: regarding 2017 Quebec City mosque shooting (Q28549976) (as "Huhbakker" is no longer around). Mahir256 (talk) 18:57, 6 October 2018 (UTC)
Sorry, I wanted to talk about 2011 Norway attacks (Q79967). Corrected. Thank you for finding 2017 Quebec City mosque shooting (Q28549976). Visite fortuitement prolongée (talk) 19:26, 6 October 2018 (UTC)
@Visite fortuitement prolongée: Can you clarify your point? These are events, and the first articles reporting that event. Should they be separated? -- LaddΩ chat ;) 22:18, 6 October 2018 (UTC)
For Q79967, the fr description is "article de Wikinews", de is "Artikel bei Wikinews", an is "articlo de Wikinews", bav is "Artike bei Wikinews", bs is "Wikinews članak" etc. That is very different from the en description "two sequential lone wolf terrorist attacks in Norway on 22 July 2011", so I guess that there is an issue. Visite fortuitement prolongée (talk) 22:46, 6 October 2018 (UTC)
Even if you don't want a separate item, I think that at least the description and instance of (P31) should not be "Wikinews article". Visite fortuitement prolongée (talk) 23:00, 6 October 2018 (UTC)
@Laddo: Yes. See the result of Wikidata:Requests_for_deletions/Archive/2018/07/08#Q17655696. Mahir256 (talk) 23:19, 6 October 2018 (UTC)

So a Wikidata item with ns0 articles on Wikipedia should not contain ns0 articles on Wikinews, but instead Wikinews categories? And all the ns0 articles on Wikinews should remain separated? --Superchilum(talk to me!) 08:12, 7 October 2018 (UTC)

Yes, I think that's the best solution indeed, if there are several Wikinews articles about the same news event. Especially the Dutch and Russian Wikinews versions work with categories in such cases. On Wikipedia, big news events have their own articles which contain all the relevant information, so categories are not needed there in that case. --De Wikischim (talk) 08:53, 11 October 2018 (UTC)
On the other hand, cases like d:Q28549976 are somewhat different. Since no Wikinews categories are involved here, all the articles (both Wikipedia and Wikinews) can just remain on the same Wikidata item. De Wikischim (talk) 08:59, 11 October 2018 (UTC)
When Wikinews is concerned, the following connections are made to other projects:
  • Wikinews article -> Wikinews article (no connection to other projects)
  • Wikinews category -> Wikipedia article / Wikivoyage article / Commons category
Ymnes (talk) 14:38, 11 October 2018 (UTC)
No they can't remain on the same Wikidata item. A mass shooting (Q21480300) is an event, not a text like a Wikinews article (Q17633526). -Ash Crow (talk) 16:27, 11 October 2018 (UTC)
Well, a lot of Wikidata items now containing both Wikipedia and Wikinews pages will have to be split up in that case. However, I wonder if it's really worth investing that much time and energy in. De Wikischim (talk) 19:31, 11 October 2018 (UTC)

Reminder about birth-death dates

Hi all, I spend time now and then filling in death dates for women, but of course this problem is true for all people, female, fictional and otherwise. Please try to be more exact in filling in dates. Someone who was obviously born after 1950 should at least have a birthdate with precision of a decade and not century (which reverts to 1901s). See this link to help work on the century of your choice: Wikidata:WikiProject Women/Centenarians. Thx Jane023 (talk) 14:04, 10 October 2018 (UTC)

If the year of birth isn't publicly available, the decade probably isn't either. I don't think guessing is a good idea. Ghouston (talk) 22:13, 10 October 2018 (UTC)
Guessing may be a bad idea, but unfortunately our sources have already done that and the result is we are left with lots of items with very strange birthdates - some are even set to the year 0 or 100. The 1901s is just an example. Jane023 (talk) 17:13, 14 October 2018 (UTC)
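For readers unfamiliar with how precision is stored: a Wikidata time value carries a numeric precision next to the timestamp (7 for century, 8 for decade, 9 for year, 10 for month, 11 for day). The sketch below builds such a value and flags the cases mentioned in this thread. The helper names and the flagging rule (century precision or coarser, or a year of 0 or 100) are illustrative assumptions, not an agreed policy.

```python
CALENDAR_GREGORIAN = "http://www.wikidata.org/entity/Q1985727"
PRECISION = {"century": 7, "decade": 8, "year": 9, "month": 10, "day": 11}


def time_value(year, precision):
    """Build a Wikibase-style time value for a year at the given precision."""
    sign = "+" if year >= 0 else "-"
    return {
        "time": f"{sign}{abs(year):04d}-00-00T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": PRECISION[precision],
        "calendarmodel": CALENDAR_GREGORIAN,
    }


def needs_attention(value):
    """Flag values that are only century-precise or use a placeholder-looking year."""
    stamp = value["time"]
    year = int(stamp[0] + stamp[1:].split("-")[0])
    return value["precision"] <= PRECISION["century"] or year in (0, 100)
```

For example, `time_value(1950, "decade")` gets precision 8 and is not flagged, while a century-precision value such as `time_value(1901, "century")` is flagged for a closer look.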

Merge two items

Could anyone please merge Q17003641 and Q20113514? --193.157.194.219 07:32, 11 October 2018 (UTC)

Could anyone please merge Q11980054 and Q24511179? --193.157.194.219 10:06, 11 October 2018 (UTC)
The geo coordinates for these two items point to two different fjords. [1] and [2]. I am not sure they should be merged. — Finn Årup Nielsen (fnielsen) (talk) 17:08, 11 October 2018 (UTC)
✓ Done for the first two, the second two are still problems. --Liuxinyu970226 (talk) 11:10, 12 October 2018 (UTC)

a way to protect?

I have Romeo and Juliet (Q83186) on my watchlist, and I can easily remove it so feel free to tell me NO.

Every day, or at least often, some IP makes a wrong edit to it, one edit only. Can it be protected because of "stupid uncalled for gaming" or whatever?--RaboKarbakian (talk) 16:14, 11 October 2018 (UTC)

Looks like 6 edits in the last 2 weeks, but not too frequent before that. You might want to request this on the Wikidata:Administrators' noticeboard. ArthurPSmith (talk) 18:25, 11 October 2018 (UTC)
I've protected it for a month. If it's still attracting vandalism after that, let us know on the page ArthurPSmith linked above. - Nikki (talk) 18:48, 11 October 2018 (UTC)
The main page highlights Romeo and Juliet in the "Discover" section, but doesn't link to the item directly. Could this be the cause somehow? --Yair rand (talk) 20:30, 11 October 2018 (UTC)

Political movement or ideology

There is a question between me and @Fnielsen: about Lars Hedegaard (Q1806231), who belongs to the Counterjihad (Q3374768) (or Counter-jihad movement, or CJM). The CJM is currently a political movement (Q2738074) in Wikidata, and is described by scholars (see Q3374768#P1343) as a loose network of people and organisations. Is the CJM an organization (Q43229), a political movement (Q2738074) or a political ideology (Q14934048)? Main question: how does Wikidata say that somebody belongs to the CJM: member of (P463)? political ideology (P1142)? Something else? Visite fortuitement prolongée (talk) 16:48, 11 October 2018 (UTC)

This problem concerns not only persons but also organizations such as International Free Press Society (Q4354370). To complicate matters further, there is also the possibility of invoking affiliation (P1416) (beyond member of (P463) and political ideology (P1142)). — Finn Årup Nielsen (fnielsen) (talk) 17:02, 11 October 2018 (UTC)
Both affiliation (P1416) and member of (P463) seem to be for organisations, and Counterjihad (Q3374768) doesn't seem to be an organisation. I suppose it's correctly a political movement (Q2738074) and political ideology (P1142) should be used. I'm not sure what the difference is between political movement (Q2738074) and political ideology (Q14934048). Ghouston (talk) 19:55, 11 October 2018 (UTC)
There may be a distinction between political movement (Q2738074) and political ideology (Q14934048) (they both have articles on cswiki), in which case using political ideology (P1142) with a political movement (Q2738074) could be a mistake. I don't know how you decide whether a particular instance such as Counterjihad (Q3374768) is one or the other, or what property you should use with a political movement (Q2738074). Ghouston (talk) 20:53, 11 October 2018 (UTC)
There are a lot of small local political movements about things like preventing high-rise buildings or giving tenants the right to own pets which can't really be described as ideologies. An ideology is some kind of grand scheme about how politics or society should be organized, such as communism (Q6186) or conservatism (Q7169). Presumably political ideologies will also be political movements, or have associated political movements. In the past I suggested (in jest, but maybe it's not a bad idea) properties like "supporter of" and "opponent of" to allow recording a person or organization's opinions about random things. That discussion was about atheism (Q7066) and whether or not it's a religion, but it could even be used to record a sports team somebody supported. Ghouston (talk) 21:06, 11 October 2018 (UTC)

Items created unintentionally twice on several occasions

Hello, I noticed the following behaviour, where an item is created twice:

When I create a new item from Commons by clicking the 'In Wikipedia Add links' button in the sidebar, two items are sometimes created although I clicked only once. This happens every now and again, and I cannot tell which circumstances lead to it. Moreover, the two items created have consecutive item numbers each time. The last time I noticed this was earlier today:

You will also notice that both items were created in the same second, so I think I can rule out that I performed two actions.

the same happened as well a week ago:

and as well a few minutes earlier here:

and the day before:

as well as twice here:

as well a week earlier:

and the first time here:

Does anyone have an explanation for this behaviour? I suggest creating an issue in Phabricator to allow deeper investigation.

Many thanks for any feedback on this issue. Robby (talk) 22:13, 11 October 2018 (UTC)

Many tools use SPARQL queries for duplicate detection, and when there is a lag on the server used by such a query, then duplicates may arise. Such lags have occurred multiple times during the relevant period, and I am not sure what causes them. --Daniel Mietchen (talk) 01:35, 12 October 2018 (UTC)
Particularly worrying is the fact that the sitelinks are present *both* in Category:1990s in the Balearic Islands (Q57215615) and no label (Q57215616) - something I thought was impossible, as sitelinks are supposed to be unique. Pinging @Lydia Pintscher (WMDE). Thanks. Mike Peel (talk) 06:32, 12 October 2018 (UTC)
These are so-called true duplicates. --Pasleim (talk) 08:57, 12 October 2018 (UTC)
Any suggestions on how to proceed with this?
  • Create a new Phabricator ticket, or is there already an open issue on this in Phabricator?
  • Update the list of true duplicates, or first establish a complete list of all these true duplicates? (My knowledge of SPARQL does not allow me to write a query that generates such a list; I do not even know whether this would be possible.)
  • merge the duplicates and take no further action?
Thanks for further feedback and/or proposals Robby (talk) 21:13, 12 October 2018 (UTC)
It looks like the developers will run a script to update the list soon, see Wikidata:Contact_the_development_team#True_duplicates_clean_up?, and then we can merge them. There's a Phabricator ticket linked from there too. - Nikki (talk) 12:29, 14 October 2018 (UTC)
thanks for this update. I've added a comment in the corresponding ticket in phabricator. Robby (talk) 13:44, 15 October 2018 (UTC)
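Whatever the source of the data turns out to be, the check itself is simple: a true duplicate is a page that is claimed as a sitelink by more than one item. A minimal sketch, assuming the sitelinks are already available as (item, site, title) tuples, for example from a dump or from the developers' script; the function name and input shape are hypothetical.

```python
from collections import defaultdict


def find_true_duplicates(sitelinks):
    """Group items by (site, title) and return the pages claimed by several items.

    sitelinks: iterable of (item_id, site, title) tuples.
    """
    items_by_page = defaultdict(set)
    for item_id, site, title in sitelinks:
        items_by_page[(site, title)].add(item_id)
    return {
        page: sorted(items)
        for page, items in items_by_page.items()
        if len(items) > 1
    }
```

Fed the two items mentioned above, Q57215615 and Q57215616, with the same Commons category as sitelink, the function would return that page with both item IDs, which is the pair to merge.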

Why did Donna Strickland (Q56855591) not have a Wikidata item when her Nobel Prize was announced?

Over on the English Wikipedia, there is an insightful essay about the mechanisms that led to her not having an article there, but no Wikipedia had an entry, and neither had Wikidata, so I am wondering what we can learn from that for Wikidata-related workflows. --Daniel Mietchen (talk) 01:28, 12 October 2018 (UTC)

As with many authors of academic papers, we had items about her papers, with her name as an author name string (P2093) value (see the histories of Q33306866, Q33335086, Q35516328, Q36021270, for example) but our automated tools had, presumably, not been able to create an item about her, as her ORCID profile has "No public information available". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:22, 12 October 2018 (UTC)
Lile ORCID, which is as trustable as IMDB, is the only criteria... Sjoerd de Bruin (talk) 23:59, 12 October 2018 (UTC)
"Lile"? ORCID iDs are certainly more trustable than IMDB, in this context. I said nothing to claim that they are "the only criteria [sic]", but they are the primary ID used by our automated tools to create such items. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:35, 13 October 2018 (UTC)
  • Maybe she was probably just younger than the usual winners? --- Jura 16:38, 12 October 2018 (UTC)
  • Two further factors, besides presumably gender bias: (1) her unusual decision to never have applied for a full professorship and (2) the lack of a Wikipedia article in no small part because of what I would consider an over-strict application of the idea of "independent" sources, where we don't consider an accredited university "independent" enough to be citeable about their own faculty. - Jmabel (talk) 19:59, 12 October 2018 (UTC)
  • Much of Wikidata is created from pre-existing databases. The unfortunate consequence of this is that any bias in the source databases will be reflected in Wikidata. Where gender bias exists in the source databases, it takes proactive editing to overcome that bias. I have repeatedly been shocked to find that novels by leading women--even Pulitzer winners--have not had Wikidata entries, but on searching further I have often found that these same novels were also absent from VIAF and the Library of Congress. So unless Wikidata editors are actively seeking to counter pre-existing bias, we can expect the default to be a reflection of whatever bias was present in our sources. --EncycloPetey (talk) 17:44, 13 October 2018 (UTC)
  • From a knowledge perspective, I think the Wikidata situation of Q905436 in 2017 or even now is more problematic. --- Jura 17:49, 13 October 2018 (UTC)
    It's not much worse than stable marriage problem (Q620702), which is another Nobel winner topic. --EncycloPetey (talk) 17:53, 13 October 2018 (UTC)
    If they were done correctly, the other questions probably wouldn't arise. --- Jura 12:55, 14 October 2018 (UTC)

Data loss during the data centre switchover

Hello all,

During the data centre switchover routine on October 10th, some unexpected problems occurred over the past days:

  • For a few hours, a small part of the data was not accessible. Some items and lexemes seemed to have disappeared.
  • Some data may have been lost, including edits, preferences changed as well as user accounts created during a period of about 50 minutes (from 2018-09-13 09:08:17 UTC to 2018-09-13 09:58:26).

Part of the data has already been restored (edits and revisions). The rest (user accounts, preferences) will be restored at the beginning of next week.

If you edited Wikidata on September 13th, please check your contributions. If you encounter any problem in the next days, like items not reappearing or something missing, let me know.

If you're interested in technical details, you can have a look at the Phabricator ticket. Thanks for your understanding, Lea Lacroix (WMDE) (talk) 14:45, 12 October 2018 (UTC)

Hello all,
Edits that occurred in the 50 minutes that were temporarily lost during the switch back to the eqiad data centre may now not exist in the current revisions of the pages. This is due to edits being made to these pages while either the edits did not exist in the revision table, or before the page_latest field was updated in the page table.
These lists should indicate all revisions that could be missing. Feel free to check them. Lea Lacroix (WMDE) (talk) 13:29, 17 October 2018 (UTC)
  • Is this something specific to Wikidata or did it happen at all wikis? --- Jura 13:33, 17 October 2018 (UTC)
It impacted only Wikidata, which runs on its own server now. Lea Lacroix (WMDE) (talk) 14:22, 17 October 2018 (UTC)

Multiple sandboxes for training

I'm going to be delivering a couple of training sessions on Wikidata in the near future: October 20 in Cambridge, and October 25 for Coventry University. I often do a brief introduction to Wikidata when doing Wikipedia training, but these two events will focus on Wikidata. I always prefer to have participants actively trying out editing, rather than just hearing about it, so my usual teaching involves them working in their sandboxes for Wikipedia training. I would very much like to develop active participation in these events, but on Wikidata, users don't have user-sandboxes in their user space. I'd therefore like them to each create one or more "sandbox-items" in mainspace that they can then edit for practice. For example: user:Fred may create Sandbox-Fred Coventry and make test edits there as if it were the Coventry Wikidata entry, without disturbing the real item. Each participant would have their own item(s) to avoid the confusion of multiple people editing the same item, as they would if we used Wikidata Sandbox (Q4115189). After the event, I would request deletion of all the "Sandbox-" items.

Now, would there be any objections to this scheme? Are there issues or complications that I haven't foreseen? Does someone know of a better way? I'm interested in any and all feedback. Cheers --RexxS (talk) 16:18, 12 October 2018 (UTC)

  • We do have three sandboxes. Maybe we could have a few more. The advantage of them being stable is that users don't get confused about the edits. Alternatively, maybe we could generate lists of potential items that could easily be created by new users. --- Jura 16:28, 12 October 2018 (UTC)
    Thank you, Jura. I'd be really interested in generators for lists of potential items for creation by new users, but I'd want that as an addition for the second lesson, after participants are comfortable with the interface. Cheers --RexxS (talk) 16:42, 12 October 2018 (UTC)
  • This seems reasonable, but do you have a mechanism to clearly identify these sandbox items, so they don't cause trouble later? Maybe encourage everybody to make them instance of (P31) Wikidata Sandbox (Q4115189), or add that when you become aware of one? ArthurPSmith (talk) 18:52, 12 October 2018 (UTC)
    @ArthurPSmith: Hopefully, I'll have three mechanisms:
    1. I intend that each item should take the form <"Sandbox-"><name-of-user><" "><name-of-item>. Currently no item begins with "Sandbox-", so they should be easy to track down afterwards.
    2. I will be asking the participants to post a message on my talk page (to experience talk page conversations). That will allow me to scan their contributions and I'll quickly spot any edits to items not fitting the pattern I prescribed.
    3. I usually have some of the session time where participants work freely on a topic that interests them, and I work my way round everybody, helping out and checking on what they are doing. That tends to be a safety-net where I can spot those who have problems (and they are the ones most likely to create items outside of the pattern).
    It's not foolproof, but hopefully will avoid leaving things for others to have to clean up. --RexxS (talk) 20:03, 12 October 2018 (UTC)
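    The naming pattern from point 1 lends itself to a mechanical check when scanning participants' contributions afterwards. A minimal sketch, assuming user names contain no spaces (otherwise the boundary between user and item name is ambiguous); the regular expression and function name are illustrative only.

```python
import re

# "Sandbox-" + user name + " " + item name, e.g. "Sandbox-Fred Coventry".
# Assumption: the user name itself contains no spaces.
SANDBOX_LABEL = re.compile(r"^Sandbox-(?P<user>\S+) (?P<item>.+)$")


def parse_sandbox_label(label):
    """Return (user, item) if the label fits the pattern, otherwise None."""
    match = SANDBOX_LABEL.match(label)
    if match is None:
        return None
    return match.group("user"), match.group("item")
```

    Any label for which `parse_sandbox_label` returns None was created outside the pattern and would need a manual look before the deletion request is filed.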
    • Symbol oppose vote.svg Oppose as far as I'm concerned. --- Jura 20:08, 12 October 2018 (UTC)
      Care to elaborate? --RexxS (talk) 20:47, 12 October 2018 (UTC)
      • If you occasionally create test items with random statements, any Wikidata user who queries the database can end up getting them in their results and polluting them. In addition to my first suggestion, you could also create random items at https://test.wikidata.org --- Jura 20:54, 12 October 2018 (UTC)
        I can see the issue with potential result pollution, even for a brief session, so I'm sympathetic to that point. I agree that it wouldn't scale well. I've always shied away from using test.wikidata.org because of the possibility that the interface may be altered significantly by testing, but it looks quite comparable right now. I don't know anybody who has successfully used it for classroom work, but I think it's definitely worth trying out for at least one event. Thanks for the tip. --RexxS (talk) 22:32, 12 October 2018 (UTC)
  • I don't see an issue since you plan to RFD them after the event. I'm not sure I'd be particularly bothered without that caveat either, but I haven't thought it all through. --Izno (talk) 20:51, 12 October 2018 (UTC)
  • If the existing sandboxes aren't enough, then I think https://test.wikidata.org/ would be better, because people can happily test without messing up the live data and without having to worry about being reverted or even blocked or the items being deleted while they're trying to test. - Nikki (talk) 09:30, 13 October 2018 (UTC)
  • I don't see a good reason why you have to train with fake data. If you take real data that is currently missing in Wikidata, it would be more motivating for the people in your classroom.
You might for example take books that are currently in Wikidata that are only tagged as books and that have ISBN numbers and then ask your students to fill the items with more true information. ChristianKl❫ 15:03, 13 October 2018 (UTC)
  • For test.wikidata.org , you might need to define some items or properties before the course. (Maybe the bot running there could be configured to do that on each reset).
    The idea of looking for items with few statements seems like a good one as well, e.g. http://petscan.wmflabs.org/?psid=6118865 . You could also use categories with articles that need items. --- Jura 14:15, 14 October 2018 (UTC)

Author Qualifiers

In the Item - Q57077013 , I would like to use qualifiers to distinguish Coordinating Lead Authors, Lead Authors & Review Editors.

Are the properties object has role (P3831) & subject has role (P2868) the right qualifiers?

Or should I be using another property more suited for authors? Any suggestions?  – The preceding unsigned comment was added by Wallacegromit1 (talk • contribs).

subject has role (P2868) is correct; object has role (P3831) as a qualifier of author name string (P2093) or author (P50) is wrong because the report is the subject and the authors are the objects. --Pasleim (talk) 09:13, 13 October 2018 (UTC)

Thanks for your quick feedback. But when I add object has role (P3831), a message pops up saying:

"object has role is not a valid qualifier for author name string (P2093) – the only valid qualifiers are: series ordinal, of, subject has role, or, affiliation"

A bit more clarity, please! --- Wallacegromit1


  • It seems it was mixed up on Property:P2093 as well. I corrected that, but some 150 items need fixing too. --- Jura 17:31, 13 October 2018 (UTC)
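The message quoted above comes from an allowed-qualifiers constraint, and the check behind it amounts to a set lookup. The sketch below uses the qualifier labels exactly as quoted in the message, reading the stray "or" as a conjunction rather than a property label, which is an assumption; the live constraint on author name string (P2093) may also differ, since it was edited after this message was posted. Function and constant names are hypothetical.

```python
# Labels as listed in the quoted message; "or" is read as a conjunction (assumption).
ALLOWED_QUALIFIERS_P2093 = {
    "series ordinal",
    "of",
    "subject has role",
    "affiliation",
}


def disallowed_qualifiers(qualifier_labels, allowed=ALLOWED_QUALIFIERS_P2093):
    """Return the qualifiers that would trigger the constraint message, in input order."""
    return [label for label in qualifier_labels if label not in allowed]
```

So a statement qualified with "subject has role" and "series ordinal" passes, while one qualified with "object has role" is reported.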

Several "Wikilinkproblems" moved from talk page

Can someone wikilink en:Babić (Q20519335) with de:Babić (Q797730) ? 178.3.19.113 17:14, 10 October 2018 (UTC)

Can someone wikilink en:Čović (Q21487827) with de:Čović (Q341615) ? 178.3.19.113 17:17, 10 October 2018 (UTC)

Can someone wikilink en:Ivanić (Q21513651) with de:Ivanić (Q16849113) ? --178.3.19.113 17:20, 10 October 2018 (UTC)

Can someone wikilink en:Izetbegović (Q56538786) with de:Izetbegović (Q1255192) ? 178.3.19.113 17:23, 10 October 2018 (UTC)

Who deleted all the interwikilinks of Bosnian politician surnames ? --178.3.19.113 17:23, 10 October 2018 (UTC)

Can someone wikilink en:Ivanović (Q21507848) with de:Ivanović (Q526937) ? --178.3.19.113 17:27, 10 October 2018 (UTC)

Moved by Liuxinyu970226 (talk) 03:17, 13 October 2018 (UTC)

 Not done family name has to use a different item than disambiguation pages (Q27924673) --Liuxinyu970226 (talk) 03:18, 13 October 2018 (UTC)
We definitely need some way to link items of these two types to each other; something akin to is a list of (P360). Suggestions? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:30, 13 October 2018 (UTC)

Ethnic and religious composition of human settlement

How can I store data about the ethnic and religious composition of a human settlement (Q486972) at many different points in time (Q186408), if I know the number of representatives of a particular ethnic group (Q41710) or religious organization (Q1530022) at each point in time (Q186408)? - Kareyac (talk) 14:02, 13 October 2018 (UTC)

You could use Wikimedia Commons and Data namespace. strakhov (talk) 18:16, 13 October 2018 (UTC)
@strakhov. Thank you, I'll consider your proposal when making the decision. - Kareyac (talk) 08:48, 14 October 2018 (UTC)
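To make the Data namespace suggestion concrete, a table of this kind could be laid out as below. This is a sketch only: the field names and the function are invented for illustration, the overall layout (license, description, schema, data) follows the Commons tabular data format as far as is assumed here, and the row in the usage note is a made-up placeholder, not real census data.

```python
import json


def build_composition_table(rows, description):
    """Build a dict shaped like a Commons tabular data page.

    rows: iterable of (year, group, count) tuples.
    """
    return {
        "license": "CC0-1.0",
        "description": {"en": description},
        "schema": {
            "fields": [
                {"name": "year", "type": "number", "title": {"en": "Year"}},
                {"name": "group", "type": "string", "title": {"en": "Ethnic or religious group"}},
                {"name": "count", "type": "number", "title": {"en": "Number of people"}},
            ]
        },
        "data": [[year, group, count] for year, group, count in rows],
    }


if __name__ == "__main__":
    # Placeholder row for illustration only; not real data.
    table = build_composition_table(
        [(1900, "Group A", 120)],
        "Composition of an example settlement over time",
    )
    print(json.dumps(table, indent=2))
```

One row per year and group keeps the table open-ended: new censuses or new groups add rows without changing the schema.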

Double

Q12410695 was a double of Q7136547. I merged it there. Can somebody please delete Q12410695? Debresser (talk) 16:34, 13 October 2018 (UTC)

Please merge items with the instructions listed on Help:Merge next time. I've turned the former item into a redirect. Sjoerd de Bruin (talk) 17:16, 13 October 2018 (UTC)

A person's full or official name

Full name (passport name). There are several ways to indicate a person's name, but I cannot find a simple way to indicate a person's full name. An item usually uses the name a person is known under, which is most of the time equal to the article name on Wikipedia. Take Betty Ford (Q213122), whose full name is Elizabeth Anne Ford, probably the name used in official documents and her passport. I am not able to find a property reflecting this. I can find birth name (P1477), but for Betty Ford the birth name is Elizabeth Anne Bloomer, and her married name (P2562) and name in native language (P1559) will be Betty Ford or derivatives of that with Ford as family name. Is there a need for a property reflecting a person's full name? For women who have changed their last name when married, I would find a property for a person's full or official name useful. Pmt (talk) 22:06, 13 October 2018 (UTC)

  • There is P1448 for this ("official name"). --- Jura 22:33, 13 October 2018 (UTC)

Quality of a source needs to be documented and communicated

I am comparing the data of Nobelprize.org and Wikidata (see Listeria list) and have run into the problem of which sources to trust.

My suggestion is that we formalize how Wikidata documents a source it uses (or start with sources that are WD properties):

  1. we start adding some quality guidance about what type of source this is, e.g.
    1. Basic facts
      1. Number of people working with this source (fulltime/community...)
      2. When the organisation maintaining it was started
      3. If they have people employed full time
      4. If they have a documented quality process
      5. If they have controlled change management
        1. Do you get a ticket when reporting problems and how do you report problems
        2. Can you review older versions of the source
        3. Is a reported change request possible to track
        4. Is this source Data driven
          1. Can we access it using SPARQL and federated search
          2. Can we access primary sources used
          3. Can we access a digital version of primary sources used
          4. Do they have external identifiers (en:Linked data); compare 5stardata.info
          5. Do they link to Wikidata with a "same as" property, or do WD and the external source share an external identifier such as VIAF ID (P214)
    2. Quality reviews of experts
      1. e.g. is this source used by documented authorities in the field
    3. Quality reviews of Wikipedians
      1. Comments from people who have used this source: what are their experiences
      2. Where to find reported problems/mismatches
    4.  ???

- Salgo60 (talk) 05:27, 14 October 2018 (UTC)

  • Much of this has little to do with the quality of a source. For example, a database maintained by one reputable academic about an area in which he or she is expert can be fine; a database maintained by a group of followers of Lyndon LaRouche, or left as a legacy from the Stalin-era Soviet Union or Nazi Germany, is inherently suspect no matter how formally proper it might seem and no matter the formal credentials of its participants. - Jmabel (talk) 01:38, 15 October 2018 (UTC)
@Jmabel: It does: if you don't try to write something down, then you don't communicate your opinion or your understanding. I can see a growing need to document the quality of sources
  • with the increasing size of the Wikidata project
  • the project is getting more and more Global
  • number of added properties is "exploding" and it is more difficult to understand the value/quality of a source
I did a test (in Swedish) to document one of the best sources in Sweden, Dictionary of Swedish National Biography (P3217), a source that 99% of the people who have studied history at a Swedish university trust, but that I guess nearly no one outside Sweden knows about; see link for some facts
- Salgo60 (talk) 10:39, 15 October 2018 (UTC)
I agree that there is a need for that. I don't know how to realize it. --Marsupium (talk) 20:55, 16 October 2018 (UTC)
@Marsupium: Better something than nothing?! Can't we start with something like featured article (Q17437796), but for a source... When you start looking at historically famous people in Sweden (e.g. Selma Lagerlöf (Q44519) Q44519#P569) we get 10+ sources indicating a birth date ==> it would be nice to filter/order them by trusted sources - Salgo60 (talk) 07:30, 17 October 2018 (UTC)
@Salgo60: Fine, here you have my brainstorming now ;):
I really feel the same and Q44519#P569 is a good example. I think we should even partly remove sources. A short previous related discussion is Wikidata:Project chat/Archive/2017/10#Redundant Wikipedia citations. But it is a hard task. I agree with Jmabel that most of the above mentioned doesn't say much about quality of a source (some we have already established ways to indicate, e.g. SPARQL endpoint).
Though as a small start: in the example Q44519#P569 I guess that for an algorithmic evaluation https://web.archive.org/web/20160401152316/http://jeugdliteratuur.org/auteurs/selma-lagerlof should be ranked lowest, as a bare URL without author (P50) or another subproperty of (P1647) of creator (P170), publisher (P123) or anything else, followed by Find a Grave (Q63056) (and some of the others?) as a crowdsourced source. Perhaps we should state Find a Grave (Q63056) instance of (P31) "crowdsourced work" (no item for that yet). Then it gets difficult. Another criterion would be whether a source is peer-reviewed.
For "Where to find reported problems/mismatches" deprecated statements referenced with a source and their proportion of all statements can be counted. Also the uses of Template:External reference error reports can help, also with "Do you get a ticket when reporting problems and how do you report problems". We should get more of that in the main Wikidata database to enable querying that information. Also bug tracking system (P1401) is somehow related.
BTW: Do you have a specific use case for this? I think it might be good to try to handle a specific use to figure out how to deal with this in general. --Marsupium (talk) 10:00, 17 October 2018 (UTC)
@Marsupium: the use case I have is a federated search comparing Wikidata and Nobelprize.org in a Listeria list link
  • Lessons learned
    • Wikidata is excellent at quickly getting the correct death date
    • As the Nobel Prize is global, I quickly run into the problem of finding sources I don't know anything about
    • My dream scenario is that Wikidata produces lists like the one above indicating a difference AND that we also present the best sources confirming the facts
  • Rank sources
    • It's difficult, but for Q44519#P569 we have the church books, i.e. en:Primary Sources in electronic form (link), telling that Selma Olivia Lovisa is born..... If you can read that source, then that is what you trust... We also have one of the best-ranked sources (in Swedish) link ==> it's a secondary source but written by professionals using primary sources
- Salgo60 (talk) 10:24, 17 October 2018 (UTC)
@Salgo60: Comment: "secondary source but written by professionals using primary sources" is even better than a primary source, at least for Wikipedia:No original research (Q4656524).
Sorry, writing here, I forgot the first sentence of the section. "AND that we also present the best sources confirming the facts" meaning a SPARQL implementation? At least for the first mismatch with >1 reference Q106471#P569 ranking isn't too difficult, a SPARQL implementation for that example would be something like this:
SELECT ?reference ?referenceItem ?referenceItemLabel ?quality_step_c
WHERE 
{
  wds:q106471-515C1587-101A-48A6-AF11-61EE615B90FE prov:wasDerivedFrom ?reference. # the statement is https://www.wikidata.org/wiki/Q106471#P569
  ?reference pr:P248|pr:P143 ?referenceItem.
  BIND(0 AS ?quality_step_a)
  BIND(IF(EXISTS{?reference pr:P143 [].},?quality_step_a - 1,?quality_step_a) AS ?quality_step_b) # the reference has a property P143 (imported from Wikimedia project)
  BIND(IF(EXISTS{?referenceItem wdt:P629 [].},?quality_step_b + 1,?quality_step_b) AS ?quality_step_c) # the referenceItem has a property P629 (edition or translation of)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY DESC(?quality_step_c)
Would that be something you would want? And now a template shall output the best source? --Marsupium (talk) 11:20, 17 October 2018 (UTC)
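For readers who want to try the same ranking idea outside the query service, here is a minimal Python sketch of the two scoring steps in the query above. It is only an illustration: the dictionary keys `has_P143` and `source_has_P629` and the sample labels are hypothetical names invented for this sketch, and the weights (-1 and +1) simply mirror the query, not any agreed Wikidata policy.

```python
from typing import Dict, List


def score_reference(reference: Dict[str, object]) -> int:
    """Score one reference using the two heuristics from the SPARQL query."""
    score = 0
    # Penalise references that carry "imported from Wikimedia project" (P143).
    if reference.get("has_P143"):
        score -= 1
    # Reward sources whose item has "edition or translation of" (P629).
    if reference.get("source_has_P629"):
        score += 1
    return score


def rank_references(references: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return the references ordered from highest to lowest score."""
    return sorted(references, key=score_reference, reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real item.
    sample = [
        {"label": "Wikipedia import", "has_P143": True, "source_has_P629": False},
        {"label": "printed edition", "has_P143": False, "source_has_P629": True},
    ]
    for ref in rank_references(sample):
        print(score_reference(ref), ref["label"])
```

Further criteria mentioned in this thread (crowdsourced work, peer review) could be added as extra steps in `score_reference` in the same way.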
Thanks, that is excellent. User Larske also did some magic at Wikidata:Request_a_query#Status_update
I have spent too much time working with the secondary source mentioned above, Dictionary of Swedish National Biography (P3217), and the lesson learned is that they also have problems. They have published books since 1918, and now that we have the sources in electronic format and also tools like Wikidata, we can see that even professionals have problems, and I think Wikidata could be a good community to document that... - Salgo60 (talk) 11:36, 17 October 2018 (UTC)

Denmark

Is it possible to settle which state (Q35 (Denmark) or Q756617 (Kingdom of Denmark)) Danish locations and Greenland locations belong to? There cannot be two states. Other properties are affected by this problem, too. I do not understand the difference between the two. There is no English article on Q756617. --RolandUnger (talk) 09:13, 14 October 2018 (UTC)

For a start, I think en:Denmark is about Kingdom of Denmark (Q756617), but has been incorrectly connected to Denmark (Q35). I don't know much about Denmark, but I think it's something similar to Kingdom of the Netherlands (Q29999) and its constituent countries, which can also confuse people in Wikidata. Ghouston (talk) 10:13, 14 October 2018 (UTC)
Kingdom of Denmark is the official name of Denmark, so it should be the same. At least, we need a unique value for country (P17). If Denmark is a state then Kingdom of Denmark is maybe only a name of a state but not the state itself. --RolandUnger (talk) 11:06, 14 October 2018 (UTC)
No, I believe Kingdom of Denmark includes Denmark proper, Faroe Islands, and Greenland. For example, Denmark is part of the EU, whereas Faroe Islands and Greenland (and, consequently, the Kingdom of Denmark) are not.--Ymblanter (talk) 12:10, 14 October 2018 (UTC)
Something similar exists between Netherlands (Q55) and Kingdom of the Netherlands (Q29999). However, it's complicated by the fact that the Caribbean Netherlands (Q27561) are part of both, while Curaçao (Q25279) only belongs to Kingdom of the Netherlands (Q29999). —Rua (mew) 12:37, 14 October 2018 (UTC)
We had the following request some time ago: Wikidata:Bot_requests/Archive/2016/12#Country_→_Denmark --- Jura 12:46, 14 October 2018 (UTC)
But I think the request was discussed but not executed. So we have the situation that cities like Copenhagen (Q1748) are situated in the state of Denmark (Q35) and cities in Greenland (Q223) are situated in the state of Kingdom of Denmark (Q756617), but these are the same country! And if you look at the data sets of both Denmarks, you can see that everything is completely mixed up in the Wikipedias. Normally we use country names like Germany (Q183) without a political description (the official name is Federal Republic of Germany). Other countries like France and the Netherlands have overseas territories too, but there is only one state. I think Danish authors should help to remove the confusion. --RolandUnger (talk) 16:41, 14 October 2018 (UTC)
  • A possible way to resolve it would be to unify the interpretation of country of citizenship (P27) and country (P17): if a "country" is part of a larger state, but doesn't have its own citizenship, then don't allow it as a country (P17) target. That would mean changing quite a few statements for the Netherlands, but would avoid changing a lot for the UK. I can't see any indication either that Greenland (Q223) and Faroe Islands (Q4628) have separate citizenships. I think it would make sense, since a state shouldn't be treated differently in Wikidata depending on whether it calls its internal subdivisions "countries" or "states". Ghouston (talk) 20:53, 14 October 2018 (UTC)
    • Another option which doesn't require changing the Netherlands or the UK would be to allow things which have their own ISO 3166-1 codes. It's a widely-used international standard, so it seems reasonable to say that the things it lists are often considered to be countries. - Nikki (talk) 21:29, 14 October 2018 (UTC)
      • But then we'd have different definitions of country for country of citizenship (P27) and country (P17). ISO 3166-1 would also make places like Hong Kong, Macao and "United States Minor Outlying Islands" into countries. Ghouston (talk) 22:43, 14 October 2018 (UTC)
        • I don't think that's a problem. They're different properties, they can have different allowed values (and these are not the only two country properties we have, see country for sport (P1532) for example, a single set of allowed values for all country properties is not possible). country (P17) has a wide variety of uses so I don't think it makes sense to require the narrow definition that makes sense for country of citizenship (P27). - Nikki (talk) 10:37, 15 October 2018 (UTC)
          • Sure, but I don't see what's to be gained in this case by treating the Netherlands and Denmark differently to other states (or are there other exceptions too?). Many of them have regional parliaments and semi-autonomous areas. If "Kingdom of the Netherlands" was given an alias "Netherlands", and the existing statements assigned to it, then presumably it would appear first when selecting values and there'd be less confusion. The only advantage I can see in treating the Netherlands and Denmark differently to say the UK is that it's sort of the status-quo, and maintaining the status-quo seems to be an incredibly powerful force on Wikidata. I'm not sure why that is, for such a relatively new project. Ghouston (talk) 00:02, 16 October 2018 (UTC)

Soundex formatter URL

The formatter URL on Soundex (P3878) is https://www.wikidata.org/wiki/Special:Search?search=haswbstatement%3A"P3878%3D$1" . Is that correct? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:19, 14 October 2018 (UTC)

Seems useful and better than nothing. @Jura1: let me mention as you haven't been mentioned. Sjoerd de Bruin (talk) 12:41, 14 October 2018 (UTC)
I'm unclear how providing such a link to one of our data consumers is "Useful"; please can you enlighten me? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 14 October 2018 (UTC)
Can you enlighten us what "consumer" you have in mind? --- Jura 13:34, 14 October 2018 (UTC)
Any. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:18, 14 October 2018 (UTC)
Can we have a sample use case? --- Jura 14:20, 14 October 2018 (UTC)
I hope so - that's what I'm asking for; the use case for providing this URL to data consumers. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:24, 14 October 2018 (UTC)
I don't know. You were talking about data consumers. Maybe you can enlighten us how they get it and what you have in mind. --- Jura 14:28, 14 October 2018 (UTC)

There being no use case for this formatter URL, I propose to remove it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:12, 15 October 2018 (UTC)

  • I think you omitted to answer the question above. Do I need to translate it in some other language? --- Jura 17:14, 15 October 2018 (UTC)
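As background for the question in this section, the following minimal Python sketch shows what a formatter URL does: the placeholder `$1` is replaced by the statement's value to produce a clickable link. The percent-encoding step and the sample Soundex code `M130` are assumptions made for illustration; this is not the code Wikidata itself runs.

```python
from urllib.parse import quote

# Formatter URL currently set on Soundex (P3878), as quoted in this section.
FORMATTER_URL = (
    "https://www.wikidata.org/wiki/Special:Search"
    '?search=haswbstatement%3A"P3878%3D$1"'
)


def expand_formatter_url(formatter: str, value: str) -> str:
    """Replace the $1 placeholder with the (percent-encoded) property value."""
    return formatter.replace("$1", quote(value, safe=""))


if __name__ == "__main__":
    # "M130" is just an illustrative Soundex code.
    print(expand_formatter_url(FORMATTER_URL, "M130"))
```

With this particular formatter the resulting link is a Wikidata search for every item carrying the same Soundex value, so it points back into Wikidata instead of out to an external site.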

way of itemstructure

I'm a little stuck on an idea: what is the right way to state, including dates, that Walther Freise (Q57313462) was chairperson (Q140686) of Naturforschende Gesellschaft der Oberlausitz (Q1970282)? (I did it via part of (P361).) Thank you for your help, and thank you very much for your work, Conny (talk) 13:28, 14 October 2018 (UTC).
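One possible shape for such a statement, offered only as a sketch and not as the agreed modelling: put chairperson (P488) on the organisation's item and attach the dates as start time (P580) and end time (P582) qualifiers. The helper `make_statement`, the dictionary layout and the placeholder date strings below are hypothetical; the actual dates of the chairmanship are not given in this thread.

```python
from typing import Dict, Optional


def make_statement(subject: str, prop: str, value: str,
                   start: Optional[str] = None,
                   end: Optional[str] = None) -> Dict[str, object]:
    """Build a simplified statement with optional start/end time qualifiers."""
    qualifiers: Dict[str, str] = {}
    if start is not None:
        qualifiers["P580"] = start  # start time
    if end is not None:
        qualifiers["P582"] = end    # end time
    return {"subject": subject, "property": prop,
            "value": value, "qualifiers": qualifiers}


if __name__ == "__main__":
    statement = make_statement(
        subject="Q1970282",      # Naturforschende Gesellschaft der Oberlausitz
        prop="P488",             # chairperson
        value="Q57313462",       # Walther Freise
        start="<start date>",    # placeholder, not a real date
        end="<end date>",        # placeholder, not a real date
    )
    print(statement)
```

Keeping the dates as qualifiers on a single statement means a query can ask who was chairperson at a given time without an extra item.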

Solar Hijri Calendar

Hi, the "time" data type does not support the SH calendar. We don't use the Gregorian calendar, so we don't convert dates from SH to AD. For example, when I want to edit the date (property: inception) for item Q4819111, I don't know the exact equivalent in the AD calendar. How can I edit in my own language? --دوستدار ایران بزرگ (talk) 21:27, 14 October 2018 (UTC)

Wikidata does not yet support calendars other than the Gregorian and Julian calendars. There are some Phabricator tasks open for expanding this: See for example this task for Hebrew calendar support. You can open a similar task for Solar Hijri support on Phabricator. --Yair rand (talk) 23:05, 14 October 2018 (UTC)
✓ Done Thanks I made a thread in Phabricator. --دوستدار ایران بزرگ (talk) 06:32, 15 October 2018 (UTC)
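Until such support exists, a year-level workaround can be sketched in Python. A Solar Hijri year begins at Nowruz (around 21 March), so for modern dates SH year Y runs from Gregorian year Y + 621 into Y + 622. The function name below is hypothetical, and the sketch deliberately does not attempt a day-accurate conversion.

```python
from typing import Tuple


def solar_hijri_year_to_gregorian_years(sh_year: int) -> Tuple[int, int]:
    """Return the two Gregorian years that a Solar Hijri year overlaps.

    The SH year starts around 21 March of the first returned year and
    ends around 20 March of the second one.
    """
    return sh_year + 621, sh_year + 622


if __name__ == "__main__":
    first, second = solar_hijri_year_to_gregorian_years(1397)
    print(f"SH 1397 runs from spring {first} to spring {second}")
```

If only the SH year of an inception is known, the Gregorian year is therefore ambiguous between two values, so a source stating the month or day is needed to pick one.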

Countries and their subdivisions and territory

The question of which country a region or subdivision is part of can sometimes be quite complicated, and our current handling of this is very imprecise and occasionally inconsistent. I'd like to establish some standards for this and add them to the relevant documentation. To summarize some of the relevant things to take into account:

  • Some items correspond to regions: geographical features (islands, peninsulas, etc) or other areas which don't exist as a part of any country's set of subdivisions, but the territory of which is within one or more countries. Some items correspond to subdivisions, which may exist within the set of subdivisions used by only one country among a set of countries that the corresponding territory may be located in, or the same subdivision may be used by multiple countries that the area may be in. (That is, there are occasionally competing subdivision structures over the same area, with different items, and sometimes not.)
  • Territories can be controlled and administered by a country; a particular country can be "the government" over an area, in practice. Regular administration can be military or civilian. Under certain circumstances, control may be exercised by an alliance of countries, sometimes working under an international organization. Occasionally, military control and day-to-day civil administration of the same area may be run by different countries.
  • A country can claim a territory as their own. This can be done with or without claiming the territory as part of the country proper. Disputed claims between countries can be further complicated by the fact that a local government of a subdivision can have its own opinion, and claim itself to be part of a particular country.
  • A territory can be internationally recognized as being part of a country. Any country or group of countries can recognize an area as belonging to a country. (I'm unsure to what extent this existed before modern history.) There can be hundreds of different countries expressing opinions on this, so presumably we don't want to duplicate the whole list many times.

The existing properties used in this area are: country (P17), contains administrative territorial entity (P150), located in the administrative territorial entity (P131), and territory claimed by (P1336). (Also somewhat relevant are coextensive with (P3403) and territory overlaps (P3179), which manage the relationship between regions and subdivisions of the same area.)

Using the existing properties, along with possibly new properties or qualifiers, we should clarify how these properties should be used to express the above information on regions and subdivisions, keeping several priorities in mind:

  • Specificity: Items should ideally include as much of the data as possible, as unambiguously as possible. Users should be able to query any specific element of association between a territory or subdivision and a country.
  • Simplicity: Editors should be able to easily figure out how to format a statement for any such kind of relation, without having to read endless complicated documentation.
  • Minimalism: Most areas are undisputed, and are claimed by, controlled by, administered by, and recognized as being part of, only one country. We should try to minimize the number of extra statements so that we don't need to add extras to every such ordinary situation.

Essentially all of the ways a subdivision/area can be connected to a country are not dependent on each other, making everything rather complicated. To take a fictionalized example, in case this helps:

The government of Q01 lists among its administrative subdivisions the Q02, which corresponds to the territory/area of the geographic feature Q03, which is claimed by Q04 as subdivision Q05, and administered by Q06 (as a different type of subdivision, which has a separate item Q07). The local government of Q02 itself considers itself to be part of Q07. The international community considers the area to be part of Q08 and recognizes it as such.
What statements should be used on each of these items to convey this information?

How should this data be ideally structured? --Yair rand (talk) 23:34, 14 October 2018 (UTC)
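To make the fictionalized example easier to reason about, here is a small Python sketch that writes it down as plain (subject, relation, object) triples and looks up every relation pointing at one area. The relation names are hypothetical labels chosen for this sketch, not existing or proposed Wikidata properties, and the triples are just one reading of the example above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# The fictional example, one independent relation per line.
RELATIONS: List[Triple] = [
    ("Q01", "lists as subdivision", "Q02"),
    ("Q02", "corresponds to area", "Q03"),
    ("Q04", "claims area", "Q03"),
    ("Q04", "lists as subdivision", "Q05"),
    ("Q06", "administers area", "Q03"),
    ("Q06", "lists as subdivision", "Q07"),
    ("Q02", "considers itself part of", "Q07"),
    ("Q08", "internationally recognized as containing", "Q03"),
]


def relations_for(target: str, triples: List[Triple]) -> Dict[str, List[str]]:
    """Group the subjects of all triples pointing at `target` by relation."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for subject, relation, obj in triples:
        if obj == target:
            grouped[relation].append(subject)
    return dict(grouped)


if __name__ == "__main__":
    for relation, subjects in relations_for("Q03", RELATIONS).items():
        print(relation, subjects)
```

Listing the relations separately shows why a single country (P17) value cannot carry all of them: claim, administration and recognition each point to a different item here.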

Wikidata:WikiProject Country subdivision has some pre-existing work in this area, but it seems to have been largely dormant for quite a while. There's still a lot of even quite basic work to do in this area (e.g. how much is missing or red on Wikidata:WikiProject Country subdivision/Items, even in terms of the first-level administrative country subdivision (Q10864048) in a lot of countries) so making sure that the underlying concepts are in place and well-documented so that people can help out by filling in a lot of those gaps, would be very useful. --Oravrattas (talk) 14:00, 16 October 2018 (UTC)
I've linked to this thread from the WikiProject talk page. --Yair rand (talk) 20:51, 16 October 2018 (UTC)

Taxonomy: concept centric vs name centric

Hi there,
I'm a new editor to Wikidata but I'm already an extensive consumer of data by Wikidata using the SPARQL interface.
During the last year, there's been one point that's really bothered me. My plan is to fix it, first by hand, later by bot. However, already my first edits were reverted, so maybe I have first to reach out.
The point is that a concept with multiple names gets modelled by one item. The exception to that are taxa, which have one item per scientific name. Thus, a taxon with multiple scientific names has multiple items.
This generates a few problems:

  1. For data consumers (and likely new editors) it's difficult to understand the data structure if multiple ways of modelling are used across Wikidata.
  2. Links to Wikipedia are split; if different language versions of Wikipedia describe the same taxon under different names they are not linked to each other.
  3. Data is added to multiple items. For example "Image" or "taxon common name" are added on each item about the same taxon.

How can we resolve the problem? My solution is to merge items about the same taxon and use ranks and qualifiers to indicate which statements are outdated or preferred. 130.92.255.36 07:56, 15 October 2018 (UTC)

99of9
Abbe98
Achim Raschka (talk)
Brya (talk)
Dan Koehl (talk)
Daniel Mietchen (talk)
Faendalimas
FelixReimann (talk)
Infovarius (talk)
Jean-Marc Vanel
Joel Sachs
Josve05a (talk)
Klortho (talk)
Lymantria (talk)
MargaretRDonald
Mellis (talk)
Michael Goodyear
MPF
Mr. Fulano (talk)
Nis Jørgensen
Peter Coxhead
PhiLiP
Andy Mabbett (talk)
Plantdrew
Prot D
pvmoutside
Rod Page
Soulkeeper (talk)
Strobilomyces (talk)
Tinm
Tom.Reding
Tommy Kronkvist (talk)
TomT0m
Tubezlob
RaboKarbakian
Circeus
Notified participants of WikiProject Taxonomy --- Jura 08:02, 15 October 2018 (UTC)

  • I added "taxonomy" to the section header above and pinged the relevant WikiProject. BTW it's not really specific to the field that links to Wikipedias don't follow Wikidata's structure. Merging items to connect random elements isn't really a good idea. --- Jura 08:01, 15 October 2018 (UTC)
    • I understand that connecting random elements isn't a good idea but my plan is to merge items describing exactly the same taxon, just known under different scientific names. As analogy, look at "Louis XIV of France". He is known under many different names, still only one item describes him. --130.92.255.36 08:20, 15 October 2018 (UTC)
Agree this is currently a major nuisance. As an example, the species formerly known as Madanga ruficollis (Q178830) was recently reclassified as Anthus ruficollis, but I know if I change the taxon name (P225) to that (apparently sacrosanct and "unchangeable", but nothing to indicate it, nor to make the change impossible), someone will just revert it. Currently, as far as I can see, it has to wait for someone to create a new item for it here, and transfer all the links across to it - very cumbersome! - MPF (talk) 08:41, 15 October 2018 (UTC)
  • In items here about taxa, the taxon name is leading. If an author publishes a paper to suggest that a species should be placed into a different genus - say that Madanga ruficollis has to be placed in the genus Anthus like in the example above - it is a misunderstanding that the earlier name has been rejected or disappeared. First of all authors may have different opinions on the issue. But also the earlier name has been published (often many times, for instance on a Wikipedia). You will see that taxonomy publications on Anthus ruficollis will mention the earlier name Madanga ruficollis. The most correct way on Wikidata is to make a new item on the new name and keep the old one - linking them by taxon synonym (P1420). Of course the confusion is understandable, since the plants or animals at hand do not change by placement in a different genus. One might think that just changing taxon name (P225) would be sufficient. However, the placement of a species in a different genus reflects more than just renaming, it shows new insights and/or opinions in the placement of the species in the tree of life in relation to other species. The header "concept centric vs. name centric" hence is not reflecting correctly what is going on as the name reflects the concept and the taxonomical concept has indeed changed. For convenience reasons, sitelinks may be collected at one item. Lymantria (talk) 09:22, 15 October 2018 (UTC)
  • True, it's just that the procedures for doing all this at Wikidata are so obscure and impenetrable. And if changing P225 is not the way to do it, why is this not locked to make it impossible to do accidentally? Is there really no easier way? - MPF (talk) 09:45, 15 October 2018 (UTC)
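The name-centric model described above (one item per name, linked by taxon synonym (P1420)) still lets a data consumer collect everything about one organism. The Python sketch below groups items that are connected by synonym links and merges their sitelinks. It is an illustration under assumptions: links are treated as symmetric, the item ID `Q_HYPOTHETICAL_ANTHUS` stands in for a not-yet-created Anthus ruficollis item, and the sitelink values are made up.

```python
from typing import Dict, Iterable, List, Set, Tuple


def connected_groups(item_ids: Iterable[str],
                     synonym_links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group item IDs that are connected through synonym links."""
    neighbours: Dict[str, Set[str]] = {item_id: set() for item_id in item_ids}
    for a, b in synonym_links:
        # Treat each link as undirected (an assumption of this sketch).
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in neighbours:
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:  # depth-first walk over one connected component
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


def sitelinks_per_group(sitelinks: Dict[str, List[str]],
                        synonym_links: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Merge the sitelinks of all items in each synonym group."""
    return [
        sorted(link for item in group for link in sitelinks.get(item, []))
        for group in connected_groups(sitelinks, synonym_links)
    ]


if __name__ == "__main__":
    sitelinks = {
        "Q178830": ["enwiki"],                # Madanga ruficollis
        "Q_HYPOTHETICAL_ANTHUS": ["svwiki"],  # placeholder for Anthus ruficollis
    }
    links = [("Q178830", "Q_HYPOTHETICAL_ANTHUS")]
    print(sitelinks_per_group(sitelinks, links))
```

This only helps consumers who run such a traversal themselves; the interlanguage links shown on Wikipedia still depend on which single item holds the sitelinks.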


  • I fully agree with the initial comment in this section and in fact because there is no agreed solution to this problem Wikidata is no longer of much interest to me. I am mostly concerned about fungi, which are normally known by their scientific names (at least in English), and those names are currently undergoing enormous numbers of changes with new genera constantly appearing. For these homotypic synonyms there is no controversy at all that the species are exactly equivalent (it is true that there also exist cases where there are conflicting views as to the exact equivalence and those cases are more complicated to handle, but they are much less frequent). We need Wikidata items for all the important synonyms, but one of the items should be selected to mean the real organism; the other items should be restricted to taxonomical information. The special "Wikidata" item should contain the wikilinks and all properties which belong to the organism (and which do not need to be duplicated for all the items). We should avoid language which implies that the selected "Wikidata" name is the right one and that the other names are wrong. Almost the only thing that Wikidata does for the Wikipedia projects at present is provide the interlanguage links, but the various language versions may naturally happen to use different names for the same species, so unless one special "Wikidata" item is chosen for all, the various language pages will not be correctly linked together.
I think the title "concept centric vs. name centric" reflects well the main issue here. It is true that the exact definitions of the various names are not identical, but I think the place to document that is in the Wikipedia article, of which there should probably be only one. All the descriptions are attempts to define the same actual species which exists in nature and if there are conflicting definitions, and someone wants to include that information in Wikidata, I think that it should be documented by using references and author attributes to multiple properties on the same main organism item.
I made a detailed proposal for a solution in the discussion of the synonym property, but unfortunately it was not agreed. I proposed that only the special item representing the organism should have instance of (P31) = taxon and the others should have instance of (P31) = synonym, but some other method could also be used. In difficult cases multiple special "Wikidata" items could be allowed. Some objections were raised but I think they have solutions and anyway it is a very high priority that Wikidata should be able to have selected organism-level items. Otherwise I think the Wikidata data model is failing to support "tree of life" information satisfactorily. Strobilomyces (talk) 10:00, 15 October 2018 (UTC)
    1. This is not really the proper venue for this discussion, which should be at Wikiproject Taxonomy.
    2. Without a detailed case-by-case study it is not really possible to tell if something is the same taxon. Taxa are not necessarily stable, and may be highly dynamic. For example, to most authors Pentaphylacaceae consists of one species, but there are those who consider it to comprise some five hundred species. In most cases, the only known way to track taxa is by scientific name (or sometimes clade names); there is, as yet, no way to track taxa by other means. So there would be nothing to base items on (and once it does become possible to have identifiers for all taxa ever recognized, it likely will prove that these are very, very many).
    3. It is easy to track scientific names. One name, one item makes it easy to gather database identifiers and references referring to that name. Data retrieval is also easy, as in the enwiki taxonbar (pointed out above).
    4. Something of a problem is the placement of the sitelinks. All the sitelinks of homotypic names can be placed together in one item (so as to have them linked), as long as not too much value is attached to what item they are in. Various other solutions have been proposed, but as yet nobody supported an approach suggested by any other user.
    5. Another problem is that from time to time, there are users who feel that there is, or should be, a single accepted name for each and every taxon. Besides not being in the spirit of the WMF, there is the slight problem that these users do not necessarily agree with each other, or with most of the world literature. - Brya (talk) 10:55, 15 October 2018 (UTC)
  • @Brya: Thanks for your answer.
1. I would certainly be happy to have the discussion at Wikiproject Taxonomy.
2. In general it can be difficult to tell if two taxa are synonyms or not, but in many cases there is no doubt, for instance the fungus Marasmius alliaceus now has a new name Mycetinis alliaceus and it is absolutely clear that the two terms refer to exactly the same mushrooms, since the change was at the genus level. There are many hundreds of cases like that in the fungi and it would be useful to have a solution for those simple cases. I suppose that if there is doubt it may be necessary to keep the names separate but I think that the claim of synonymy should be with a reference and so if it is sometimes wrong, that does not mean Wikidata is wrong. Taxonomic databases often give synonym information and that is what I propose we should use to determine if taxa are the same. There will be difficult cases, but it is normal to have to take that sort of decision in this context. Wikipedia pages can sometimes cover multiple closely related groups and we should not need a separate taxon item in Wikidata unless a separate Wikipedia page is necessary. It would be an immense service to users to provide items for particular organisms even if the meaning of the organism is a bit vague and it is necessary to read the Wikipedia article to understand exactly what the various names mean.
If I understand you right, your example is at the family level; in the past family Pentaphylacaceae consisted of only one species, but DNA work has shown that that species belongs in the clade of family Ternstroemiaceae and due to the priority rules the name of the combined family has to be "Pentaphylacaceae", so it has gained 500 species. By the way, the author string of the family does not change in this process! Well, I think my proposal needs to apply mostly at the species level, which I think is the "organism" level. Perhaps I should not try to apply it at the family level.
3. It is true that because of the strict nomenclature rules, names are easier to track than taxa, but it is a terrible disadvantage if we don't have Wikidata items which correspond to the real species (and perhaps other levels). I don't think that we should give up and I think we can track the taxa using the Wikipedias, once they are covered. I think it is OK if initially Wikidata is loaded with thousands of items at the name level from external databases, but in those cases where the organisms get articles in language Wikipedias or categories in Wikimedia Commons we should improve the quality of the data. Then I think that synonyms should be linked together and one "Wikidata" item should be selected, so that we will have an item at the organism level.
4. I am very glad to see you say "All the sitelinks of homotypic names can be placed together in one item". I have done this in certain cases, but I was worried that I was breaking some rule. I also agree that not much value should be attached to the question of which item they are in. If there are non-taxonomic properties (which belong to the organism), they should also be added in the item with the sitelinks. In fact this means that where an organism has pages in at least two Wikipedias, we do actually have a selected organism-level item. If this could just be agreed as a formal principle, I would be much happier.
5. Yes, it is a problem if users are intransigent about pursuing their particular taxonomy, and I think that is one reason why we need clear rules about how to deal with divergent taxonomies. Strobilomyces (talk) 15:11, 15 October 2018 (UTC)

A pointer to this discussion has been posted at Wikiproject Taxonomy; and members of the project have been pinged here. This venue is fine.

If all the sitelinks of homotypic names are to be placed together in one item, then that item would need to have all those names as aliases; there has previously been strong resistance to using such aliases. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:40, 15 October 2018 (UTC)

That would only be the case if the intent is to cause maximum confusion. - Brya (talk) 17:00, 15 October 2018 (UTC)
  • Using the number of Wikipedia sitelinks as a primary criterion to define items is probably the most complicated approach. For Wikipedias, it doesn't really matter across how many items the information is spread: they can easily load from hundreds of structured items. Anyway, from an external standpoint, it doesn't really surprise me that a field called "-nomy" is name-centric. --- Jura 17:18, 15 October 2018 (UTC)
- Using the number of Wikipedia sitelinks is the only method we have at present, but I would like to see an agreed property used instead. I think you are grossly optimistic about how easy it is to get software implemented. The problem is not the difficulty of writing the software or the load on the system, it is the necessity to (1) define clearly and agree the required data model and specification, (2) persuade the sitelinks software team that these special rules for taxonomy items are important enough to devote resources, and (3) publish the rules clearly and win acceptance so that they will be taken into account in other less basic software such as Lua scripts. Strobilomyces (talk) 11:52, 17 October 2018 (UTC)
- Thanks for notifying Wikiproject Taxonomy, by the way. Strobilomyces (talk) 11:57, 17 October 2018 (UTC)

The Color of the Bikeshed

I had the same problem, actually, with magazines. Journals, if that is a more respectable name for ya. Bird-lore became Audubon Mag, or some such. A big difference between the two is the license to use them.

The early "find and identify" taxonomy, probably uses whatever reference material they had with them at the time. "Scientific Society" notes and inclusions, maybe a book; samples brought back home and identified with whatever reference material they had available.

150 years later, I find a species name here. None of the cited taxon groups within the species and none of the wikipedias use the name. I separated them, the interesting part was "who is calling this that". It got merged again and the one site that called it by the merge name was left off (another was found or created since then). I was treated as if I was the idiot.

Wikidata is not a taxon identifier authority. If there are two news agencies reporting a thing and they both claim that a different thing happened, this is not the place to determine the correctness of one or the other. So it is with taxon names.

If a user of the data doesn't like the complicatedness of the data, the user should be instructed, then, to go fix the "science" as it has been practiced or to find something less complicated to be involved in. I would like to call this my opinion, but it is not even that! It is the fact of taxonomy.--RaboKarbakian (talk) 16:30, 15 October 2018 (UTC)

User:RaboKarbakian - as far as I can tell you created an item for the same name (two items for one name), rather than an item for the homotypic synonym. [ Brya (talk) 17:00, 15 October 2018 (UTC) (- split comment)]
User:RaboKarbakian - I think that Wikidata should try to provide identifiers for particular species, and (less importantly) for taxa at other levels. Authors of Wikipedia pages have to confront this problem in deciding how many pages to create and selecting titles, and the definitions of Wikidata items depend on Wikipedia pages if those exist. The projects are doing a big service if they can accomplish that and it is very unhelpful to say that each user should have to review the entire taxonomic history to find the right item; only a few specialists should need to do that and the differences between the homotypic synonyms are unimportant for most purposes. Strobilomyces (talk) 12:59, 17 October 2018 (UTC)
I think that what you are suggesting is more of this: https://www.wikidata.org/w/index.php?title=Q5175878&type=revision&diff=759332100&oldid=759325089 (Which was an impressively ill informed waste of my time, and the time of others as well) Please, look at the links there to confirm. Wikimedia is actually able (by license) to host the documents that named and used the older takes on TOL and names of living things. It is far more interesting and honest to just report it as they did it. Without spending a lot of time discussing it and by spending more time working on it, all the species names can be documented throughout history with the papers and even the "vulgare" or "romantic" (old definitions to be used here) writings. To give this method of documentation the same limitations as an encyclopedia -- why even bother with this as the wikipedias are there doing what you want already?--RaboKarbakian (talk) 02:51, 18 October 2018 (UTC)
The "stated as: Leptinella plumosa" is superfluous in an item that deals with Leptinella plumosa, as only papers and database with the name "stated as: Leptinella plumosa" are included in that item. - Brya (talk) 03:16, 18 October 2018 (UTC)

homotypic synonyms

User:Strobilomyces - putting all homotypic synonyms in one item superficially looks nice (and it might have worked if the basic structure of Wikidata had been different), but it will either lose information or be frighteningly complicated, with every statement accompanied by qualifiers indicating to which name it applies. This may even be the case for something as basic as taxon rank; no reason why there may not be homotypic synonyms in three or more different ranks. It will be terribly awkward to edit or to read.
        Properties of taxa should preferably be referenced and placed with the name in the reference. What applies to one particular circumscription may not apply to a different circumscription. And many different circumscriptions for one taxon is not limited to the family-level. What is one species to one taxonomist may well be hundreds of species to another taxonomist. A notorious case is the dandelion.
        And of course we would need new properties and a new taxobox module.
        Maybe somebody will come up with a piece of software to link sitelinks in different items (the Bonnie and Clyde issue is still there as well). - Brya (talk) 17:00, 15 October 2018 (UTC)
User:Brya I am not suggesting putting all homotypic synonyms in one item, I am suggesting being allowed to mark one of them as special and to use that special one for sitelinks and for other properties which belong to the organism. Indeed there are homotypic synonyms at different ranks, that is OK. Normally properties of taxa are the same for all homotypic synonyms; they tend to be very general (example: has fruit type (P4000)). If the circumscriptions have different organism properties, I think they are not homotypic and at least that is a very special case which does not occur in the present Wikidata and which could be treated exceptionally. By giving priority to these obscure cases we are losing the opportunity to have an organism-level item in Wikidata, which could really add value to the project. I don't think that the dandelion mini-species change anything here; we can represent alternative taxonomies using references, qualifiers etc. and the same decisions have to be made with or without this proposal.
The taxobox module refers to name-related information and I actually doubt that any change to that is needed. But we would need new properties. I would like to put forward another simplified version of my proposal:
  1. The current name-based data structure would stay as it is and we would only add information to it.
  2. It would be allowed to add the new property "organism item" to taxonomy items to indicate a chosen item for sitelinks. I am not proposing a change to the software, but in time we should update the data so that those items with sitelinks have the new property and where homotypic synonyms (indicated by taxon synonym (P1420) and instance of (P31) with of (P642)) have different sitelinks, one item should be chosen and the sitelink information merged. Perhaps there could be exceptions, but I think if the Wikipedia pages are not equivalent, the items should not be homotypic synonyms. The "organism item" could be useful for other purposes, such as keeping in one place information which is independent of name. Perhaps another method of labelling the special item would be better, for instance it could be an attribute of instance of (P31) = taxon.
  3. A restriction would be that Wikipedia pages should only link to items marked as "organism items". This would not be enforced by software, at least at first, but it would solve the problem that sitelinks for the same organism can be scattered amongst different names.
  4. Also a new property "Authority for current name" should be added to allow indication (with a reference) that one of the items is the "real" current name. Conflicting views could easily be accommodated. This is a somewhat different issue, but it would make clear that the current name may be different from the "organism item".
This proposal would solve the sitelinks problem and allow organism-level data to be added without losing any information. In order to be useful I think it would just need a formal proposition and a consensus that this is a useful addition to the taxonomy part of Wikidata. Strobilomyces (talk) 19:21, 15 October 2018 (UTC)
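Point 3 of the proposal above (sitelinks only on items marked as "organism items") could be checked mechanically. What follows is only a rough sketch of such a check in Python: the "organism item" marker does not exist as a property, the dictionary layout is invented for illustration, and the sitelinks shown are made-up toy data rather than the real state of any item.

```python
# Hypothetical toy data: "organism_item" stands in for the proposed (non-existent)
# marker, and the sitelink lists are invented for illustration only.
items = {
    "Mycetinis alliaceus": {"organism_item": True, "sitelinks": ["enwiki"]},
    "Marasmius alliaceus": {"organism_item": False, "sitelinks": ["dewiki"]},
}


def sitelink_violations(items):
    """Return the names of items that carry sitelinks without being
    marked as the chosen organism item (rule 3 of the proposal)."""
    return sorted(
        name
        for name, data in items.items()
        if data["sitelinks"] and not data["organism_item"]
    )


print(sitelink_violations(items))
```

A report like this would only list candidates for a human to review; it says nothing about which synonym is the current name.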
  1. This is a quite different solution than proposed by others in this thread.
  2. This proposal would not so much solve the sitelinks problem, as move the problem to a different level. It requires a "consensus taxonomy" which does not exist in reality. - Brya (talk) 03:22, 16 October 2018 (UTC)
I don't agree that this requires a consensus taxonomy. For each set of homotypic synonyms it requires one to be selected, but there does not need to be any consensus that the one selected is the true current one. It needs consensus that the given names are really synonyms, but in many cases I think that is not a problem. The sitelinks are still useful and important even if the organisms of the articles they link are technically slightly different in some way. If there is doubt, the items can be left separate (i.e. not linked by synonyms to provide a unique item for sitelinks). Strobilomyces (talk) 14:33, 17 October 2018 (UTC)
If one is selected more or less at random, what is the difference with the current situation? Also, at the moment every possible synonym-relationship can be expressed in principle. It would probably be better if we had a "is a synonym of taxon" property and a "is homotypic with" property. - Brya (talk) 17:05, 17 October 2018 (UTC)
The difference with the current situation is that it would be an agreed policy which users could take advantage of and build on. It should be recognized as an error if someone adds sitelinks to a homotypic synonym of an item with sitelinks. In future software could be added to enforce it (perhaps with exceptions if necessary), which would not be possible without an agreement. There should be some formal way of marking which arbitrary item from the set of synonyms has been selected (just using the sitelinks is unsatisfactory; it only works if there happen to be sitelinks and does not look official). I am not sure what is the best way to mark the selected synonym; it is true that we could use the existing properties taxon synonym (P1420) and instance of (P31) with of (P642) to indicate this, but this has been rejected in the past because it is considered to be taking a POV as to the correct current taxonomy.
I agree with the "is a synonym of taxon" property (very like taxon synonym (P1420)) and the "is homotypic with" property. Also I think we need a "current name according to" property to show which synonym is considered to be the current one according to the given authority. This would allow conflicting taxonomies to be documented and would emphasize that the selected synonym for sitelinks is not necessarily the true current name. Strobilomyces (talk) 16:40, 18 October 2018 (UTC)
Ah, an explicit marker that does not mean anything?
  • The "is a synonym of taxon" property would be the inverse of "taxon synonym (P1420)".
  • We do have a "current name according to" property: it is "taxon name", which can be referenced.
- Brya (talk) 17:33, 18 October 2018 (UTC)
„For each set of homotypic synonyms it requires one to be selected“. WD is not a taxon authority. What we try to do is follow and model different taxonomic points of view (aka concepts) along with references. --Succu (talk) 21:13, 18 October 2018 (UTC)

Separate names from concepts?

Given the number of others who have jumped in... Perhaps it would simplify things for most purposes if the taxonomic name items were completely divorced from the items representing classes (i.e. any collective grouping) of organisms? We do have a brand new Lexeme namespace specifically designed to hold information about words and short phrases, their origins, and their meanings (linking those meanings to regular Q items where appropriate). Probably we're not ready to wholesale move those millions of Q items over to corresponding L entries, but supporting the distinction between words and their meanings seems at least a sensible start: let's create independent Q items, not in the "taxon" hierarchy, for the conceptual species, genera, families, etc at least where there are corresponding wikipedia pages, and just model them separately. ArthurPSmith (talk) 18:38, 15 October 2018 (UTC)

  • It might work for Common names (currently strings on some items), but I don't think it would solve it for the actual items. You'd still need to name them ;) --- Jura 18:44, 15 October 2018 (UTC)
  • I agree that a complete set of organism items separate from the name items would solve the theoretical problems nicely and achieve the modelling aims. But I have always assumed that this would not be acceptable as it would look like duplication, be confusing and difficult to explain, increase the amount of work in updating the data, and be a big change to the current system. But in my opinion the current system is not fit for purpose - we need taxon-level information. But we would have to have a very robust consensus in order to go in that direction. Strobilomyces (talk) 19:37, 15 October 2018 (UTC)
    • @Strobilomyces: One of the reasons I suggest it is that the taxon hierarchy has always been out of sync with the way we manage basic class membership relationships in the rest of wikidata via instance of (P31) and subclass of (P279) statements. The duplication issue I think can be explained simply by looking at those class relationship statements on the items - it's not uncommon in the rest of wikidata to have multiple items with the same name, distinguished mainly by their position in the class hierarchy. ArthurPSmith (talk) 20:40, 15 October 2018 (UTC)
  • I agree with the statements of several others above. It would in some ways be best if only the currently valid name (zoological def) had a page and synonyms were referred to it. However, that may work at Wikipedia and it's the preferred method at Wikispecies also. But Wikidata is attempting to database all terms. As stated earlier, any synonym is still an available name (again zoological def); it has not been disposed of in any way and can at any time be resurrected. However, some way of linking the synonyms would be a good idea; it would also be useful to identify if a name is the currently accepted name or combination. Cheers Scott Thomson (Faendalimas) talk 21:03, 15 October 2018 (UTC)
  • A separate name-space for taxon names is interesting in theory. It would presumably work well for species of birds. It may work for taxa where Wikipedias have genuine pages for taxa, that is at some length and in some detail. However, there are hundreds of thousands Wikipedia pages that don't have genuine information, but are based solely on the scientific name and taxonomic position ... - Brya (talk) 02:59, 16 October 2018 (UTC)
  • I have the impression that taxon items are supposed to represent the work of classification by a scientist, not the actual organisms involved. Since they are only "instance of taxon", and not subclasses of anything, they can't have instances. That means you need other items to represent that actual organisms, e.g., human (Q5) for humans. When you have an item like Wolf of Ansbach (Q39019), it presumably needs to be an instance of something other than Canis lupus (Q18498), which gives a constraint violation. A new "wolf" item, perhaps, or just assign it to animal (Q729), which does have a subclass? Ghouston (talk) 23:03, 16 October 2018 (UTC)
    @Ghouston: parent taxon (P171) is a subproperty of (P1647) subclass of (P279). --Yair rand (talk) 23:41, 16 October 2018 (UTC)
    Thanks, subproperties of the subclass property is a new one to me. The constraints don't know about it either. Ghouston (talk) 00:27, 17 October 2018 (UTC)
    I've adjusted the constraint on instance of (P31).--99of9 (talk) 01:50, 17 October 2018 (UTC)
    There already is a "wolf item", at Q3711329. - Brya (talk) 05:16, 17 October 2018 (UTC)
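The point about parent taxon (P171) being a subproperty of subclass of (P279) can be illustrated with a small sketch. This is not how the actual constraint system is implemented; it is a toy Python model, with labels instead of item IDs and an invented claims layout, showing why a check that follows subproperties treats a taxon item as a class while a check that looks only for P279 does not.

```python
# Subproperty relation as stated in the thread: P171 (parent taxon) is a
# subproperty of P279 (subclass of). Everything else here is toy data.
SUBPROPERTY_OF = {"P171": "P279"}

# Hypothetical in-memory claims: item label -> list of (property, value label).
claims = {
    "Canis lupus": [("P171", "Canis")],
    "Wolf of Ansbach": [("P31", "Canis lupus")],
}


def is_subproperty(prop, target):
    """True if prop equals target or reaches it via the subproperty chain."""
    while prop is not None:
        if prop == target:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def is_class(item, claims):
    """An item counts as a class here if any of its statements uses
    P279 or one of its subproperties."""
    return any(is_subproperty(prop, "P279") for prop, _ in claims.get(item, []))


print(is_class("Canis lupus", claims))      # taxon with a parent taxon
print(is_class("Wolf of Ansbach", claims))  # individual animal
```

Under this reading, an individual such as Wolf of Ansbach can be an instance of the taxon item without a separate "wolf" class being needed for the check to pass.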

Conflated ranks

Utatsusaurus hataii (Q3053716) seems to conflate a genus and species. What's the best way to disentangle them? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:15, 16 October 2018 (UTC)

Leave it alone. Most fossil taxa are "monotypic". We are missing a lot of fossil type species. --Succu (talk) 21:58, 16 October 2018 (UTC)
Why would we "leave alone" an item which clearly conflates two distinct topics? What do other missing items have to do with this issue? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:45, 16 October 2018 (UTC)
@Pigsonthewing: It would seem to me that the genus and species for a monotypic genus both refer to the same thing. What am I missing? How are they different topics? - Jmabel (talk) 03:28, 17 October 2018 (UTC)
So what's Utatsusaurus (Q20672904)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 17 October 2018 (UTC)
@Pigsonthewing: I'm literally not sure I understand your question, assuming it was addressed to me. I have no specialized knowledge to answer some of the possible meanings of the question; what exactly is in doubt? - Jmabel (talk) 15:28, 17 October 2018 (UTC)
You asserted, AIUI, that it is correct that Q3053716 is about both the species and the genus, because they are "the same thing" and are not "different topics"; yet Q20672904 is about the genus alone. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:09, 17 October 2018 (UTC)
What I'm saying is that the things in the world that are in the genus are exactly the same as the things in the world that are in the (unique) species for that genus. Both concepts describe the same things. - Jmabel (talk) 22:08, 17 October 2018 (UTC)
Actually they are not quite the same thing. However, the problem is not here but at the Wikipedias, so unless Wikipedias split these, this is the straightforward way to handle them. - Brya (talk) 05:12, 17 October 2018 (UTC)
A genus and a species are not „the same thing”. You need a species to „define” a genus. --Succu (talk) 06:12, 17 October 2018 (UTC)
Agree they are not the same thing: a genus is a group of species more similar to each other than to anything else. A species is a group of populations capable of successfully interbreeding that are more similar to each other than to anything else. Being monotypic does not change the definition, as it also includes the hypothetical unknown relatives in the genus, i.e. undiscovered fossil history that can be inferred from relationships, also known as ghost lineages. Cheers Scott Thomson (Faendalimas) talk 06:28, 17 October 2018 (UTC)

I see that Brya has made some changes, which go part way towards untangling (but do not fully untangle) the two subjects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:22, 17 October 2018 (UTC)

I did not "untangle" anything. I just adjusted the label; labels tend to be fairly meaningless (there are lots of wrong labels out there, as many users put in the title of the Wikipedia page rather than the topic of the item). And I moved it to fossil. - Brya (talk) 03:22, 18 October 2018 (UTC)

Amsterdam

Why are there two entries for Amsterdam in the Netherlands? There is no difference between Amsterdam as a capital and as a municipality. Both refer to the same administrative and legal body. It is also confusing, as both now contain the same properties, like population.

Amsterdam (Q727) capital and largest city of the Netherlands

Amsterdam (Q9899) municipality in the Netherlands, containing the city of Amsterdam  – The preceding unsigned comment was added by Historazor (talk • contribs) at 12:12, 15 October 2018‎ (UTC).

Not so. The municipality of Amsterdam has a couple of populated places apart from the city, for instance Holysloot (Q672125) and Ruigoord (Q3252907). Lymantria (talk) 13:42, 15 October 2018 (UTC)
And one is shown as located within the other. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:55, 15 October 2018 (UTC)
But even if they were coextensive with (P3403) to each other, a municipality and a city are not the same. Lymantria (talk) 19:20, 15 October 2018 (UTC)
"Municipality" also seems to be ambiguous: it can refer to either a territory administered by a local government body, or by the body itself. For some places in Wikidata these have separate items, e.g., London Borough of Camden (Q202088) and Camden London Borough Council (Q5025790), and other places they don't. It's not really clear if Amsterdam (Q9899) refers to only the administrative body, or the territory as well. Ghouston (talk) 04:11, 16 October 2018 (UTC)
Both. Lymantria (talk) 06:40, 16 October 2018 (UTC)

Wikidata weekly summary #334

Query Lexemes in the Query Service

Hello all,

Graph of Lexemes derived from L2087

I’m very happy to announce that another important feature for Lexicographical Data has been deployed: the ability to query Lexemes in the Query Service.

Here are a few examples:

The queries are based on the RDF mapping that you can find here. Feel free to help improving the documentation, so people can understand how to build queries out of Lexemes.

Thank you very much to Tpt who’s been doing a huge part of the work by mapping Lexemes in RDF, and Smalyshev (WMF) who made the RDF dumps available and integrated in the Query Service.

Feel free to play with it, bring some of these ideas of queries to life, and let us know if you find any issue or bug. These can be stored as subtasks of this one on Phabricator.

If you have questions about Lexicographical Data in general, feel free to write on the talk page of the project. If you have specific questions about the integration in the Query Service, you can also ping Stas onwiki or on IRC.

Cheers, Lea Lacroix (WMDE) (talk) 08:06, 16 October 2018 (UTC)
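For readers who want to try this from a script, here is a minimal sketch in Python that builds a Lexeme query and the corresponding request URL. It assumes the predicates ontolex:LexicalEntry, dct:language and wikibase:lemma from the RDF mapping mentioned above, the public endpoint https://query.wikidata.org/sparql, and English (Q1860) as the example language; the function names are made up for this illustration and the query is not one of the announcement's own examples.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint of the public Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Prefixes are spelled out so the query does not rely on predefined ones.
PREFIXES = """PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
"""


def lexeme_query(language_qid, limit=10):
    """Build a SPARQL query listing lexemes and lemmas for one language item."""
    return (
        PREFIXES
        + "SELECT ?lexeme ?lemma WHERE {\n"
        + "  ?lexeme a ontolex:LexicalEntry ;\n"
        + f"          dct:language wd:{language_qid} ;\n"
        + "          wikibase:lemma ?lemma .\n"
        + f"}} LIMIT {int(limit)}"
    )


def query_url(sparql):
    """Encode the query as a GET URL asking for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


print(query_url(lexeme_query("Q1860", limit=5)))
```

The printed URL can be opened in a browser or fetched with any HTTP client; the sketch deliberately stops short of making the network call.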

Problems with sports league level (P3983)

MisterSynergy Thierry Caro &beer&love Vanbasten_23 Malore Notified participants of WikiProject Sports

I noted two problems about this property:

@Xaris333, Fundriver, Kooma: as main contributors --- Jura 17:28, 15 October 2018 (UTC)
Like Premier League (Q9448): sport league system = Q18559, with qualifier sports league level (P3983) = 1? Xaris333 (talk) 21:04, 15 October 2018 (UTC)
@Xaris333:Yes, like that.--Malore (talk) 22:31, 15 October 2018 (UTC)
Hello,
Well, first off: the German description talks about sports in general; I think you should just correct the English description and you are all fine. Also, the English description doesn't really fit the name of the property.
To the suggestion: I wouldn't do this, because historically the sports league level (P3983) of a specific league has often changed because of the creation of a new league, and therefore it should be possible to add a start and end date to the league level (even though this hasn't happened yet, I think; but I could name some leagues where it changed and where it would be appropriate to set a start and end date on the property).
Best regards,
Fundriver (talk) 13:35, 16 October 2018 (UTC)
@Fundriver: It's possible to have two different statements that differ only by qualifiers, something like this:
sport league system: imaginary league system
    sports league level: 1
    start time: 1970
    end time: 1980
sport league system: imaginary league system
    sports league level: 2
    start time: 1980
    end time: 1990
However, maybe it's better to have two separate properties.--Malore (talk) 15:12, 16 October 2018 (UTC)
Yes, I know - it would be possible. But I think the sports league level (P3983) is a main element to describe a sports league. Otherwise you basically say that your league was 4x in the same sport league system, only with different qualifiers. Fundriver (talk) 17:19, 16 October 2018 (UTC)
Ok, you're right. I proposed league system property.--Malore (talk) 23:45, 16 October 2018 (UTC)
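The qualifier-based modelling discussed in this thread (two statements with the same value that differ only by level, start time and end time) can be sketched as a small data structure. This is only an illustration in Python: the class and function names are invented, "sport league system" is the proposed property rather than an existing one, and treating the end year as exclusive is an assumption made so that 1980 resolves to a single level.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LeagueSystemStatement:
    system: str  # value of the proposed "sport league system" property
    level: int   # qualifier: sports league level (P3983)
    start: int   # qualifier: start time (year)
    end: int     # qualifier: end time (year), treated as exclusive (assumption)


# The two example statements from the thread.
statements = [
    LeagueSystemStatement("imaginary league system", 1, 1970, 1980),
    LeagueSystemStatement("imaginary league system", 2, 1980, 1990),
]


def level_in_year(stmts: List[LeagueSystemStatement], year: int) -> Optional[int]:
    """Return the league level valid in the given year, or None if no
    statement covers that year."""
    for s in stmts:
        if s.start <= year < s.end:
            return s.level
    return None


print(level_in_year(statements, 1975))
print(level_in_year(statements, 1985))
```

The sketch also shows Fundriver's concern: a consumer has to read the qualifiers of every statement to answer a question as basic as "which level is this league".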

Number of peoples or animals among a nationality / population / breed

Hello. Sorry for my English. Is there a property for the number of "objects" among a population (animals or humans)? As an example, for French people (Q121842), we don't have any estimate of the number of French people in the world. This could be very useful (and should also be allowed for animals, like Q588252). Thanks. --Tsaag Valren (talk) 09:31, 16 October 2018 (UTC)

@Tsaag Valren: We have quantity (P1114), but I'm not sure it's the right property for this. --Yair rand (talk) 20:50, 16 October 2018 (UTC)

Languages actually used in labels and titles / translations of labels and titles

While the language of a label or title must be specified, the actual language may differ:

  • Bend It Like Beckham (Q369492) The German title of that British-Indian movie is Kick it like Beckham, which is English (German distributors like the appeal of English as a modern/hip language, but they want to avoid using the kind of words and phrases that a majority of potential ticket buyers may not understand; so, bend bad, kick good).
  • Die Hand Die Verletzt (Q4162624) This episode of an American television show has a German title, probably also because it sounds cool to American ears.
  1. Is there any way in Wikidata to specify the actual language? I would find it interesting to figure out how often and which different languages are being used, and in what countries this is more prevalent.
  2. Is there a way to specify the literal meaning of a title? That television episode title above means The hand that wounds (naturally, the German title of the episode is Satan). Downsides: many translations (N*N for N languages), translations would be mostly original research.
  3. Unrelated Wiki syntax question: Any way to not have this second (numbered) list be an indented part of the second bullet item of the first list?

If these things are not possible yet, do you have an opinion on whether they might be something worth supporting in the future? --109.91.87.44 11:49, 16 October 2018 (UTC)

For literal meanings, see literal translation (P2441). - Nikki (talk) 16:16, 16 October 2018 (UTC)
Consider also applies to jurisdiction (P1001). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:17, 16 October 2018 (UTC)
Re syntax: You can use *#, if there are no intervening line breaks. --Yair rand (talk) 21:21, 16 October 2018 (UTC)

For point #1, the title (P1476) statement usually specifies a language, and you can also add language of work or name (P407). Although, language of work or name (P407) may be hard to interpret in that case, since the original name is in one language but most of the dialogue in another. Then you've also got native label (P1705), since there may be multiple title statements and who knows which would be the original. But it gets confusing. Ghouston (talk) 22:56, 16 October 2018 (UTC)

Thesis data in the Edinburgh Research Archive

Hi, following this discussion, we are in a position to begin the mass import of thesis data from the Edinburgh Research Archive. In a nutshell:

  1. ChaoticReality has developed a utility to import the thesis record metadata from ALMA catalogue entries for the Edinburgh Research Archive’s theses.
  2. Can we create and delete test Wikidata items to see that the QS export is working correctly?
  3. Following this testing phase, can we then place a bot request for the mass import, given the size of the sample we are talking about from the Edinburgh Research Archive? Stinglehammer (talk) 12:03, 16 October 2018 (UTC)

JSON-LD now on beta

Hello all,

We’re planning to add JSON-LD as a serialization format for Wikidata. This will allow for example an easier access to RDF data from Javascript.

This is now deployed on https://wikidata.beta.wmflabs.org. Example: https://wikidata.beta.wmflabs.org/wiki/Special:EntityData/Q64.jsonld

As with the other formats we already support (like turtle or rdf/xml), content negotiation is used if the format is not indicated by a suffix like .jsonld. The MIME type that can be used in the Accept header to request JSON-LD output is application/ld+json.

If you’re interested in this feature, please test it, and let us know if you find any issues. The related ticket is this one. If everything goes as planned, we will enable it on wikidata.org on October 31st.

Cheers, Lea Lacroix (WMDE) (talk) 13:05, 16 October 2018 (UTC)

Also pinging Cscott and Multichill who were involved in the previous discussions, and Maxlath who's working a lot with Javascript :) Lea Lacroix (WMDE) (talk) 13:05, 16 October 2018 (UTC)
Is there a description, comparable to mediawikiwiki:Wikibase/DataModel/JSON? Jc3s5h (talk) 13:59, 16 October 2018 (UTC)
I believe that mediawikiwiki:Wikibase/DataModel/ is the appropriate description, since the JSON-LD format is a direct representation of the underlying datamodel, as expressed in the existing turtle/RDFa/n-quads representations, and they don't appear to have separate description subpages. The "old" JSON format is an ad-hoc format with no standardized semantics, which is why it needed its own description. Specs for JSON-LD can be found linked from https://json-ld.org/. See also https://gerrit.wikimedia.org/r/465547 which will add appropriate JSON-LD links from article pages to the JSON-LD format data from wikidata (via content negotiation using the HTTP Accept header). Cscott (talk) 14:49, 16 October 2018 (UTC)
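The content negotiation described in this thread can be tried with a few lines of Python. The sketch below is an illustration only: it uses the beta host and the Special:EntityData path from the announcement, omits the .jsonld suffix so that the Accept header decides the format, and the helper names are made up for this example.

```python
import json
import urllib.request

# MIME type named in the announcement for JSON-LD output.
JSONLD_MIME = "application/ld+json"


def build_entity_request(entity_id, base="https://wikidata.beta.wmflabs.org"):
    """Build a request for an entity without a format suffix, so the server
    chooses the serialization from the Accept header."""
    url = f"{base}/wiki/Special:EntityData/{entity_id}"
    return urllib.request.Request(url, headers={"Accept": JSONLD_MIME})


def fetch_jsonld(entity_id):
    """Send the request and parse the body as JSON (needs network access)."""
    with urllib.request.urlopen(build_entity_request(entity_id)) as resp:
        return json.load(resp)


req = build_entity_request("Q64")
print(req.full_url, req.get_header("Accept"))
```

Calling fetch_jsonld("Q64") should return the same data as the .jsonld URL given in the announcement, provided the beta site is reachable.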

Suggestions based on constraints

Hello all,

A few months ago, we enabled suggestions based on constraint values for the constraints section of a property. We would like to further explore this possibility of having better suggestions for entities, which is why we created a beta feature.

If you go to your Preferences and check the beta features list, you’ll be able to enable "Entity suggestions from constraint definitions".

At the moment there are two constraint types that are used for generating suggestions:

You can learn more on this page.

Feel free to try it, and let us know if you find it relevant and useful.

If you find any issue, feel free to report it in this ticket. Lea Lacroix (WMDE) (talk) 11:58, 17 October 2018 (UTC)

@Lea Lacroix (WMDE): Nice... another thing. I am trying to use the restriction on Property:P778#P2302, and one problem I see is that the world outside Wikidata is not as pure and beautiful as I would like it to be: for 3500 Swedish Church parishes I get a lot of exceptions to my restrictions; see Wikidata:Database_reports/Constraint_violations/P778 (I am still cleaning). Given this experience, I feel an easy way to move things into the exception list would be nice - Salgo60 (talk) 12:54, 17 October 2018 (UTC)
  • At last! Thank you. --- Jura 17:08, 17 October 2018 (UTC)
  • It's probably not a planned feature, but it's possible to set up a "one-of-constraint" that isn't used by the constraint system, but by the suggestions tool only. See Property:P106#P2302. I added the values used by >2% of items. --- Jura 18:58, 17 October 2018 (UTC)
    • @Jura1: By "one-of-constraint" do you mean "one-off-constraint", or something else? - Jmabel (talk) 22:10, 17 October 2018 (UTC)
  • Little bug - when adding a reference to a statement with a filtered list, the stated in (P248) qualifier also shows the filtered list, which is of course nonsense there. Ahoerstemeier (talk) 09:26, 18 October 2018 (UTC)

Plant species sequenced by RAD-SEQ

Hi, I had a question emailed to me from one of our academics about a meta-analysis of evolution in plants. They need to find as many plant species as possible which have been sequenced with RAD-seq (a sequencing method). Upon checking, they found that Web of Science only allows datasets to be downloaded one by one and that NCBI is not terribly comprehensive. Worse, people tend to build their own databases to deposit this data rather than use a central hub like Wikidata. So the question is whether plant species on Wikidata could, in theory, have RAD-seq info added, or whether this would be outwith project scope. Any thoughts, let me know. Stinglehammer (talk) 16:44, 17 October 2018 (UTC)

@Stinglehammer: What form would this data take? Can you give an example? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:01, 17 October 2018 (UTC)
Could you please try to enhance Restriction site associated DNA markers (Q7316324) first? --Succu (talk) 21:33, 17 October 2018 (UTC)

Updating of the formatter url

What's up with formatter URL (P1630) on KMDb documentary ID (P3750)? On the page of the property there's the correct link to https://www.kmdb.or.kr/eng/db/kor/detail/movie/A/$1 but on the item My Heart Is Not Broken Yet (Q11292885) it seems that http://www.kmdb.or.kr/vod/vod_basic.asp?nation=A&p_dataid=$1 is still used. The change is already 4 days old. Did a change introduce a bug that prevents these links from updating? ChristianKl❫ 19:57, 17 October 2018 (UTC)

@ChristianKl: As with other things on item pages, formatter URLs are not updated unless the item itself changes in some way, or you clear it from the cache (load with ?action=purge). ArthurPSmith (talk) 20:23, 17 October 2018 (UTC)
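To illustrate the two pieces involved, here is a rough sketch of how a formatter URL turns an identifier into a link by substituting `$1`, and of how the purge URL for an item page can be put together. Both function names are made up for this example, the plain string substitution is a simplification of whatever the software actually does, and the identifier in the usage note is a placeholder rather than a real KMDb ID.

```python
def expand_formatter_url(formatter_url, identifier):
    """Substitute an external identifier for the $1 placeholder.

    Simplified illustration only: no URL-encoding of the identifier is done.
    """
    return formatter_url.replace("$1", identifier)


def purge_url(item_id):
    """Build the item page URL that clears the cached rendering when loaded."""
    return "https://www.wikidata.org/wiki/" + item_id + "?action=purge"
```

For example, `expand_formatter_url("https://www.kmdb.or.kr/eng/db/kor/detail/movie/A/$1", "00000")` gives the new-style link for the placeholder ID `00000`, and `purge_url("Q11292885")` gives the address to load in order to refresh the item mentioned above.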

Planned RDF ontology prefix change

We are planning to change the prefix and associated URIs in RDF representation for Wikidata from:

PREFIX wikibase: <http://wikiba.se/ontology-beta#>

to:

PREFIX wikibase: <http://wikiba.se/ontology#>

If you are using Wikidata Query Service, you do not have to do anything, as WDQS already is using the new definition.

However, if you consume RDF exports from Wikidata or RDF dumps directly, you will need to change your clients to expect the new URI scheme for Wikibase ontology.
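For clients that need to cope with both old and new exports during the transition, one possible approach is to normalize ontology URIs to the new scheme as they are read. The following is only a sketch under that assumption; the function name `upgrade_ontology_uri` is invented for the example and is not part of Wikibase.

```python
OLD_ONTOLOGY = "http://wikiba.se/ontology-beta#"
NEW_ONTOLOGY = "http://wikiba.se/ontology#"


def upgrade_ontology_uri(uri):
    """Map a URI in the old ontology namespace to the new one.

    URIs outside the old namespace are returned unchanged, so the function
    can be applied to every URI a client encounters.
    """
    if uri.startswith(OLD_ONTOLOGY):
        return NEW_ONTOLOGY + uri[len(OLD_ONTOLOGY):]
    return uri
```

Applied to every subject, predicate and object URI while parsing, this would let the rest of a client work against the new prefix only.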

Also, if you're using Wikibase extension in your project, please be aware that the RDF URIs generated by it will use this prefix after the change. This is defined in repo/includes/Rdf/RdfVocabulary.php around line 175:

self::NS_ONTOLOGY => self::ONTOLOGY_BASE_URI . "#",

The new data will have a schema:softwareVersion "1.0.0" triple on the dataset node[1], which will allow your software to distinguish the new data format from the old one.

The task tracking the change is phab:T112127. I will make another announcement when the change is merged and deployed and the data produced by Wikidata is going to change.

Please contact me (or comment in the task) if you have any questions or concerns. Smalyshev (WMF) (talk) 22:27, 17 October 2018 (UTC)

Can we please rename be_x_old in left box to be_tarask?

We already renamed this Wikipedia years ago, so why is this box still called be_x_old instead of the current be_tarask? Can we please submit a Gerrit patch to rename that? --180.97.204.30 22:46, 17 October 2018 (UTC)

Still a challenge for developers. --Liuxinyu970226 (talk) 14:26, 18 October 2018 (UTC)
By the way, it's impossible to solve it with a single patch because the identifier is part of the data of items. Matěj Suchánek (talk) 15:25, 18 October 2018 (UTC)

Mobile errors (Microsoft Edge)

The word "image" here seems to appear in the upper left corner of the screen.

I attempt to edit Wikidata on my Microsoft Lumia 950 XL phablet with Microsoft Edge, and for some reason I can't add "Statements" using the mobile view, so I am forced to switch to "Desktop mode". However, in desktop mode my browser seems to "flash and crash 💥" and then forcefully reloads. Adding statements can take up to 30 (thirty) minutes because of the frequent crashing. I don't have this error on any other website (including other Wikimedia projects). Is anyone else experiencing this? -- 徵國單  (討論 🀄) (方孔錢 💴) 07:32, 18 October 2018 (UTC)

Senses are now part of Lexicographical Data

Hello all,

As previously announced, the next big piece of Lexicographical Data on Wikidata is now deployed: Senses.

Senses will allow you to describe, for each Lexeme, the different meanings of the word, using multilingual glosses: very short phrases that give an idea of the meaning. In addition, each of these Senses can have statements to indicate synonyms, antonyms, refers-to-concept and more. By connecting Senses to other Senses and to Items, you will be able to describe precisely the meaning of words with structured and linked data. But the most important thing that Senses will be able to do is collect translations of words between languages.

Thanks to Senses, you will be able to organize and connect the existing Lexemes better, and to provide a very important layer of information. With Senses support, we now have all the basic technical building blocks to allow structured, machine-readable lexicographical data that can be reused within Wikimedia projects and by other stakeholders.

Feel free to try editing Senses. You can use the sandbox to make some tests. Let us know if you have questions or find bugs.

Note: there are still issues with sorting the IDs of Senses and Forms and with sorting the glosses; these will be solved later this week. Thanks for your understanding.

Cheers, Lea Lacroix (WMDE) (talk) 10:16, 18 October 2018 (UTC)

Ixinandria (Q3753571) was synonymized to Rineloricaria (Q138093) but...

Hi, how can this be resolved? Ixinandria steinbachi is the only species in Ixinandria (other wikis link from their species articles to Q3753571, which is named as a genus!). But Ixinandria was last recognized as a synonym of Rineloricaria (Q138093). I can't merge Q3753571 with Q138093 because of the links from other wikis: either they should be linked to a new item named Rineloricaria steinbachi (synonym: Ixinandria steinbachi), or Ixinandria (Q3753571) should be renamed to Rineloricaria steinbachi (Q3753571), retaining all wikilinks. Ark (talk) 14:58, 18 October 2018 (UTC)

Best practices for adding property constraints/violation checks.

I've noticed that a common error is to add position (Q4164871) and occupation (Q12737077) as values for instance of (P31) along with human (Q5) (presumably since in conversation we say "A is a writer" or "B is a senator").

See the following queries for examples:

 SELECT ?human ?humanLabel WHERE {
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
   ?occupation wdt:P279*/wdt:P31+ wd:Q12737077.
   ?human wdt:P31 wd:Q5.
   ?human wdt:P31 ?occupation.
 }

(query)

 SELECT ?human ?humanLabel WHERE {
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
   ?position_held wdt:P279*/wdt:P31+ wd:Q4164871.
   ?human wdt:P31 wd:Q5.
   ?human wdt:P31 ?position_held.
 } 

(query)
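Since the two queries above differ only in the class item (Q12737077 versus Q4164871), they could be generated from one template. The following is a sketch, not a tested tool: the function names are invented for illustration, and the URL is only built here, not fetched.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Same shape as the two queries above; only the class item varies.
QUERY_TEMPLATE = """SELECT ?human ?humanLabel WHERE {{
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
  ?class wdt:P279*/wdt:P31+ wd:{class_qid}.
  ?human wdt:P31 wd:Q5.
  ?human wdt:P31 ?class.
}}"""


def misused_class_query(class_qid):
    """Return the query for humans whose P31 value belongs to class_qid."""
    return QUERY_TEMPLATE.format(class_qid=class_qid)


def query_url(class_qid):
    """Build a GET URL for the query, asking for JSON results."""
    params = {"query": misused_class_query(class_qid), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)
```

`query_url("Q12737077")` would correspond to the occupation query and `query_url("Q4164871")` to the position query.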


I've previously migrated these manually to the occupation (P106) and position held (P39) predicates respectively, but since this seems to be a frequent error I would love to add these as a constraint violation of instance of (P31). Since this property impacts so many items, I was hoping to test this in a sandbox but can't seem to find the place to do this.


Cheers, ElanHR (talk) 19:11, 18 October 2018 (UTC)

How to link a branch/subfield to its parent topic?

Currently, branches and subfields are linked to their parent topics by part of (P361). For example:

< arithmetic (Q11205) > part of (P361) < mathematics (Q395) >

Wouldn't it be better to limit the use of part of (P361) to physical things and use facet of (P1269) in cases like this?--Malore (talk) 02:08, 19 October 2018 (UTC)

They mean different things. history of mathematics (Q185264) is a facet of (P1269) mathematics, arithmetic (Q11205) is not. --Yair rand (talk) 02:13, 19 October 2018 (UTC)
An alternative would be of (P642) mathematics (Q395), but that doesn't seem ideal. For medical fields, there's an item medical specialty (Q930752) which is used quite a bit. Ghouston (talk) 02:52, 19 October 2018 (UTC)