Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/07.

What heart rate does your name have?

Your name does have a heart rate, right?

"But surely", you say, "a name cannot have a heart rate? Only a living thing can have a heart rate!"

And yet, dog (Q144) is an instance of common name (Q502895), and it has a heart rate.

When are we going to stop this nonsense? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:24, 22 June 2018 (UTC)

What's your proposal to resolve „this nonsense“? --Succu (talk) 22:12, 23 June 2018 (UTC)
Pick one item to be used in statements (an instance of taxon, although the label can be "dog" for convenience), and minimize use of the duplicates created for Wikipedias. Ghouston (talk) 05:37, 24 June 2018 (UTC)
Apparently using a taxon-centric model with other names being properties or aliases is unacceptable, although it is not entirely clear which use cases that model would fail in. The claim for using a name-centric model is that one item for every name keeps them independent of the organisms being referred to, because every taxonomic author can have their own view of what the circumscription (characters defining a group) of a taxon (a group of organisms) is. This would presumably also require items with labels that are meaning-free - maybe taxon entities that are unnamed and use a UniqueID. See also Property_talk:P1420. Pinging also @Peter_coxhead: @Kaldari:. Shyamal (talk) 11:36, 25 June 2018 (UTC)
So where are properties - like heart rate, or number of legs, or gait, or... - which are about the (class of) organism, not the name, supposed to go? And, anyway, aren't items about the names of things supposed to go in the Lexeme namespace? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:32, 25 June 2018 (UTC)

Andy Mabbett, and the common name, not the taxon!, has a "taxon common name" Q144#P1843. Also subclassOf "domesticated animal" AND "pet", with "pet" having subclassOf "domesticated animal".
It is easy to solve, but users that want to mix items about subclasses of biota with items about names of these subclasses stand in the way. 178.5.32.201 12:33, 24 June 2018 (UTC)

This is the heart of the problem. Yes, organisms have physical properties, which may be shared by a group of organisms (a taxon); names do not have physical properties. Yes, Wikidata muddles taxa and their names. However, as discussed at length at Property talk:P1420#taxon-centric, at present there's no known way of representing taxa in Wikidata and allowing different taxonomic views to be shown. Peter coxhead (talk) 06:14, 26 June 2018 (UTC)
Even given the problems with multiple overlapping taxa, is there really a need for separate items for common names? Most taxa don't have them. Now we have three items for "dog", when 2 seems sufficient. Ghouston (talk) 10:12, 27 June 2018 (UTC)
Perhaps the question should then be - Should wikidata aim to (attempt to) capture details of taxonomic histories and circumscriptions rather than serve as an index to Wikipedia entries that explain those subtleties? Shyamal (talk) 11:01, 27 June 2018 (UTC)

@Pigsonthewing: The problem you've identified here is only the tip of the iceberg. Currently Wikidata conflates taxons and taxon names into the same items. And thus when a species has multiple synonyms, things like heart rate, range maps, interwiki sitelinks, etc. get randomly spread among those various synonyms. See the lengthy discussion at Property_talk:P1420#Changing_this_to_a_string_property. Clearly the current system is not tenable. We probably need to do a project-wide RfC about this at some point. Kaldari (talk) 22:35, 27 June 2018 (UTC)

In practice, P3395 is not a property that applies to a taxon; it is a human-centric property, used for persons and for what are commonly perceived to have personalities, also (that is, pets). Thus there is no real conflict in these items. Just a case of raising trouble in this Project chat. - Brya (talk) 07:19, 7 July 2018 (UTC)
It beggars belief that anyone with even a school-level education in biology would claim that only "persons and what are commonly perceived to have personalities" have a heart rate (for that is what heart rate (P3395) is). But if you really think that all I am doing in this section is "raising trouble", you know where the admin noticeboard is; raise your complaint there. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:12, 7 July 2018 (UTC)
It beggars belief that anyone with even an average helping of common sense would go by surface appearances and would refuse to look at how a property is actually used. Wikidata has several properties that are used only for pets. - Brya (talk) 06:18, 8 July 2018 (UTC)
Really, Brya? So what's Q15978631#P3395? Do we keep Homo sapiens (Q15978631) as pets, now? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:21, 11 July 2018 (UTC)
@Pigsonthewing: Well, I do. Don't you? - Jmabel (talk) 22:07, 11 July 2018 (UTC)

Another example

P01069419 (Q55196248) is a physical object; a prepared specimen of plant material, in a museum collection. It is instance of (P31)=holotype (Q1061403), qualified of (P642)=Eugenia plurinervia (Q55195930). All well and good.

But holotype (Q1061403) is instance of (P31)=zoological nomenclature (Q3343211)+botanical nomenclature (Q3310776)+prokaryote nomenclature (Q27514375) (all three being, ultimately, subclasses of terminology (Q8380731)+naming convention (Q6961869)) and subclass of (P279)=type (Q3707858).

type (Q3707858), in turn, is instance of (P31)=biological nomenclature (Q522190) (again eventually a subclass of terminology (Q8380731)+naming convention (Q6961869)); and part of (P361)=taxonomy (Q8269924).

Is this right? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:19, 26 June 2018 (UTC)

Note also that this causes a constraint issue on the item's inventory number (P217). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:37, 27 June 2018 (UTC)
P01069419 (Q55196248) was created to describe the holotype (Q1061403) of Eugenia plurinervia (Q55195930) to be used with taxonomic type (P427). In this case the type (Q3707858) (a nomenclatural concept) itself is a preserved specimen (an individual sheet) held at Muséum national d'histoire naturelle (Q838691). In my opinion we should get rid of instance of (P31) qualified of (P642) constructs. And yes, all subclasses of type (Q3707858) need a review and remodeling. Do you have a proposal? --Succu (talk) 19:52, 27 June 2018 (UTC)
"For better or worse, we should recognize that much of taxonomy's cumulative body of work is not well aligned with the requirements for ontological representation and reasoning." - Franz, NM; Thau, D (2010) Biological taxonomy and ontology development: scope and limitations. Biodiversity Informatics 7:45-66. Shyamal (talk) 05:06, 29 June 2018 (UTC)
dog (Q144) does subclass domesticated animal (Q622852). ChristianKl❫ 09:40, 29 June 2018 (UTC)
So how are your remarks here helping to find a workable solution, Shyamal and User:ChristianKl? --Succu (talk) 21:27, 29 June 2018 (UTC)
Knowing the cause of a problem is part of the solution. I am not sure the few participants here who are discussing it in good faith should be expected to provide instant solutions when they were not around when the problem was created in the first place without much discussion. It seems clear that creating synonyms and names as items is best set aside for now until there is a larger discussion that examines a practical ontology and one that does not get archived out from here. Shyamal (talk) 02:15, 30 June 2018 (UTC)
I know this paper very well, but it certainly does not support your conclusion. For me the essence is summarized as „[...] research along the taxonomy/ontology interface should focus either on strictly nomenclatural entities and relationships or on ontology-driven strategies for aligning multiple taxonomies, but not on building static networks for large portions of the tree of life.“ This is where WD could make a minor contribution. But this is not closely related to the issue raised above. --Succu (talk) 20:44, 30 June 2018 (UTC)
A useful step then would perhaps be to explain what kinds of queries you would be running from your ideal taxonomic database and what your results are expected to be. Shyamal (talk) 11:50, 1 July 2018 (UTC)
What's your „ideal taxonomic database“? I haven't one! See the paper you cited. What queries do you want to be answered? How many species exist is out of scope here. --Succu (talk) 21:37, 1 July 2018 (UTC)
You misread me - there was nothing mentioned anywhere above that had to do with wanting to know the number of species. I do not see a purpose in your statements other than dismissing discussion. Shyamal (talk) 04:01, 2 July 2018 (UTC)
A first goal should be to get the nomenclatural aspect running. Afterwards there is plenty of room to integrate all the different taxonomic viewpoints. --Succu (talk) 21:31, 3 July 2018 (UTC)
I don't think entities in the Q-namespace should be instance of (P31) of common name (Q502895). I think it's better to model names in our new datatypes than in the Q-namespace. We will have lexemes for "dog" and "human" and any other species that might have the same problem. Ontology-wise that clears up the problems very well.
It does have the problem that this means "human" and "Mensch" wouldn't get interwikilinks from Wikidata anymore given that they are different lexemes but I think the amount of items is small enough that we can solve the issue with a bot that creates manual interwikilinks. ChristianKl❫ 15:34, 30 June 2018 (UTC)
Sorry, but I can't imagine how the new namespace could resolve all these issues. --Succu (talk) 21:45, 30 June 2018 (UTC)

Geely

I have split the item into Geely (Q739000) (parent, private) and Geely Automobile (Q55118127) (subsidiary, listed company), and merged no label (Q10724582) into the parent company item. However, no label (Q10724582) seems to have been contaminated with a wrong description of "family name" and its equivalents in other languages; could someone spot all the wrong descriptions in Geely (Q739000)?  – The preceding unsigned comment was added by Matthew hk (talk • contribs) at 20:31, 23 June 2018 (UTC).

Large numbers of aliases

Is it helpful for some items to have a very large number of aliases? I was looking at people, and here are the winners:

Jan Brueghel the Elder (Q209050) (362), Jacques Courtois (Q1366347) (307), Christian Wilhelm Ernst Dietrich (Q536581) (305), Willem van de Velde the Elder (Q722912) (285), Cornelius van Poelenburgh (Q247183) (279), Jusepe de Ribera (Q297838) (276), David Teniers the Younger (Q335022) (274), Herman van Swanevelt (Q875164) (259), Guercino (Q334262) (256), Anthony van Dyck (Q150679) (246)

Many of these seem to be misspellings (often unlikely) or alternate word orders, which is discouraged by Help:Aliases#Criteria_for_inclusion_and_exclusion. Many of these aliases seem to be coming from ULAN (e.g. http://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500007095). Bovlb (talk) 17:58, 3 July 2018 (UTC)

Yes, it's helpful. ULAN doesn't invent aliases, they include aliases used by sources. It increases the chance of finding items.
The interface could probably be improved to collapse if the number of aliases exceed a certain number. You could open a task for that in phabricator. Multichill (talk) 20:17, 3 July 2018 (UTC)
@Multichill: While I will agree that they may be useful aliases to have, they may be better distributed among other languages (whose labels and aliases you can still search in the search bar no matter what your interface language is); for instance, those aliases containing 'vecchio' could instead be moved to the list of Italian aliases, or those with 'velours' to the list of French aliases, and so on as appropriate. Mahir256 (talk) 20:34, 3 July 2018 (UTC)
ULAN records come supplied with a provenance tag. It's unclear why one would want to import a record marked with the LU U tags since the provenance is explicitly marked as undetermined. Examples include "Sammet=Breughel" for http://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500007095 Mpetrenk (talk) 20:47, 3 July 2018 (UTC).
To increase the chance of finding items?
ULAN doesn't tag labels with a language so I wouldn't know what language to copy it to.
BTW [1] this isn't very helpful. I don't bother undoing it because I know the bot will restore it anyway (unless ULAN dropped an alias in the meantime). Multichill (talk) 20:52, 3 July 2018 (UTC)
@Multichill: Could you adjust your bot such that it doesn't repeatedly add the same alias to the same item? --Pasleim (talk) 21:25, 3 July 2018 (UTC)
I don't consider this incorrect behavior, so no. Multichill (talk) 21:32, 4 July 2018 (UTC)

Looking more closely at ULAN, many of these entries are marked "V" for vernacular or local language, presumably Flemish for Jan Brueghel the Elder. Others are marked "U" for undetermined language. There doesn't seem to be any basis to conclude that these are valid aliases in English (but not any other language). Throwing them into an arbitrary language (English) just to increase search recall seems to me like a bit of a hack (and violates alias policy). Discarding the information about language and "historical local use" doesn't give us a faithful representation of this source. Maybe we need a separate property for ULAN labels so we can add appropriate qualifiers. Bovlb (talk) 21:17, 3 July 2018 (UTC)

Interesting proposal, since I have noticed that the ULAN aliases are not even complete and I have added aliases myself. It would be nice to have a property for the aliases - we already have the property "nickname" for the Italian aliases known as "bent-names", but I guess we should distinguish between the nicknames (used mostly in archives) and the "misspellings" which are used in museums (though they don't see these as being misspelled). We clearly need to update the policy though if that is the issue, since these aliases are crucial for findability. Jane023 (talk) 21:33, 3 July 2018 (UTC)
What about having only one alias field for all languages? At least for person names, this should be fine and would reduce maintenance work. Steak (talk) 11:14, 4 July 2018 (UTC)
  • It would look nicer if it just displayed the first three aliases, and kept the others hidden, but used in searches. It would also be nice if we auto-generated the aliases for names like "John Simpson Doe" as "J. S. Doe" and "John S. Doe" for people that do not have ULAN entries. It would cut down on duplicates being created and aid researchers. --RAN (talk) 15:27, 4 July 2018 (UTC)
  • I think aliases that are obviously in a different language should be included only in their proper language. Thus “hans holbein der jungere” should not be an alias in English. Nor should “school of Hans Holbein” be an alias for the artist, since it will lead to false matches. Nor should last-name-first variants (and especially broken things like “Edward Coley Burne, Sir Jones”). Yes, this will reduce automatic matches in some tools. But I think that’s better than having alias lists full of utter garbage. - PKM (talk) 19:45, 4 July 2018 (UTC)
Why is a shorter list better than a list which does not reduce automatic matches? What exactly do we gain for this loss in functionality? --Tagishsimon (talk) 21:54, 4 July 2018 (UTC)
Are aliases finding aids, or are they information for other sites and applications to use? What is a third party application supposed to do with the list of aliases for Hans Holbein (Q48319)? If the feeling is that every recorded spelling in several languages should be valid “English” aliases, then I’ll defer to the consensus. But I think we have done a disservice by importing name fields from catalogues in one language to another, without cleaning up appellations either before or after. - PKM (talk) 03:07, 5 July 2018 (UTC)
To add to this, I personally always interpreted the aliases to be alternative labels; i.e. terms that could have been used as the label, but aren't since we can only have one label at a time.
To mix alternative labels with generic terms that aid in the search for the item seems like bad practice to me; and it would seem better that the two categories of terms were split into separate sections.
On the other hand, if the aliases were always meant to aid searching and not to list alternative labels, perhaps the alias section should have a different name to better reflect its purpose. --Njardarlogar (talk) 10:23, 5 July 2018 (UTC)
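
RAN's suggestion above, generating "J. S. Doe" and "John S. Doe" from "John Simpson Doe", can be sketched as follows. This is only an illustration under a loud assumption: the name is in simple "Given Middle(s) Family" order with a single-word family name, which is not true of many of the artists discussed here (e.g. "Anthony van Dyck"). The function name `initial_variants` is hypothetical and not part of any existing bot or tool.

```python
def initial_variants(full_name: str) -> list[str]:
    """Return initial-based variants of a 'Given Middle(s) Family' name.

    Assumption: the last word is the family name and the first word is
    the given name. Names without a middle name yield no variants.
    """
    parts = full_name.split()
    if len(parts) < 3:
        return []  # nothing to abbreviate
    given, middles, family = parts[0], parts[1:-1], parts[-1]
    middle_initials = " ".join(f"{m[0]}." for m in middles)
    return [
        f"{given[0]}. {middle_initials} {family}",  # all given names as initials
        f"{given} {middle_initials} {family}",      # only middle names as initials
    ]


print(initial_variants("John Simpson Doe"))  # ['J. S. Doe', 'John S. Doe']
```

Any real implementation would need per-language and per-culture name rules, which is the same language problem raised elsewhere in this thread.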

@Multichill: We could exclude at least alternative capitalizations and word orders? That would comply with current Help:Aliases and leave out a lot of aliases that aren't needed for search anyway. What do you all think? --Marsupium (talk) 07:16, 12 July 2018 (UTC)
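
A minimal sketch of the filter Marsupium describes, assuming that "alternative capitalizations and word orders" means aliases that contain the same words as the label (or as an earlier alias) once case, punctuation and order are ignored. The helper names `alias_key` and `drop_trivial_variants` are hypothetical; this is not how any existing import bot works.

```python
import re


def alias_key(text: str) -> tuple[str, ...]:
    """Comparison key that ignores case, punctuation and word order."""
    words = re.findall(r"\w+", text.casefold())
    return tuple(sorted(words))


def drop_trivial_variants(label: str, aliases: list[str]) -> list[str]:
    """Keep only aliases that differ from the label and from each other
    by more than capitalization, punctuation or word order."""
    seen = {alias_key(label)}
    kept = []
    for alias in aliases:
        key = alias_key(alias)
        if key in seen:
            continue  # same words as the label or an earlier alias
        seen.add(key)
        kept.append(alias)
    return kept


aliases = [
    "hans holbein",
    "Holbein, Hans",
    "hans holbein der jungere",
    "school of Hans Holbein",
]
print(drop_trivial_variants("Hans Holbein", aliases))
# ['hans holbein der jungere', 'school of Hans Holbein']
```

Note that the two aliases PKM objects to survive this filter: a wrong-language alias or a "school of" alias has different words, so catching those would need language or relationship information that a purely textual check does not have.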

  • I think the interface could be improved if there are many aliases. It's a bigger problem when there is no alias and a duplicate is created than if there are several aliases. There are still many lists that have aliases for every entry that could be in such lists. I think these should be cleaned up.
    --- Jura 08:07, 12 July 2018 (UTC)
  • There are a number of related issues being raised here:
  1. Should aliases be limited to those that comply with Help:Aliases, being other common names for the entity in the specific language, without including variations in spelling and word order, and without including related entities? Do we need to balance this restriction against search recall? How does this restriction apply to import bots?
  2. How do we ensure that Wikidata search has good recall? If we create a new property for ULAN, can we use strings from those claims?
  3. How should the user interface deal with large numbers of aliases?
  4. When we import names from a source like ULAN, should we simply make them English aliases, even if they don't comply with alias policy, and even if that means losing important qualifiers about language and usage?
Bovlb (talk) 19:15, 15 July 2018 (UTC)

QuickStatements is very slow today

Running QuickStatements has been quite glacial for the last day or two. Does anybody have any idea why? Magnus? Has anybody else noticed it? --Jarekt (talk) 22:34, 6 July 2018 (UTC)

I got 9k statements added in 8 hours, 2 nights ago; only ~1500 in 8 hours last night. --Tagishsimon (talk) 22:47, 6 July 2018 (UTC)
I would say #Wikibase’s maxlag now takes dispatch lag in account is behind this. Matěj Suchánek (talk) 10:18, 7 July 2018 (UTC)
@Lydia Pintscher (WMDE): I'm not sure if this is working as it should. What's the target edit rate for users?
--- Jura 05:24, 8 July 2018 (UTC)

Here is a list of edits at 7:36:

User                           Edits
QuickStatementsBot             83
Citationgraph bot              70
BotMultichill                  58
SuccuBot                       32
Edoderoobot                    31
Renamerr                       19
Deryni                         10
Aosbot                         6
MatSuBot                       6
Coffins                        5
FShbib                         5
Raphodon                       4
Villy Fink Isaksen             2
Red Winged Duck                2
Marcok                         2
Citationgrap                   1
Lokal Profil                   1
Liuxinyu970226                 1
Epìdosis                       1
5.206.223.111                  1
Ghuron                         1
Rashinseita                    1
JuanCamacho                    1
Derzno                         1
Robby                          1
Schekinov Alexey Victorovich   1
Gareth                         1
Supernino                      1
Harry Paudyal                  1
Zaqaryan13                     1
Чръный человек                 1

About 350 total, all Quickstatements ones by Tagishsimon (5 or 6 concurrent batches).

If the site has some trouble following, maybe bots should be de-activated entirely.
--- Jura 07:46, 8 July 2018 (UTC)

Please don't jump to conclusions.
If you look at https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch you'll see the numbers are quite normal and the replag didn't hit 5 seconds so no bots would need to slow down.
Looks like Magnus changed some code, I would look into that. Multichill (talk) 21:59, 8 July 2018 (UTC)
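
For readers unfamiliar with the mechanism being discussed: the MediaWiki API accepts a `maxlag` parameter, and when the server's lag exceeds that value it answers with an error whose code is `maxlag` instead of performing the edit, so a well-behaved client waits and retries. The sketch below shows that retry loop under stated assumptions: `send` is a hypothetical stand-in for whatever function actually posts the request, the fixed `wait_seconds` is an assumption (real clients usually honor the server's Retry-After hint), and this is not QuickStatements' actual code.

```python
import time
from typing import Callable


def edit_with_maxlag(
    send: Callable[[dict], dict],
    params: dict,
    maxlag: int = 5,
    max_retries: int = 5,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Send one API request, backing off while the server reports lag.

    `send` is a hypothetical callable that posts `params` and returns the
    decoded JSON response as a dict.
    """
    request = {**params, "maxlag": maxlag}
    for attempt in range(max_retries + 1):
        response = send(request)
        if response.get("error", {}).get("code") != "maxlag":
            return response  # success, or an unrelated error for the caller
        if attempt < max_retries:
            sleep(wait_seconds)  # server is lagged: wait, then retry
    raise RuntimeError("server still lagged after all retries")
```

With `maxlag=5`, such a client only slows down when the reported lag exceeds five seconds, which is the threshold referred to above.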

I can change it back, but I can't see why it should be faster, unless the API actually returns replag errors. --Magnus Manske (talk) 12:10, 9 July 2018 (UTC)
I have changed the code back, and restarted the bot. Let me know if this works better. --Magnus Manske (talk) 12:13, 9 July 2018 (UTC)
My currently running QS batch does not speed up. The server response time is still at around 13 to 15 seconds per edit. —MisterSynergy (talk) 12:18, 9 July 2018 (UTC)
It is still slow for me too. I am running a batch of 20k+ statements and just clocked it at 220 edits per hour, so it will take 92 hours, assuming it can maintain that rate. Some of my other batches seem to have slowed down with time. --Jarekt (talk) 12:34, 9 July 2018 (UTC)
  • @Magnus Manske: Petscan still lingers around 6 edits per minute. Could this be improved?
    --- Jura 13:16, 12 July 2018 (UTC)
    • I just re-checked it and it's slightly higher. Still
      --- Jura 13:18, 12 July 2018 (UTC)
Working on a dataset of 60,000 items, that means that adding 1 statement + 1 qualifier + 1 reference is going to take 600 hours, i.e. 25 days, even running 24 hours a day, whereas previously it could be done in two.
Even a casual 1000-edit task is now going to take over three hours, whereas before it would be less than 15 minutes. Six hours if the edits are to be referenced.
I appreciate that previously WDQS was dropping some QS edits, which was also not good, but throttling back this hard makes working with Wikidata simply not viable.
QSbot is a piece of core user infrastructure. It deserves a shared edit-rate quota equal to multiple individual bot accounts. Jheald (talk) 16:24, 13 July 2018 (UTC)
+1 for Jheald's suggestion. The backlog for QS right now is such that I hesitate to add any more batches to it. @Lydia Pintscher (WMDE): do you agree that the interface/toolset is now at a point at which it ill-serves users who wish to make amendments by the 10s of thousands? Are you content that we must wait days for batches to complete? Finally, why is it that QS, a critical tool, is provided and maintained solely by the sainted Magnus, rather than being provided as a core wikimedia offering by wikimedia developers? It gives the impression that wikimedia has scant understanding of or concern for the needs attaching to bulk edits. --Tagishsimon (talk) 19:39, 15 July 2018 (UTC)
Addendum. The short temporary_batches I ran through QS from my client browser today ... 3 edits per minute, tops. I have maybe 150k edits batched up, roughly 34 days to do if done from the client. So I can wait days for QuickStatementsBot to timeslice itself through my queue, or wait a month or so from my client. Neither is attractive. (And for the avoidance of any doubt, none of this is criticism of Magnus; it is, I'm afraid, a criticism of what seems like an antediluvian offering from Wikimedia.) --Tagishsimon (talk) 00:29, 16 July 2018 (UTC)
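
The duration figures quoted in this section all follow from the same division, sketched here. The rate of 5 edits per minute used for the 60,000-item dataset is an assumption inferred from the 600-hour figure, not something stated in the thread; the other inputs are the numbers given by Jarekt and Tagishsimon.

```python
def batch_duration_hours(edits: int, edits_per_minute: float) -> float:
    """Hours needed to run `edits` edits at a steady `edits_per_minute`."""
    return edits / edits_per_minute / 60


# 60,000 items x (statement + qualifier + reference), assumed 5 edits/min
print(batch_duration_hours(60_000 * 3, 5))          # 600.0 hours, i.e. 25 days

# 150k queued edits at 3 edits/min from the client, expressed in days
print(round(batch_duration_hours(150_000, 3) / 24, 1))  # 34.7

# 20,000 statements at 220 edits/hour: roughly 91 hours
print(batch_duration_hours(20_000, 220 / 60))
```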
I somehow suspect that this is a flaw in the QuickStatements tool or any other component of User:Magnus Manske's framework. The dispatch lag and the maxlag value are totally okay these days, and I can’t find any reason why editing should be throttled all the time… —MisterSynergy (talk) 16:53, 13 July 2018 (UTC)
Maybe it's because almost everybody gets throttled. NikkiBot and Citationgraphbot seem to run at decent speeds. I guess I should change tools.
--- Jura 16:59, 13 July 2018 (UTC)
You might be right Mr S, but I do wonder whether QSbot is being throttled to 60 edits/min to be shared between everybody. If that's the case, it needs to be changed. Jheald (talk) 17:07, 13 July 2018 (UTC)
User:QuickStatementsBot is subjected to the same limitations as all other users, which currently is 80 edits/min to my knowledge. If QS runs smoothly, you can simply use your own (bot) account to edit directly via the QS tool without having to share the edit rate with other users’ batches. The only drawback is that you cannot close the browser window. —MisterSynergy (talk) 17:15, 13 July 2018 (UTC)
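
To make the "shared edit rate" point concrete, here is a purely illustrative sketch of what sharing one account's per-minute allowance between several users' batches could look like if batches were served one command at a time in rotation. It is an assumption for illustration only and not a description of how QuickStatements actually schedules work; the function names and the batch contents are hypothetical, and the default of 80 edits per minute is the figure MisterSynergy quotes.

```python
from collections import deque
from itertools import islice
from typing import Iterator


def round_robin(batches: dict[str, list[str]]) -> Iterator[tuple[str, str]]:
    """Yield (owner, command) pairs, taking one command per batch in turn."""
    queue = deque((owner, iter(commands)) for owner, commands in batches.items())
    while queue:
        owner, commands = queue.popleft()
        try:
            command = next(commands)
        except StopIteration:
            continue  # this batch is finished; drop it from the rotation
        yield owner, command
        queue.append((owner, commands))


def first_minute(batches: dict[str, list[str]],
                 edits_per_minute: int = 80) -> list[tuple[str, str]]:
    """Commands that fit into one minute under a shared per-account cap."""
    return list(islice(round_robin(batches), edits_per_minute))
```

Under this model, six concurrent batches would each advance at about 80 / 6, or roughly 13 edits per minute, which is why running a batch under one's own account avoids the sharing.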
@MisterSynergy: This is what I think may have gone wrong, or may have changed. I'm logged in as JhealdBatch (talk • contribs • logs) in an alternate browser running QS in a window, but I am still only getting 6 edits/min. Okay, so looking at the stats page that Lydia linked to earlier, it does appear that that is being accounted for separately to the 63/min average rate of QSbot; but I can't help wondering whether a cumulative throttle is being applied to all QS edits, across all active accounts. Jheald (talk) 17:40, 13 July 2018 (UTC)
This afternoon I am now getting two edits a minute from QS. Are we seriously trying to build a database here? Jheald (talk) 15:57, 15 July 2018 (UTC)
We can't work with mix-n-match either. --Gerwoman (talk) 16:47, 15 July 2018 (UTC)

Recommend merging of birth at sea (Q46998262) and death at sea (Q46998267) and renaming to "at sea"

Previous related discussions

  1. Wikidata:Project chat/Archive/2017/12#Born at sea and Died at sea
  2. Wikidata:Project chat/Archive/2018/05#Birth/death at sea

These items simply do not work well enough, and they unnecessarily separate all related events that happen at sea. How is the place for a birth or a death at sea different with relation to "at sea"? I heartily agree that we need this more generic descriptor for the non-specific location; I just don't see that it should have an event associated with the location when it is solely a location. How/why should they be different places? The term should be a more generic "at sea", for use where a more accurate place cannot be provided.

This merge and rename would also allow us to capture the missing requirements where people were baptised, married or buried at sea, which cannot currently be captured, and would be a nonsense to create as additional items.  — billinghurst sDrewth 13:51, 7 July 2018 (UTC)

Noting example of a reference to be used, where burial location should = "at sea"; and where memorial is in Gallipoli IWM.  — billinghurst sDrewth 13:53, 7 July 2018 (UTC)
A cursory look at the properties of these two items will show that they are not suitable for merging. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:13, 7 July 2018 (UTC)
Fine, then let us create a true location "at sea" and move the birth and death locations to the new item. We can then delete these weird location items that are not locations.  — billinghurst sDrewth 06:08, 8 July 2018 (UTC)
Let's try it and see how it goes: at sea (Q55438959). I don't see any need to delete the existing items. Ghouston (talk) 05:36, 10 July 2018 (UTC)
now fails a constraint on P119: not located in a administrative territorial entity. Possibly an unwanted constraint, since I think it would be undesirable to add everybody buried at sea as exceptions. Ghouston (talk) 06:06, 10 July 2018 (UTC)
I guess the best approach here would be to have "at sea" as an allowed value to the P119 constraint alongside the existing constraints - is that possible? Andrew Gray (talk) 15:16, 12 July 2018 (UTC)
It would be possible, but all the seas and oceans would have to be allowed too, since "at sea" is only needed when the specific place isn't known. Ghouston (talk) 02:44, 15 July 2018 (UTC)
no longer fails constraints, thanks to new unknown value items added to at sea (Q55438959). Ghouston (talk) 02:46, 15 July 2018 (UTC)

Where are the poets?

Hello. With the recent launch of Template:Literature links (Q54933000) on the French Wikipedia and the subsequent creation of Poetry Foundation ID (P5341), Poets.org poet ID (P5343), Printemps des poètes poet ID (P5344), Poetry Archive poet ID (P5392), cipM poet ID (P5393), Poets & Writers author ID (P5394) and Poetry International Web poet ID (P5430) here, poetry (Q482) is having its own small momentum on Wikidata. And this should keep going as Wikidata:Property proposal/Scottish Poetry Library ID and Wikidata:Property proposal/Australian Poetry Library ID are currently under way. The thing is, we're still not really covering as many subgenres, from contemporary Spanish poetry (Q3401098) to poetry in Africa (Q7207525), as we probably should. Would you help me identify potential properties by pointing out interesting databases that cover at least 150 poets through individual entries? There can probably be 5 to 25 relevant ones that we should consider. Let me know which they are if you can find some. They might be found on the website of a poetry festival (Q27186004) presenting past guests, for instance. Thierry Caro (talk) 13:57, 8 July 2018 (UTC)

Nothing at all? Thierry Caro (talk) 14:52, 10 July 2018 (UTC)
I'm not a poet and I'm unaware of any big poet databases - most poetry collections are focused around good poems from a genre and era, not poets. Deryck Chan (talk) 15:55, 10 July 2018 (UTC)
You might consider asking the Wikisource projects, instead of posting here on wikidata. Many Wikisourcers have been transcribing poetry collections into Wikisource. --EncycloPetey (talk) 15:57, 10 July 2018 (UTC)
a specialized request. for english there is https://www.loc.gov/rr/program/bib/poetrycrit/databases.html ; https://rpo.library.utoronto.ca/display/ ; https://guides.library.illinois.edu/c.php?g=694327&p=4921393 ; https://www.poetryfoundation.org/poems/browse#page=1&sort_by=recently_added ; https://www.poetryarchive.org/ ; http://library.princeton.edu/resource/3584 ; https://quod.lib.umich.edu/e/epd/ ; http://www.splitthisrock.org/poetry-database -- but these are by poem / author not genre, and some come with paywalls. it is all ad hoc, word of mouth: LOD has not reached them. Slowking4 (talk) 02:25, 11 July 2018 (UTC)
@Deryck Chan, EncycloPetey, Slowking4: Thank you. That's a very good beginning. We now have Wikidata:Property proposal/RPO ID and Wikidata:Property proposal/Split the Rock ID to start with. Thierry Caro (talk) 02:03, 12 July 2018 (UTC)
And also Wikidata:Property proposal/Les voix de la poésie ID and Wikidata:Property proposal/Poetry in Voice ID. Thierry Caro (talk) 14:10, 12 July 2018 (UTC)
If anyone has anything in languages other than French and English… don't hesitate to let me know! Thierry Caro (talk) 18:59, 12 July 2018 (UTC)
We could use a few more language of work or name (P407) statements on the poems, to identify the authors' languages:
# For each author of a poem: one sample poem and the ISO 639-1 codes of the poems' languages
SELECT ?auth ?authLabel ?sample_poem ?sample_poemLabel ?langs WHERE {
  {
     SELECT DISTINCT ?auth (SAMPLE(?item) AS ?sample_poem) (GROUP_CONCAT(DISTINCT ?iso_lang; separator=', ') AS ?langs) WHERE {
        ?item wdt:P31/wdt:P279* wd:Q5185279 .    # instance of poem (or of a subclass)
        ?item wdt:P170|wdt:P50 ?auth .           # creator or author
        OPTIONAL {?item wdt:P407 ?lang . ?lang wdt:P218 ?iso_lang}    # language of work, as ISO 639-1 code
     } GROUP BY ?auth
  }

   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,de,es,it,nl,sv,ru,zh,ar". }
} ORDER BY STR(?authLabel)
-- Jheald (talk) 17:27, 13 July 2018 (UTC)
Thank you all! We also have Wikidata:Property proposal/Poetry Society of America ID going on now. Thierry Caro (talk) 20:25, 13 July 2018 (UTC)
And then Wikidata:Property proposal/MAPS ID. Thierry Caro (talk) 03:37, 15 July 2018 (UTC)

Fiction countries and country of citizenship

Why do we have

< Hulk (Q188760) > country of citizenship (P27) < United States of America (Q30) >

? Fictional characters live in fictional countries. For example, we use

< Harry Potter and the Philosopher's Stone (Q43361) > narrative location (P840) < London in fiction (Q6671104) >

and not

< Harry Potter and the Philosopher's Stone (Q43361) > narrative location (P840) < London (Q84) >

.

Why not have an item for a fictional USA (and for other countries) to use for fictional characters?

Xaris333 (talk) 11:19, 9 July 2018 (UTC)

  • Seems to me he is a fictional citizen of the actual U.S., not a fictional citizen of a fictional U.S. - Jmabel (talk) 14:59, 9 July 2018 (UTC)
    • The USA that Hulk lives in is not the real USA that a real US citizen lives in. Xaris333 (talk) 15:13, 9 July 2018 (UTC)
    • @Jmabel: One purpose of creating (classes of) fictional items was not to find fictional items in queries where they are not expected. If there is a strict wall between the two worlds, you can safely search for the citizens of the US while omitting properties like instance of (P31) and still find only actual items, never fictional ones. Second, in the case of Hulk, it seems that he lives in a country that is depicted in movies or comic books, definitely not the real USA. He’s a citizen of some fictional USA, and he’s a fictional character. Assuming there is no such thing as a « fictional citizenship ». author  TomT0m / talk page 15:16, 9 July 2018 (UTC)
      • That makes sense (especially in this case). Not the way I'd have modeled it, but reasonable.
      • Do we have a way to distinguish between the setting of a fiction in a realistic place and the setting of a fiction in a more fictional world? I'm thinking, for example, that the United States of Philip Roth's Goodbye Columbus is very much the actual U.S., whereas the one in The Plot Against America is quite fictionalized. - Jmabel (talk) 15:22, 9 July 2018 (UTC)
@TomT0m: I am a bit uncomfortable with this approach, because it will need us to create a whole parallel set of fictional countries, and perhaps a parallel set of cities to go along with London in fiction (Q6671104). The same argument might lead to duplicating all sorts of position-held values ("fictional President of the US"), occupations ("fictional lawyer"), etc just to avoid them showing up in queries. Then we would need to set up all the links between them so that "find me all the fictional scientists" would get all the subclasses etc.
It seems so much simpler to just make sure that we use an appropriate instance of (P31) and stay in the habit of using that for queries, and use all the routine properties and values for fictional people and places. We seem to manage okay at the moment and this would be a change with really dramatic consequences. Andrew Gray (talk) 19:40, 9 July 2018 (UTC)
I disagree. Real USA citizenship may not be the same as fictional USA citizenship. By stating that the country of citizenship of a fictional character is the real USA, we may be wrong. In the fictional world that country may be bigger or smaller, or have a different (fictional) president. Saying that Hulk has the real USA as his country of citizenship is like saying that he lives in the real USA, with the real president, the real population, etc. And what if the fictional character is the president of the USA in the fictional USA? Then that character will have the real USA as country of citizenship, but the real USA item will have a real president as a value. And if the fictional character is the president of the USA, would we add it to the real USA item as the president because of a value requires statement constraint (Q21510864)??? Xaris333 (talk) 20:06, 9 July 2018 (UTC)
@Xaris333, TomT0m: Please follow your logic through: if Hulk is a citizen of a fictional USA, and Jack Ryan (Q1068314) is a citizen of another fictional USA, then you have to create 2 different items for "fictional USA" in order to be able to perform correct queries for each different fictional USA.
The correct way is to create correct queries: if you want all citizens of the real USA, then you have to create a query combining citizen of USA AND instance of human. If you want all US citizens of the Avengers world, then combine citizen of USA AND instance of Avengers character. Snipre (talk) 20:47, 9 July 2018 (UTC)
So if a fictional character is the president of the USA, we will add it to the USA item as president? And if you want the real USA president, you combine president of USA AND instance of human. Is that what you are telling us? Xaris333 (talk) 20:50, 9 July 2018 (UTC)
No, of course we don't add it to the USA item as "president". We seem to manage just fine saying (or any of the existing 33 fictional US presidents) without doing that.
In general, I really don't agree that this is a problem, and I think requiring fictional country items will just make everything much more complicated and confusing. Yes, the Hulk's "fictional USA" may not be precisely the same as the "real USA", but it makes sense to me to use the same item - it's clear that if the item is about a fictional concept, its claims need to be evaluated in that way. Andrew Gray (talk) 21:01, 9 July 2018 (UTC)
Yes, because your proposal is not symmetrical: it lets you retrieve real citizens or real residents, but not the US president of the Avengers world. Instead of creating a junk item like fictional USA which mixes everything, you should propose a solution which uses the data correctly: there is no reason to treat real data better than fictional data, or else you have to rediscuss the usefulness of adding fictional data to WD.
And as for adding the president names to the USA item, this example shows why this habit is a bad one. Snipre (talk) 21:06, 9 July 2018 (UTC)
It's not clear to me why presidents are even listed on United States of America (Q30). It's contributing to the excessive size of that item and only duplicating statements that can be found elsewhere. Ghouston (talk) 05:02, 10 July 2018 (UTC)
@Snipre: Actually, on a conceptual level, a fictional country is instantiated every time someone lives the experience of reading a book, a comic book or another type of art, in their imagination. In a sense, a fiction item is a class of experiences, so it’s totally ok to use a single item that represents « all the experiences of persons who live (read/watch) a fiction in which the USA is mentioned or depicted ». It’s also perfectly ok to subclass it to represent a more precise experience, for example the US depicted in the Marvel universe if we find a reason for it, but it’s far from a logic we have to follow to be internally consistent. author  TomT0m / talk page 07:26, 10 July 2018 (UTC)
@TomT0m: Thank you for the ontology lesson, but I am a pragmatic guy: one of your main arguments for creating fictional countries was "One purpose of creating (classes of) fictional items was not to find fictional items in queries where they are not expected". I don't understand why this statement holds for separating real/fictional items but not for separating fictional/fictional items which aren't part of the same fictional world. If you want only real US citizens, and I want only US citizens from Marvel's world, why should your real/fictional logic be favoured? Snipre (talk) 20:23, 10 July 2018 (UTC)
@Snipre: I’m a pragmatic guy too, at times. It’s pretty clear why we would want a clear reality/fiction barrier: I assume that, unless explicitly specified otherwise, every query is about the real world, and a user should not be surprised by a fiction item and have to exclude it from the query one way or another. Within fiction, it is a completely different story: it’s not crystal clear which queries should be made, so the work of building separate worlds is far more questionable, and it is a lot of work. At the limit, you can create items for every reboot of every comic, or every episode, or you can use the same item for a set of works. It’s both less justified and more complicated, so pragmatically the « ontology solution » is a good one: less work, and a clear ontological reason that justifies using the same items for different fictions. The possibility of subclassing leaves the door open if needed, so it’s flexible. Another argument: there is fictional analog of (P1074), which can cross the fiction/real world barrier and allows retrieving values from statements on the real object. We can assume that the value of a property on the fiction item is the real-world one, unless it’s different in the fiction (say the USA population has dropped because of a disease in the fiction). Then there is a justification for subclassing the « fictional USA » item: there is a property for which we can make a different statement. author  TomT0m / talk page 07:29, 11 July 2018 (UTC)
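The fallback rule described in the comment above (use the fiction item's own value when it has one, otherwise follow the link to the real-world item) can be sketched in Python. This is only a toy in-memory illustration, not how Wikidata or the query service resolves anything: the item names, the dictionary layout and the `resolve` helper are all hypothetical, and the population values are placeholders rather than data.

```python
# Toy in-memory model. Item names, layout and values are illustrative
# placeholders (assumptions), not data taken from Wikidata.
ITEMS = {
    "United States of America": {
        "capital": "Washington, D.C.",
        "population": "real-world figure",
    },
    "fictional USA": {
        "fictional analog of": "United States of America",
    },
    "fictional USA (plague story)": {
        "subclass of": "fictional USA",
        "population": "reduced by a disease in the fiction",
    },
}


def resolve(item, prop):
    """Return the value of prop for item, falling back along
    'subclass of' and then 'fictional analog of' links."""
    seen = set()
    while item is not None and item not in seen:
        seen.add(item)
        statements = ITEMS[item]
        if prop in statements:
            return statements[prop]
        # No own statement: move to the parent class or the real-world analog.
        item = statements.get("subclass of") or statements.get("fictional analog of")
    return None


print(resolve("fictional USA (plague story)", "population"))
print(resolve("fictional USA (plague story)", "capital"))
```

In this sketch the subclassed item overrides only the property that differs in the fiction and inherits everything else from the real-world item.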
Properties on fictional entities reflect how the entity is depicted in the context of the work. If the work describes a character as being a citizen of the United States, and it is implied that "United States" refers to the country that exists in the world, the item should reflect that. Of course the character isn't a citizen of the actual United States, it's not a citizen of anywhere because it's a fictional character. The Hulk neither lives in a real US nor a fictional US, because the Hulk does not actually live, nor actually exist. The character is, however, depicted as living in the US. A fictional character can't have a real occupation, nor have family members, nor have a gender, ethnicity, birth date, etc, but items have these properties to reflect the depiction in an "in-universe" manner. --Yair rand (talk) 22:13, 9 July 2018 (UTC)
The claim is surely invalid, because if you had access to US official records you'd discover no such citizen. Ghouston (talk) 05:11, 10 July 2018 (UTC)
@Ghouston: I'm sure you intended that as a relatively frivolous remark, but it sheds light on how complicated this can be. There is no central registry of U.S. citizens. Among others, anyone born in the U.S. is a citizen. They don't necessarily have papers to prove that they were born here. Until rather recently, many, perhaps most, rural southern African Americans lacked such papers. That didn't mean they weren't citizens (although those on the political right in the South did try to make more or less that case in terms of voting). - Jmabel (talk) 16:44, 10 July 2018 (UTC)
If the Hulk were to access US official citizenship records, I'm sure he would find himself listed there. :P --Yair rand (talk) 19:50, 10 July 2018 (UTC)
Not really frivolous, although the method I suggested is wrong. But if I made a list of citizens of the United States from Wikidata, I wouldn't expect fictional characters to be included, since their citizenship status is purely fictional. Saying that I should restrict my list to only humans is mistaken, since only humans (as far as I know) are eligible for such status, and anything else returned is just an error. My preference, rather than creating one or more items for fictional versions of the United States, would be to simply delete the statement from Hulk (Q188760). Ghouston (talk) 03:36, 11 July 2018 (UTC)
To put it another way, Hulk (Q188760) is not a person, but a work. It's like saying . Valid statements may use depicts (P180), maybe with qualifiers or a variant of it. Ghouston (talk) 03:56, 11 July 2018 (UTC)
@Ghouston: I agree it's a bit odd to need to say "is a human" as part of a query that feels like it should only be about people (and personally, I'm terrible at remembering to do it), but most of the properties used for people are not (currently) defined as "real people only". Strictly speaking, the way to construct a list of "citizens of the United States" is to include a filter saying "and they have to be people" - much as if we construct a list of "things in a geographic area" we need to include a filter to clarify we mean "real things" and don't pick up, eg, historical battles with coordinates.
If we have fictional items, we're going to need to handle them some way or another. I think there are probably four ways (have I missed any others?) to deal with fictional items turning up unexpectedly in unfiltered searches -
  1. - don't have any fictional items, or if we must have them, don't put any "human" metadata on them. This is probably not an approach we want to go down; we have 50,000+ of them already, and it is useful and interesting to be able to run queries for "fictional Canadians", "fictional Presidents", or "fictional dentists".
  2. - normal properties with new fictional values, as @TomT0m: suggested here (country: fictional USA). This is sort of similar to what we do with sex or gender (P21) and animals. One problem here is that it's not just an issue for "country" - there are quite a lot of properties we might want to apply to fictional people (birthplace, educated at, job, etc), all of which might otherwise show up in searches for those topics. It also doesn't work for all types of properties (eg "show me people born on XX-YY-ZZZZ" can't have a value marked as fictional). The biggest problem here seems to be the massive potential for maintenance work to keep the fictional items aligned with the real ones and police their usage.
  3. - new fictional properties with normal values, eg "fictional place of birth: USA". This would be a bit less of a maintenance headache than #2, but we would still need to create parallel "fictional properties" for a wide range of different things.
  4. - define the item as fictional, but otherwise use normal properties and normal values, and explicitly filter searches to human/fictional by adding P31:Q5 in searches. This is more or less the current status quo, but it does mean a bit of extra work in the queries (and we all sometimes forget). We do not have to create any special fictional parallel items, except for eg places that only exist in fictional universes, which we'd need to do anyway.
It's also worth remembering that this isn't just an issue with fictional people. Non-people can have some of these properties - Larry (Q2262318) is undeniably real, for example, and has a real birthplace and date of birth. But you would be a bit surprised to have him returned in a search for "born in London in 2007" if you presumed that meant "people". So the use of fictional values/properties would break here - an instance filter is the only approach that guarantees real-people-not-cats for your searches as well as real-people-not-fictional-people.
Likewise, fictional things aren't just people; we have fictional companies, fictional books, fictional towns, etc. All of these might have various properties to describe them - so we could use the approach of fictional values or fictional properties, but it extends the problem further and requires us to create yet more parallel fictional items or properties (eg fictional co-ordinates?). I really worry that this opens us up to a huge amount of labour and headache.
On the whole, leaving "real values in real properties" for fictional items, explicitly marking them as fictional, and encouraging filtering in searches, seems the least complicated and most sustainable option. It's not perfect, but it's better than the alternatives. Andrew Gray (talk) 12:07, 11 July 2018 (UTC)
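The filtering approach in option 4 can be illustrated with a small Python sketch. It uses a toy list of records rather than a real query: the records, the field names and the `born_in` helper are hypothetical, and only the Larry example (born in London in 2007) comes from the discussion above.

```python
# Toy records: labels and field names are illustrative assumptions,
# standing in for items with ordinary properties and values.
ITEMS = [
    {"label": "some real Londoner", "instance_of": "human",
     "place_of_birth": "London", "birth_year": 2007},
    {"label": "Larry", "instance_of": "house cat",
     "place_of_birth": "London", "birth_year": 2007},
    {"label": "some fictional Londoner", "instance_of": "fictional human",
     "place_of_birth": "London", "birth_year": 2007},
]


def born_in(items, place, year, instance_of=None):
    """Labels of items born in `place` in `year`.

    When `instance_of` is given, it acts as the explicit class filter
    (the equivalent of adding P31:Q5 to a search)."""
    return [
        item["label"]
        for item in items
        if item["place_of_birth"] == place
        and item["birth_year"] == year
        and (instance_of is None or item["instance_of"] == instance_of)
    ]


print(born_in(ITEMS, "London", 2007))
print(born_in(ITEMS, "London", 2007, instance_of="human"))
```

Without the class filter the cat and the fictional character are returned alongside the real person. With it, only the human remains, which is why the same filter covers both the real-versus-fictional and the people-versus-cats cases.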
« The biggest problem here seems to be the massive potential for maintenance work to keep the fictional items aligned with the real ones and police their usage. » This is a non-issue, as there is no reason to « keep them aligned » in a strict sense. fictional analog of (P1074) is meant to connect fiction items to their real-world ones. If you want a strict alignment, just cross the line and get the real-world values. Moreover, by the construction of the fictional universe involved, fictional items may not be aligned with their real-world analog, so aligning them strictly may even be counterproductive.
« Likewise, fictional things aren't just people; we have fictional companies, fictional books, fictional towns, etc. All of these might have various properties to describe them - so we could use the approach of fictional values or fictional properties, but it extends the problem further and requires us to create yet more parallel fictional items or properties (eg fictional co-ordinates?). I really worry that this opens us up to a huge amount of labour and headache. » Yes, that’s why we have a « fictional stuff » class tree, rooted in fictional entity (Q14897293). Restricting the world to fictional entities amounts to searching only classes/instances of this class tree. I agree some kind of filtering this way is necessary if we use the same properties for real and fictional items, and I think no one wants to double the properties, as creating a property is a far bigger headache than creating an item. author  TomT0m / talk page 12:24, 11 July 2018 (UTC)
I suspect things would be simpler if we allowed the redundant use of on all fictional items, even where they also inherit it indirectly. - Jmabel (talk) 16:00, 11 July 2018 (UTC)
Larry the cat born in London in 2007 is fine, since it's a real event that happened. A statement about The Hulk being born any place or any time is not fine, because it's not a real event. It would be no better than saying that a fictional character is an instance of human, just because they represent a human in fiction. Ghouston (talk) 02:32, 12 July 2018 (UTC)
From a purely academic point of view, you're probably right that fictional characters aren't actually citizens of real-life countries, but of fictional versions of those, just like they aren't actually humans/homo sapiens, but members of a fictional version. And Bruce Wayne isn't a billionaire, he's a fictional billionaire. But IMO, trying to actually model that in Wikidata would be a huge ton of work for little to no benefit. You'd have to create separate items for cities, countries, organizations, locations, worlds, professions, species, universes etc. for just about every fictional entity. And consider, there isn't just one Hulk, there are dozens of versions - the one from the original comics, the ones from each comic book reboot of the universe, the one from the 70s TV show, the one from the 2003 film, the ones from the various animated shows, the one from the Marvel Cinematic Universe, etc. Each one would have its own version of the United States. And then there's stuff like fiction within fiction, dreams, parallel universes, different timelines, etc. After Marty McFly returned from the past in Back to the Future (Q91540), was he still a citizen of the original fictional version of the US, or of a different version because he changed the past and lives in a different timeline now? And you'd have to deal with stuff like canon and shared universes to see whether works take place in the same universe and therefore their characters would be citizens of the same fictional version of the US - which is where you'd have to delve deep into the fandom and canon debates to figure this stuff out. And then there's stuff like the whole rabbit hole that is the Tommy Westphall universe... --Kam Solusar (talk) 13:19, 13 July 2018 (UTC)
@Kam Solusar: Please read my comments above, I think I already answered to this point that was raised. We definitely don’t have to create one item per fiction universe if we avoid that approach. author  TomT0m / talk page 14:19, 13 July 2018 (UTC)
I think the best approach would be to avoid real-world properties like country of citizenship (P27) in favour of depicts with qualifiers. E.g., "depicts human" with qualifiers, date of birth, citizenship, etc. Ghouston (talk) 02:57, 15 July 2018 (UTC)

Modelling a publication schedule

Following on from Wikidata:Property proposal/chart date:

The tracking week for sales and streaming begins on Friday and ends on Thursday, while the radio play tracking-week runs from Monday to Sunday. A new chart is compiled and officially released to the public by Billboard on Tuesday. Each chart is post-dated with the "week-ending" issue date four days after the charts are refreshed online (i.e., the following Saturday). For example:
  • Friday, January 1 – sales tracking-week begins, streaming tracking-week begins
  • Monday, January 4 – airplay tracking-week begins
  • Thursday, January 7 – sales tracking-week ends, streaming tracking-week ends
  • Sunday, January 10 – airplay tracking-week ends
  • Tuesday, January 12 – new chart released, with issue post-dated Saturday, January 16
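The example dates above can all be derived from the post-dated issue date by fixed day offsets. A minimal Python sketch, assuming the rules stay exactly as in the example; the event names, the `OFFSETS` table and the `chart_schedule` helper are made up for illustration, and 2016 is picked only because January 1 falls on a Friday in that year:

```python
from datetime import date, timedelta

# Day offsets relative to the post-dated issue date, read off the example
# (issue dated Saturday, January 16). Event names are illustrative.
OFFSETS = {
    "sales_streaming_tracking_start": -15,  # Friday, January 1
    "airplay_tracking_start": -12,          # Monday, January 4
    "sales_streaming_tracking_end": -9,     # Thursday, January 7
    "airplay_tracking_end": -6,             # Sunday, January 10
    "chart_released": -4,                   # Tuesday, January 12
}


def chart_schedule(issue_date):
    """Map each event name to its calendar date for a given issue date."""
    return {
        event: issue_date + timedelta(days=offset)
        for event, offset in OFFSETS.items()
    }


# 2016 is an assumption: a year in which January 1 is a Friday.
for event, day in chart_schedule(date(2016, 1, 16)).items():
    print(event, day.isoformat())
```

The offset of 15 days for the start of the sales tracking week matches the "point in time − 15 days" phrasing used in the question below.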

How would this be modelled, given that each chart is post-dated and this date (i.e. January 16 in example above) is the canonical point in time (P585) for each edition? (Other charts have different rules, such as being labelled according to the data collection period.) Would an item have to be created for every issue of the chart or publication, or would items like Easter − 2 days (Q14795488) (e.g. "two Fridays before last" or "point in time − 15 days") have to be used in conjunction with statements describing each of the events on the item for the chart? This could also be somewhat applicable for other publications which go to press at some point and are published online beforehand but are labelled with an entirely different date. Jc86035 (talk) 10:30, 11 July 2018 (UTC)

In the proposal at Wikidata:Property proposal/chart date (which is marked "not done") it is stated that publication date (P577) can't be used because the date the website is updated is a few days before the stated publication date. I do not agree that this is a barrier to using "publication date". A great many publications of all types make their publications available to the public before the stated publication date. Jc3s5h (talk) 13:27, 11 July 2018 (UTC)
@Jc3s5h: Yes, this is what I meant: how do we indicate both the actual publication date and the date printed in the work? The date of publication couldn't be indicated with stated as (P1932) or similar, because it's not the correct data type. Jc86035 (talk) 15:32, 11 July 2018 (UTC)
It's not customary, outside the Wikimedia Foundation projects, to state the actual publication date. It's customary to cite the publication date cited within the source. The purpose of a citation is to lead the reader to the source, and once tentatively found, allow the reader to confirm the correct source has been found. So the publication date stated in the source is usually what's important. What is your use case for giving the real publication date? Jc3s5h (talk) 15:48, 11 July 2018 (UTC)
If this reaches a conclusion, could someone make sure it is documented in (or at least linked from) Help:Dates and, possibly, mentioned in Help:Modelling/General#Time & dates? Thanks. - Jmabel (talk) 16:06, 11 July 2018 (UTC)
@Jc3s5h: The main reason I started this section was the other dates, although it might also be useful to have the actual publication date for some purposes (it might help for validation that some other publication hasn't written about the information in some entity before the entity was made public). Being able to model the time periods that data is collected would probably be useful, maybe in comparing different charts. Jc86035 (talk) 16:31, 11 July 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Draft proposal (second level indicates qualifiers):

The weakness of this sort of model is that if any of the qualifiers for a statement change, the statement has to be entirely duplicated to indicate the change, with all but the current statement set to deprecated. I'm also not sure what start time (P580) should be in the example (i.e. July 13, the date that the new data collection period began, or July 25, the date that the chart based on that data is labelled with). Jc86035 (talk) 08:41, 12 July 2018 (UTC)

Relative dates

Further to the above, how can Easter − 47 days (Q14914941) actually be modelled as being 47 days before Easter? Jc86035 (talk) 10:51, 11 July 2018 (UTC)

Seems to me as the same as Mardi Gras (Q35105)--Jarekt (talk) 13:00, 11 July 2018 (UTC)
@Jarekt: It seems like the item, and the others at Help:Easter related dates, were created solely for the purpose of modelling the relationships, although they don't really do much because there's no data in the statements. Jc86035 (talk) 15:29, 11 July 2018 (UTC)
If this reaches a conclusion, could someone make sure it is documented in (or at least linked from) Help:Dates? Thanks. - Jmabel (talk) 16:07, 11 July 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Jmabel, Jarekt: Would it be useful to have "relative date" properties (values like "-2 days"), possibly with either a set qualifier to indicate what the thing is relative to, and a matching pair property for each like "relative date qualifier refers to" for situations where "relative date" properties are themselves used as qualifiers? This could be extended e.g. with "relative start date" like in the section above. Jc86035 (talk) 09:08, 12 July 2018 (UTC)

  • I'm not sure I follow all of that, but, yes, the ability to say "relative to this holiday" (Easter, etc.) and give a number of days before or after would be useful. Similarly, to be able to say something like "three days before the first of any month". - Jmabel (talk) 15:14, 12 July 2018 (UTC)
I think I am fine with Easter − 47 days (Q14914941) kept separate from Mardi Gras (Q35105). I can not think of a better way to model that. --Jarekt (talk) 16:51, 12 July 2018 (UTC)
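For comparison, outside Wikidata a relative date such as "Easter − 47 days" is just a base date plus a signed offset. A minimal Python sketch, assuming Western (Gregorian) Easter and using the well-known anonymous Gregorian computus; the function names are hypothetical and nothing here reflects an existing Wikidata property:

```python
from datetime import date, timedelta


def gregorian_easter(year):
    """Date of Western (Gregorian) Easter Sunday for the given year,
    computed with the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day_index = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day_index + 1)


def relative_to_easter(year, offset_days):
    """Date that lies offset_days after Easter (negative means before)."""
    return gregorian_easter(year) + timedelta(days=offset_days)


print(gregorian_easter(2018))         # 2018-04-01
print(relative_to_easter(2018, -47))  # 2018-02-13
```

The pair (reference event, signed number of days) is the same information that a "relative date" value with a "relative to" qualifier, as suggested above, would need to carry.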

Scholarly projects and notability policy.

There has been a lot of interest among the scholarly community in using Wikidata to store linked open data records from academic projects. At the same time, there has been concern about community policies that might veto items scholars wish to add. (This was the main topic of conversation in a 30-person Wikidata break-out discussion at a Linked Data workshop at Digital Humanities two weeks ago.)

We are one such project and are currently evaluating Wikidata as a platform to store biographical facts about people, places, organizations, and events which historians have researched during the course of creating documentary editions. We would want to import these facts from several edition projects into Wikidata, so that they could be re-used by the community and by historians working on documents from similar time-frames. Such a database (of 19th-century USA, in our case) has attracted some support from other scholars who wish to contribute similar datasets of e.g. all soldiers in the United States Colored Troops. But would these contributions be acceptable to the Wikidata community? Does an item like Nathaniel Oldham (a Kentucky barber in 1860 who is mentioned in official documents) belong in Wikidata, or should we look for a different home for this data? Nineteenth Century Digital Cooperative (talk) 17:07, 11 July 2018 (UTC)

@Nineteenth Century Digital Cooperative: Thanks for asking here! Do you have a rough estimate of the number of items that might be created? If it's less than 1 million I think wikidata would be a fine home for this sort of thing. If we are likely talking about millions, tens of millions or more, then it might be better to look into creating a separate wikibase instance (the same software) and looking into federation. I know User:Addshore has done a lot of work on this - for example there's a simple docker image that you can use to run the software now - also see his notes here. ArthurPSmith (talk) 17:30, 11 July 2018 (UTC)
That's certainly encouraging! At the moment we're only talking about tens of thousands of records, though projects like the USCT could bring that into the hundreds of thousands. I suppose that it's conceivable that our project could grow to encompass a census digitization, bringing it into the tens of millions, but that's highly unlikely.
Another question is whether Wikidata would accept items regarding people with incomplete names, such as enslaved individuals referred to without surnames, or individuals who have been researched by historians but whose names are not fully known, since they are referred to as "X's wife", etc. A good example of this from a different project is how Prosopography of the Byzantine World handles "anonymous" people: unnamed strategos of Thrace. Would entries for these people also be acceptable? Nineteenth Century Digital Cooperative (talk) 17:48, 11 July 2018 (UTC)
Oh, and the Docker image by User:Addshore is great! We got that up and running yesterday, since we've been trying to research both options in parallel. Nineteenth Century Digital Cooperative (talk) 18:13, 11 July 2018 (UTC)
If you have a project that adds hundreds of thousands of entries, the proper way to go about it is to create a bot proposal; the bot proposal is then the place where the discussion about whether or not the dataset belongs takes place.
Items have to point to clearly identifiable entities. If I create an item called "John Smith" it's not possible for anybody to identify which John Smith the item refers to. If an otherwise notable person has one wife and we don't know the name of the wife, that's however enough to identify which human is meant. It always has to be possible for other people to build on the item, and enough information needs to be provided for that to be possible.
As far as sources go, I consider any item that's well-enough sourced to be of interest to serious historians to have enough sources for Wikidata. When it comes to the example of the barber, please fill in floruit (P1317) when you don't know the date of birth or date of death but you do know that the person lived at a particular point in time.
In general it's preferable if the data isn't only hosted on Wikidata but also on the website of the relevant project that submits the data, so that we can link to the data via an external identifier. ChristianKl❫ 18:33, 11 July 2018 (UTC)
Thanks, @ChristianKl: -- that's very useful guidance. It sounds like we should run our own instance then (which will also allow us to host our legacy narrative biographies) and submit data donations to Wikidata to follow best practices? Nineteenth Century Digital Cooperative (talk) 14:27, 13 July 2018 (UTC)
@Nineteenth Century Digital Cooperative: The way you describe your data sounds to me like it might be a good fit for the FactGrid wiki (described at Wikidata:FactGrid). This is a Wikibase instance that is currently being set up and filled with data (including incomplete information like hypotheses about the identity of a person, location, document or event) related to the Illuminati (Q133957) as a pilot project, but the intention is to make it useful to research in the humanities more generally. Pinging User:Olaf Simons, who runs the project. --Daniel Mietchen (talk) 20:04, 11 July 2018 (UTC)
Thank you for the invitation! It sounds like your project has some interesting parallels to ours. I'm curious whether you imported properties from Wikidata (and possibly some items?) to simplify data donation/reconciliation later? Nineteenth Century Digital Cooperative (talk) 14:27, 13 July 2018 (UTC)
Items for "non-notable" people do get deleted. I nominated Nathaniel Oldham (Q55445723) for deletion at Wikidata:Requests_for_deletions#Nathaniel_Oldham_(Q55445723) so that we can find out if it applies in this case. Ghouston (talk) 06:01, 12 July 2018 (UTC)
You're making a mistake here Ghouston, Wikidata:Requests for deletions is for implementing the policy, not for making or discussing it. Multichill (talk) 14:54, 12 July 2018 (UTC)
I do appreciate Ghouston floating the item as a trial balloon -- it's very much in keeping with the exploratory phase our project is in. Nineteenth Century Digital Cooperative (talk) 14:27, 13 July 2018 (UTC)
Nevertheless, I would hate to see an item being deleted by mistake. If an item has an external identifier on it (eg linking to a master-record on your site), or even just a statement on focus list of Wikimedia project (P5008) = <an item for your project>, those can both usefully flag up that an item may be in ongoing development, or may have significance as part of a wider project, otherwise this might not be appreciated. Jheald (talk) 14:43, 13 July 2018 (UTC)
I find the notability policy really unclear. "It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references. If there is no item about you yet, you are probably not notable." The first part is clear, but is a pretty low bar. The second is vague, and could use a link to a list of examples of notable and non-notable things. The third is just weird, since it implies you can't make an item unless it already has an item. However, Multichill may be right about the deletion request: it will probably just sit there for six months with no resolution. Ghouston (talk) 00:12, 14 July 2018 (UTC)
if you do not understand it, maybe you should not nominate things. maybe we should keep things that "fulfills some structural need". Slowking4 (talk) 03:05, 16 July 2018 (UTC)
I don't see how "structural need" would be relevant for the item I nominated, since no other items link to it. But perhaps I don't understand that clause either. Ghouston (talk) 06:35, 16 July 2018 (UTC)

Off-wiki consensus[edit]

Four months ago, there was a dispute over the validity of community decisions made off-wiki. Some users will occasionally discuss Wikidata issues using off-wiki methods of communication, such as the mailing list, IRC, direct email, social media, and even at IRL events such as Wikimania. I think we need to determine whether decisions made using such off-wiki channels can ever be called "community consensus" and be considered valid. --Yair rand (talk) 23:20, 11 July 2018 (UTC)

  • I'm sure they work for offwiki people, but why should they be valid onwiki?
    --- Jura 08:09, 12 July 2018 (UTC)
  • The more people, the harder it gets to get any consensus. Besides sometimes it is difficult to take any decision on wiki, because people just drop their "oppose" and disappear from the conversation, which is very uncivil but it happens very often. In general if it is an important consensus for the on-wiki world, I guess it is necessary to inform about it at least here on the project chat. --Micru (talk) 11:08, 12 July 2018 (UTC)
  • I would say that any off-Wiki discussion like this should be posted to Project Chat, more or less in the format "In a recent conversation at [place], a group of Wikimedians agreed that [whatever was agreed to]. Notes from that conversation are available at [link]. Is there general consensus here that this should be our best practice? Is there something we failed to consider?" In the final analysis, consensus can only be achieved by making the conversation available to all Wikidatans. Not all Wikidatans will have the opportunity to participate in a discussion at an event, or on Facebook or irc. - PKM (talk) 20:06, 12 July 2018 (UTC)
    • Aren't there projects that have rules banning discussions on Facebook and the like?
      --- Jura 22:19, 12 July 2018 (UTC)
      • Why should discussions on Fb be banned? It is a discussion platform as valid as any other.--Micru (talk) 11:48, 13 July 2018 (UTC)
        • It excludes people without accounts who may be interested in contributing. Ghouston (talk) 02:38, 15 July 2018 (UTC)
        • Also, people with accounts that don't want their real name linked with their Wikimedia activities. Ghouston (talk) 02:38, 15 July 2018 (UTC)
          • you cannot govern off wiki activity, by fulminating here. if people think their deliberations are appropriate, then they will mention it on chat. if you want to encourage more on wiki discussion, then you need to make it more friendly, and easier than wikicode. Slowking4 (talk) 02:58, 16 July 2018 (UTC)

Possible vandalism[edit]

Does anyone actually know the mass of Kylie Jenner (Q1770624) right now? An IP has changed it from 56 to 65, and changed her eye colour statement. I personally think this is not useful information to have, since the mass of a human is highly variable and the statement does not have a time qualifier. Jc86035 (talk) 08:07, 12 July 2018 (UTC)

Added 1 year semi-protection and removed some unreferenced claims which have been vandalized way too often to evaluate the original value and its provenance. —MisterSynergy (talk) 08:28, 12 July 2018 (UTC)

Missing wikidata properties with property Id P0- P5[edit]

Hi all,

Please can anybody give some information about missing properties P0-P5 in the property list. In some text data, I got a mention of properties within this range but here in the official wiki page, I am not able to find them. If they are deprecated, please provide a link from where I can know about them.

Thanks in advance! Parul0912 (talk) 08:35, 12 July 2018 (UTC)Parul

  • I'm curious about that. What does your text say? Occasionally, numbers get skipped, so they may never have existed.
    --- Jura 08:44, 12 July 2018 (UTC)
    • There are indeed no deletion logs for any property lower than P6, so they might not have existed at any time. —MisterSynergy (talk) 09:23, 12 July 2018 (UTC)
      • They never existed indeed. It is because the IDs were reserved for items and accidentally also were reserved for properties. Uuuupppps ;-) --Lydia Pintscher (WMDE) (talk) 13:49, 12 July 2018 (UTC)
        • It exists, because in the Wikidata dumps we can find many entries with P0 to P5. – The preceding comment was unsigned.

────────────────────────────────────────────────────────────────────────────────────────────────────

  • Could you list a few samples?
    --- Jura 07:07, 13 July 2018 (UTC)
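One way to check which of these low property IDs resolve is to ask the Wikidata API directly. The following is a minimal Python sketch, not an official tool: the helper names `entities_url` and `split_existing` are made up for illustration, and it assumes that `wbgetentities` reports a non-existent but well-formed ID with a `missing` key in its entry. It only builds the request URL and interprets an already-parsed JSON response, so fetching the URL is left to the reader.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint.
API = "https://www.wikidata.org/w/api.php"


def entities_url(ids):
    """Build a wbgetentities request URL for the given entity IDs (hypothetical helper)."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),   # multiple IDs are pipe-separated
        "props": "info",        # basic info is enough to test for existence
        "format": "json",
    }
    return API + "?" + urlencode(params)


def split_existing(response):
    """Split the IDs in a parsed response into (existing, missing).

    Assumption: an entry for a non-existent entity carries a "missing" key.
    """
    existing, missing = [], []
    for entity_id, entity in response.get("entities", {}).items():
        if "missing" in entity:
            missing.append(entity_id)
        else:
            existing.append(entity_id)
    return sorted(existing), sorted(missing)
```

Fetching `entities_url(["P1", "P2", "P3", "P4", "P5", "P6"])` and passing the decoded JSON to `split_existing` would then show which IDs the live site knows about. This says nothing about what a particular dump or third-party text contains.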

Railway depots[edit]

On San Gottardo railway station (Q16609440) I would like to add the property 'Railway depot'. This property does not seem to exist and is quite separate from station. In this case it is also used as a tram depot. No distinction should be made between tram and train. The distinction between tram and train is in some cases unclear, not to mention metros and light rail. I think railway/tram workplaces (heavy maintenance) should be put on the same (new?) property. Smiley.toerist (talk) 08:46, 12 July 2018 (UTC)

motive power depot (Q10283556) could be used, this states it's for locomotives but is being used more widely, reflecting the general nature of the label. There is also tram depot (Q27028153) specifically for trams and similar for buses: bus station (Q494829). Nothing specific for trains though, might make sense to create one as sub-class of motive power depot (Q10283556)? JerryL2017 (talk) 09:03, 12 July 2018 (UTC)
I have added tram depot (Q27028153) for tram depot. There may be some locomotives, but certainly the railway motorcars (see picture FdS Cagliari Largo Gennari.jpg) are maintained there. Smiley.toerist (talk) 09:38, 12 July 2018 (UTC)
@Smiley.toerist: See train depot (P834). Thierry Caro (talk) 11:28, 12 July 2018 (UTC)

Wouldn't it be better to link the depot with has facility (P912) instead of instance of (P31)? Ahoerstemeier (talk) 13:58, 12 July 2018 (UTC)

@Ahoerstemeier: My suggestion is that the P912 should better be used e.g. "hey, this electrified railway line has 35 substations". --Liuxinyu970226 (talk) 12:37, 13 July 2018 (UTC)

Festál[edit]

Festál at Seattle Center (Q5445891) is an annual series of (mostly) ethnic festivals at Seattle Center. I've tried to model a citeable statement that there were 24 of these festivals as of 2009. I have no idea if there might be a better way to model this than what I did; can someone have a look? - Jmabel (talk) 15:31, 12 July 2018 (UTC)

  • I added a couple of statements.
    --- Jura 15:50, 12 July 2018 (UTC)

Vashon and Maury Islands[edit]

How can we model that some sources consider Maury Island (Q6793978) part of Vashon Island (Q12834566) while others consider them two distinct islands? - Jmabel (talk) 15:47, 12 July 2018 (UTC)

I would create two items for Vashon Island (Q12834566), one for the smaller physical contiguous island, and a second for the combined area of both. If there's some political/administrative structure associated with the latter maybe that's what its class should be, as it's not strictly an instance of "island". A wikidata item should correspond with a single conceptual entity, so two distinct concepts should have two items. ArthurPSmith (talk) 18:56, 12 July 2018 (UTC)

Something has gone wrong...[edit]

ʻahuʻula (Q8083959) seems to be about a feather cloak, but describes The Death of Captain James Cook (Q7729420)? --Magnus Manske (talk) 21:28, 12 July 2018 (UTC)

@Magnus Manske: This painting infobox has been there for eight years. @Edgars2007, Aktron, Beko: who should be made aware of this blunder (Islandbaygardener is no longer around). Mahir256 (talk) 21:35, 12 July 2018 (UTC)
The hook for the image in the en.wikipedia article being "One of these cloaks was included in a painting of Cook's death by Johann Zoffany." --Tagishsimon (talk) 22:01, 12 July 2018 (UTC)
The painting and the ʻahuʻula (Q8083959) itself need to be separate items. I'll clean this up. - PKM (talk) 21:52, 14 July 2018 (UTC)
@Magnus Manske, Tagishsimon: Okay, now existing The Death of Captain James Cook (Q7729420) <depicts> ʻahuʻula (Q8083959) and mahiole (Q4285713), which are both built out. If anyone has a line on the actual ʻahuʻula and mahiole borrowed by Zoffany from "a museum in Vienna", we should give them their own items and add them to the list of things depicted in the painting. - PKM (talk) 23:31, 14 July 2018 (UTC)
@PKM: This stunning picture of a featherwork cloak appeared in my twitter feed today [2]. Now wondering whether to go to the talk on Thursday. [3] -- being advertised with the headline image from your newly sorted-out item ʻahuʻula (Q8083959). Jheald (talk) 16:21, 16 July 2018 (UTC)
@Jheald: Serendipity! - PKM (talk) 19:21, 16 July 2018 (UTC)

Use an image as reference[edit]

วัดหนองเค็ด.jpg

In Wat Nong Khet (Q55225353) I added an image from Commons as a reference, because that photo of the temple sign easily (if one is able to read Thai) proves that location statement. However, this gives a warning that the property image isn't supposed to be used as a reference. Is there any better way to do this? Ahoerstemeier (talk) 08:49, 13 July 2018 (UTC)

Thanks, wasn't aware of the tombstone reference. Is there any subclass of signage (Q1211272) which better describes this sign? Ahoerstemeier (talk) 09:58, 13 July 2018 (UTC)

My property deletion proposal[edit]

Am I allowed to withdraw a property deletion proposal? I proposed number of platform tracks (P1103) for deletion in May 2017 due to the data being polluted due to incorrect usage and mistranslated property labels and descriptions, and the discussion still has not been closed. (I think there should really be guidance for closing discussions as no consensus.)

At this point, since the data is still somewhat useful, if the discussion is closed and the property is kept, I think that there should be a check of all values against values of different Wikipedias, and/or the conversion of values to either has part (P527) or new properties if it is found that the count of "number of platforms" is based not on "number of platform tracks", but something else (i.e. number of usable platform edges; number of platform edges in use; number of distinct platform numbers; number of independently usable boarding areas on platforms; number of continuous floor surfaces which have usable platform faces, …). Jc86035 (talk) 15:59, 13 July 2018 (UTC)

Nobels[edit]

Hello. It's a detail but I would rather have Nobel prize ID (P3188) split into different properties. The IDs would be shorter and would look more like actual IDs than pieces of URLs. Constraints would be tighter. And it would also be useful for the French Wikipedia. We would have articles about scientific laureates displaying links through Template:Research links (Q54913733) while the ones about winning writers would use Template:Literature links (Q54933000), for instance. Again, it's certainly just a detail but would you mind? Thierry Caro (talk) 16:01, 13 July 2018 (UTC)

Hi. I think this would be a positive change. This would indeed refine the properties. Nomen ad hoc (talk) 16:12, 13 July 2018 (UTC).
Ping @Pigsonthewing, YMS, YULdigitalpreservation, Edgars2007, Richard Arthur Norton (1958- ), Pamputt:, who supported the creation of a unique identifier. Nomen ad hoc (talk) 11:12, 14 July 2018 (UTC).

WE-Framework 2.0[edit]

Hello, Wikidata community. The not-well-known WE-F gadget just released version 2.0. There are not a lot of changes on the outside, but internally it's a full rewrite. Details follow.

New features and secrets[edit]

Aliases editing
  • Gadget supports aliases editing now.
  • There are some changes for datavalues editors:
    • Gadget will not break your data until the first edit of a particular claim/qualifier. The old-version bug with quantities (when the Wikidata team broke compatibility) will not happen. Thanks to the redux library.
new time datavalue editor
    • Internal time datavalue editor now relies on Wikidata API for parsing and formatting of values.
    • wikidata-item editor:
      • will provide some image preview on search.
      • will advise on allowed values for some properties if a one-of constraint (Q21510859) is defined
      • an editor can still select the "other" option and enter any value
    • unsupported datatypes will still try to render existing values using Wikidata API (and will NOT break gadget).
qualifier-per-row editing mode
qualifier-per-column editing mode
    • There is no more "columns" modifier for claim editors. Claim editors will automatically switch from per-row view to table view when some magic conditions are met. For example, if you have 2-3 values for the population (P1082) property you will see the "standard" per-row editor. As soon as you have at least 5 values and at least 20% of them have a single qualifier (i.e. point in time (P585)), the editor will switch to table view. Magic.
    • There is no "flag" attribute for property editors. Flag will be automatically obtained from country (P17) reference of property.
    • There is no "search" attribute for property editors. Gadget will automatically take source website for the property (P1896) value and create a magic button for Google search.
SPARQL-based property tab
    • There is an additional SPARQL-group of properties. Now some editor tab can hold not a predefined set of properties, but a set of properties from SPARQL query. Very useful for quick-populated property sets like authority control (Q36524).
      • Same applies to external links editor. All tabs there are now SPARQL-groups.
      • Properties are sorted for your convenience. Properties with sites in your language go first, those in the current project language come second, and the others are sorted by label.
  • Saving is now single request operation. You don't need separate request for removing claims (as in 1.x version). Well, you still need additional requests to set tags, but they are faster and safe to skip.
Enabled editors settings for we-framework
  • It is now possible to select which editors are enabled and which are not. Settings are stored in localStorage and bound to particular project. You can have different settings (enabled editors) on different projects.
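The automatic switch from per-row view to table view described in the list above can be sketched as a simple threshold check. This is an illustrative Python sketch only: the gadget itself is written in JavaScript, its exact "magic" conditions are not spelled out here, and the function name, constants and data shape below are assumptions. Each claim is represented as a mapping from a qualifier property ID to its values.

```python
from typing import Dict, List

# Thresholds taken from the description above; the names are hypothetical.
MIN_CLAIMS = 5                      # "at least 5 values"
MIN_SINGLE_QUALIFIER_SHARE = 0.20   # "at least 20% of them have a single qualifier"


def use_table_view(claims: List[Dict[str, list]]) -> bool:
    """Return True if a property's claims should be shown as a table.

    Each claim is a dict mapping qualifier property IDs (e.g. "P585")
    to a list of qualifier values; an empty dict means no qualifiers.
    """
    if len(claims) < MIN_CLAIMS:
        return False
    # Count claims that carry exactly one kind of qualifier.
    single = sum(1 for qualifiers in claims if len(qualifiers) == 1)
    return single / len(claims) >= MIN_SINGLE_QUALIFIER_SHARE
```

With this reading, three population values stay in the per-row editor no matter how they are qualified, while five values of which two carry only a point in time (P585) qualifier would switch to the table view.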

Technology changes[edit]

  • Instead of a single JavaScript file, a new webpack+babel+(karma+mocha) build system is used. The non-minified version of the gadget is 2.7 MiB!
    • The published version is minified to 850 KiB (before gzip). But it loads faster than the 1.0 version (because of webpack optimizations).
    • There are autotests. Not all scenarios are covered, but some are.
    • Gadget is rewritten with the react+redux+reselect libraries. Some third-party libraries are "react-autosuggest", "react-tagsinput", "semantic-ui-react" (for popup) and... jQuery! Yes, MediaWiki still relies on jQuery a lot. Dialogs and tabs are "native" jQuery widgets.
  • React is a fast rendering library (because of virtual DOM usage). But be aware of long tables. Some tabs (SPARQL ones) may have >1400 different claims. They may require a fast CPU.

How to migrate[edit]

You don't need to do anything. Update script will handle it for you automatically.

How to try?[edit]

Follow the white rabbit instructions.

Discussion[edit]

Thank you a lot Vlsergey! Very nice improvement, the "editors settings" in particular is a very good idea (on Wikisources, I only use FRBR Edition and sometimes FRBR Work but never the others). Cdlt, VIGNERON (talk) 10:41, 15 July 2018 (UTC)
Wow, congratulations, it looks like you put a lot of work into it!--Micru (talk) 14:02, 15 July 2018 (UTC)

Housing estates and planned communities[edit]

We have

< Category:Housing estates (Q8526469) > category's main topic (P301) < housing estate (Q12104567) >
< Category:Planned communities (Q8765027) > category's main topic (P301) < housing estate (Q12104567) >

I would presume the latter is simply wrong; I would expect

< Category:Planned communities (Q8765027) > category's main topic (P301) < planned community (Q1074523) >

but the existing value is cited as imported from the German-language Wikipedia. As I understand it, I'm not supposed to override a cited statement without a citation, but of course once this information was migrated onto Wikidata it was removed from all the Wikipedias, which now draw it from Wikidata, so I have nothing to cite. Can I just ignore that and make the (I presume) obvious correction? If so, do we really need to keep around the cited, deprecated, (I think) obviously wrong value? How does this work? - Jmabel (talk) 01:16, 14 July 2018 (UTC)

  • Personally, I'd just make the change. I doubt anybody would insist on retaining incorrect connections just because they were imported from Wikipedia. Ghouston (talk) 04:14, 14 July 2018 (UTC)
  • Agree with Ghouston. Though before doing so it is always worth checking the Wikipedia in that cited language, to see whether that can give a clue to how the statement that looks odd in English came about. Sitelinks can sometimes be a bit approximate, so a statement that looks odd in English may seem a lot less odd in the context of the other language. Here though the category Category:Planned communities (Q8765027) in de-wiki includes things like "garden city", so should indeed be identified with a rather broader topic than housing estate (Q12104567). Jheald (talk) 07:08, 14 July 2018 (UTC)
  • Glad to hear it. Do we have any rule about exactly where the line falls between things where we can just make changes like this and where we have to deprecate? Because about a month ago, in a case that didn't strike me as terribly different from this (though it was a bit different), I got reverted as something like "vandalism or ignorance of process". - Jmabel (talk) 08:32, 14 July 2018 (UTC)
  • So... I made that change and ran into a warning because (I didn't notice this before) and . We do not have an item for "planned city" distinct from "planned community". I can think of more than one way one could sort this out, but at this point I think I'm going to let go and defer to those of you who have been doing Wikidata longer than I. - Jmabel (talk) 08:38, 14 July 2018 (UTC)

British Q849811[edit]

In Welsh, some of the items listed on the British (Q849811) disambiguation page item would be split between Brythonig (relating to things that were on the islands of Britain before the Anglo-Saxon-Jute migration) and Prydeinig (relating to British in its modern sense), which seems to suit the needs of other languages that have a Q849811 page. How do I translate the item? Using either/or for the same number wouldn't be correct in Welsh. AlwynapHuw (talk) 02:53, 14 July 2018 (UTC)

In English, I'd use "Brythonic" aliases "Brittonic", "Early British" and "British" for Welsh Brythonig, and just "British" for the modern sense, making the distinctions clear in the descriptions of the various items. - PKM (talk) 22:17, 14 July 2018 (UTC)

Forsythia ovata vs Forsythia koreana[edit]

I wonder if Forsythia ovata (Q428340) (Korean forsythia) is a synonym of Forsythia koreana (Q12583210) or vice versa? My usual methods are failing me. Abductive (talk) 08:12, 14 July 2018 (UTC)

Temple De Hirsch Sinai[edit]

Temple De Hirsch Sinai (Q7698513) is a weird amalgam of data that applies only to the mostly demolished old Temple De Hirsch (built 1908), clearly the subject of an NRHP listing in 1984, the present-day merged congregation Temple De Hirsch Sinai (the inception date given here is 1971); it is described as an instance of a synagogue rather than of a congregation, though, and the congregation has two present-day synagogues (one in Seattle, one in nearby Bellevue). There is no separate item for either precursor congregation (Temple De Hirsch, founded 1899, and Temple Sinai, founded 1961). I'm guessing there should be separate items for at least:

  • The 1908 synagogue
  • Each of the two extant synagogues
    • And I just found out that the current Bellevue campus isn't the original Temple Sinai in Bellevue.
  • The present-day congregation, embracing all of these, plus various schools etc.
  • Each of the two precursor congregations

Anything else? Anything above that seems to be headed the wrong way? - Jmabel (talk) 09:16, 14 July 2018 (UTC)

Looks sensible to me.--Ymblanter (talk) 10:33, 14 July 2018 (UTC)
Trying to move ahead with this. I've pulled out Temple De Hirsch (Q55540966) for the 1908 synagogue, and will work on the two extant synagogues [and the earlier Bellevue synagogue]. However, from what I can see, we don't currently have a class suitable to a Jewish congregation. This seems remarkable after so many years, so perhaps I am missing something. We have congregation (Q2638480) that is specifically Christian; congregation (Q2135977), which might be OK judging by the linked Spanish-language article, but some of the other linked articles (e.g. nl:Gemeente (kerk)) imply a specifically Protestant usage; also, the current description for congregation (Q2135977) doesn't seem to match the linked articles. Do we have a more general item (perhaps under some other name) that would include a Jewish congregation? - Jmabel (talk) 00:09, 15 July 2018 (UTC)

Father and son - same image[edit]

https://commons.wikimedia.org/wiki/File:Gulden_Cabinet_-_Raphael_Sadelar_p_465.jpg 85.182.58.222 09:30, 14 July 2018 (UTC)

The picture itself says he was born 1555, which probably means this is the Elder (now generally said to be born ~1560). - PKM (talk) 00:06, 15 July 2018 (UTC)

Places along a Tour de France stage[edit]

Is there a way to include (major) cities along the route of a Tour de France stage? Right now only start and destination point are documented (example: today).--95.222.168.248 13:32, 14 July 2018 (UTC)

The property via (P2825) will work well for this, so, yes. --Tagishsimon (talk) 22:47, 14 July 2018 (UTC)

i make web for noval .information about Noval..[edit]

Hello, I am Shiv. Can you help me with some knowledge about "Noval": what is "Noval", which types of Noval are there, and who are the famous Noval writers? Please reply as early as possible.

What questions concerning the strategy process do you have?[edit]

Hi!

I'm Tar Lócesilion, a Polish Wikipedia admin and a member of Wikimedia Polska. Last year, I worked for Wikimedia Foundation as a liaison between communities and the Movement Strategy core team. My task was to ensure that all online communities were aware of the movement-wide strategy discussion. This year, my task is similar. Phase II of the strategy process was launched in April. Currently, future Working Groups members are being selected, and related pages on Meta-Wiki are being designed.

I’d like to learn which questions concerning the strategy process you would like to see answered on the FAQ page. Please answer here, on my talk page, or on a dedicated talk page on Meta-Wiki. Thanks!

If you have any questions or concerns, please, do ask!

Thanks, SGrabarczuk (WMF) (talk) 18:21, 14 July 2018 (UTC)

One objective is to ensure that we have a proper coverage of significant subjects. At this time Africans are less than 1% of Wikidata humans. We do not know its politicians, we do not know the territorial administrative divisions. Our support became worse thanks to the end of Wikipedia Zero. What are we going to do about subjects that have a massive impact on the audience we want? No, Wikimania 2018 does not count; as long as there are no results, it is window dressing. Thanks, GerardM (talk) 06:38, 15 July 2018 (UTC)
@GerardM: is this a FAQ thing? SGrabarczuk (WMF) (talk) 17:05, 15 July 2018 (UTC)
@SGrabarczuk (WMF): No idea what you are saying... The strategy talks about coverage and we are very much only doing the same stuff but in more detail. Thanks, GerardM (talk) 18:04, 15 July 2018 (UTC)
@GerardM:, we're building the pages concerning Phase II. Among others, FAQ. Please have a look at FAQ page with questions about Phase I that took place last year. The page we're currently building will be similar, but as the organizational frames are different (e.g. there will be working groups), questions that need to be answered should be different as well. SGrabarczuk (WMF) (talk) 18:22, 15 July 2018 (UTC)

publication date vs inception (in general)[edit]

--Micru (talk) 21:46, 24 August 2014 (UTC) Tobias1984 (talk) TomT0m (talk) Genewiki123 (talk) Emw (talk) 03:09, 9 September 2014 (UTC) —Ruud 16:15, 9 December 2014 (UTC) Emitraka (talk) 14:32, 14 October 2015 (UTC) Bovlb (talk) 19:10, 21 October 2015 (UTC) Peter F. Patel-Schneider (talk) 22:21, 23 October 2015 (UTC) ArthurPSmith (talk) 15:51, 5 November 2015 (UTC) --Daniel Mietchen (talk) 20:53, 3 January 2016 (UTC) --Harmonia Amanda (talk) 22:00, 27 February 2016 (UTC) --Lechatpito (talk) --Andrawaag (talk) 14:42, 13 April 2016 (UTC) --ChristianKl (talk) 16:22, 6 July 2016 (UTC) --Cmungall Cmungall (talk) 13:49, 8 July 2016 (UTC) Cord Wiljes (talk) 16:53, 28 September 2016 (UTC) DavRosen (talk) 23:07, 15 February 2017 (UTC) Vladimir Alexiev (talk) 07:01, 24 February 2017 (UTC) Pintoch (talk) 22:42, 5 March 2017 (UTC) Fuzheado (talk) 14:43, 15 May 2017 (UTC) YULdigitalpreservation (talk) 14:37, 14 June 2017 (UTC) PKM (talk) 00:24, 17 June 2017 (UTC) Fractaler (talk) 14:42, 17 June 2017 (UTC) Andreasmperu Diana de la Iglesia Jsamwrites (talk) Finn Årup Nielsen (fnielsen) (talk) 12:39, 24 August 2017 (UTC) Alessandro Piscopo (talk) 17:02, 4 September 2017 (UTC) Ptolusque (.-- .. -.- ..) 01:47, 14 September 2017 (UTC) Gamaliel (talk) --Horcrux92 (talk) 11:19, 12 November 2017 (UTC) MartinPoulter (talk) Bamyers99 (talk) 16:47, 18 March 2018 (UTC) Malore (talk) Wurstbruch (talk) 22:59, 4 April 2018 (UTC)


Notified participants of WikiProject Ontology

Currently, publication date (P577) isn't a subproperty of anything. Should it be a subproperty of inception (P571), or at least of start time (P580)?--Malore (talk) 22:40, 14 July 2018 (UTC)

  • Maybe point in time? Publication is after inception.
    --- Jura 23:02, 14 July 2018 (UTC)
@Jura1: You're absolutely right, I hadn't thought something exists also before being published.--Malore (talk) 23:32, 14 July 2018 (UTC)
I've been told that for public artworks, like statues, the unveiling date is recorded in Wikidata as inception, not publication date. Ghouston (talk) 02:05, 15 July 2018 (UTC)
There's also an issue that publication date (P577) is not merely "date of publication", but also bearing the alias "date of first publication". There is no means to distinguish date of first publication for an item from the date of printing for a specific version of that item. Many books have a print date for the particular print run as well as an edition date for release of that edition.
There is also no means to distinguish the date printed on a particular publication from the date an item was actually published. Such differences of dating are critical in biological nomenclature, where the actual date of publication is used to determine which author has priority for naming a species or other taxon. --EncycloPetey (talk) 02:23, 15 July 2018 (UTC)

publication date vs inception (for software)[edit]

MichaelSchoenitzer
dachary
Metamorforme42
Ash Crow
OdileB
John Samuel
Jasc PL
Daniel Mietchen


Notified participants of WikiProject Informatics/Software

I noted that Wikidata:WikiProject Informatics/Software/Properties suggests using both properties. Wouldn't it be better if we decided to use only one of them? Personally, I think publication date (P577) is better because it allows specifying the version type.--Malore (talk) 22:56, 14 July 2018 (UTC)

I think there is a difference between the two. inception (P571) is when work started on it, while publication date (P577) is when it was published to the open. These two dates can differ by weeks or even months. I also think I already saw items where this was the case and modeled like this. -- MichaelSchoenitzer (talk) 23:24, 14 July 2018 (UTC)
@MichaelSchoenitzer: You're right, I hadn't thought about it. However, in the case of software products, most of the time what we mean is the publication date, not when the first or the last line of code was written. And currently the use of "inception" is not restricted to these cases, but it's considered equivalent to "publication date".--Malore (talk) 23:44, 14 July 2018 (UTC)
Is the inception date of a building the date when work started on it, or the date when it was completed? Software can be published before it's completed, e.g., on Github. Saying that software is "completed" at any particular point in time would be difficult in some cases. Ghouston (talk) 01:41, 15 July 2018 (UTC)
I suppose you could use the inception date for when work started on it, and various publication dates for different versions (if the software has version releases.) Ghouston (talk) 01:43, 15 July 2018 (UTC)

When sourcing is most relevant[edit]

When sources differ in opinion and both cannot be correct, it is a matter of sourcing the right answer to ensure that Wikidata has the correct information. I have been told that information I added based on en.wp should be removed because it is "Falschinformationen". Obviously the German Wikipedia thinks differently from en.wp. Based on German information, information is completed, but we now have a halfway house of correct and incorrect information.

In my opinion, this is exactly the situation where sources are vital. How to deal with this and how to reconcile differences. Thanks, GerardM (talk) 06:43, 15 July 2018 (UTC)

I'd start by referencing the original sources, instead of the Wikipedia articles. If the sources conflict, there are some properties that can be used like statement disputed by (P1310). Ghouston (talk) 06:55, 15 July 2018 (UTC)
What you describe is what I could come up with. What I am doing is adding "administrative territorial entities" for Africa. I am bold; relate information based on available items in Wikidata (yes, I do use Arabic, Cebuan, Italian links).
Adding sources to Wikidata like in this instance does NOT solve the issue I raise. The issue is how to curate multiple sources, particularly Wikipedias that disagree on factually singular information. That is the issue, not what I can do, I am aware what I could do. Thanks, GerardM (talk) 08:17, 15 July 2018 (UTC)
We can look at the question of whether Tumana district exists in Gambia, or not. It's mentioned at [4] as a constituency in the Upper River Region, as of Jan 2018. So it doesn't seem to be fake information. Perhaps there's a difference between electoral constituencies and administrative districts, more information is needed. That article also mentions Foday N. M. Drammeh, who is also named as the winner of Tumana at en:List of NAMs elected in the Gambian parliamentary election, 2017. Ghouston (talk) 10:09, 15 July 2018 (UTC)
There's a table of 2017 National Assembly election results (cached) at [5]; I think items could be created for all those constituencies and winning politicians, if not already present. Ghouston (talk) 10:21, 15 July 2018 (UTC)

Property for "known for"

In infoboxes of a person there is usually the "known for" parameter. What would be the most suitable property to collect this information? I have discarded using notable work (P800), because these are not always "creations or works", and also significant event (P793), because its values tend to form a chronology of the individual. Each case is different: a person can be known for a work, but also for a discovery or for a malicious act. Any suggestions will be welcome. Thanks, Amadalvarez (talk) 09:52, 15 July 2018 (UTC)

@Amadalvarez: The best you can do is to suggest a property with that label and provide some examples. Maybe we need it!--Micru (talk) 13:15, 16 July 2018 (UTC)
notable work (P800) has "known for" as alias. I would suggest to swap that alias with the label. --Pasleim (talk) 13:24, 16 July 2018 (UTC)
I don't agree with that. P800 is mainly intended for works not for general accomplishments.--Micru (talk) 14:20, 16 July 2018 (UTC)
Agree with Micru. Looking at eg Albert Einstein (Q937) we have notable work (P800) = special relativity (Q11455), general relativity (Q11452) rather than specific papers. However the ontology of the items permitted by the "allowed values" constraint on P800 might need a look: at present scientific theories appear not to be included under the classes work (Q386724) or rule (Q1151067) to which the property is (currently) restricted.
On the other hand, the values for Charles Darwin (Q1035) are all printed works. But then maybe those are the best way to sum up Darwin's contribution. Jheald (talk) 14:33, 16 July 2018 (UTC)
@Pasleim: P800 only coincides with "known for" for writers or other cultural skills. For instance, Lee Harvey Oswald (Q48745) is known to be the killer of John F. Kennedy (Q9696); Neil Armstrong (Q1615) is known to be the first person to walk on the moon. In both examples it is difficult to fit these actions as a P800. If you and @Micru, Jheald: will support me, I'll prepare a property suggestion. Thanks everybody. Amadalvarez (talk) 16:36, 16 July 2018 (UTC)

Rank / external tools

Is there a bot which replaces a statement with preferred or deprecated rank, like how placeholder for <somevalue> (Q53569537) is replaced automatically? Jc86035 (talk) 10:37, 15 July 2018 (UTC)

Another cebwiki flood?

After a stop of new items with cebwiki sitelinks, GZWDer started creating more of them. Despite their earlier plans to improve these items (which is what we were told last time), nothing has happened AFAIK. What do you think? Should this continue or be stopped? In cebwiki and/or svwiki, if I recall correctly, these bot stubs stopped being created as there were too many quality issues.
--- Jura 17:37, 15 July 2018 (UTC)

Lsjbot's creation of cebwiki articles is completed and the articles in cebwiki are still being maintained. Note that I cannot improve any created items simultaneously with creating new ones until phab:T198396 is fixed; for now I am planning to complete all creation and then improve them. Other users (e.g. NikkiBot) may like to help improve them before the creation is completed.--GZWDer (talk) 17:44, 15 July 2018 (UTC)
Please do not import elevation above sea level (P2044) for mountains/hills anymore - in 80% of the cases these are totally bogus! For those few I could edit manually, I set the imported value (which often was without reference) to deprecated, see e.g. [6]. There are also still many imported items which don't have the coordinates set; hopefully these are now addressed by NikkiBot by matching them with the original GNS database entry. Ahoerstemeier (talk) 18:27, 15 July 2018 (UTC)
GZWDer, I strongly request that you stop creating such nonsense duplicates; duplicates should be merged first, not just marked. --Liuxinyu970226 (talk) 04:57, 16 July 2018 (UTC)
As I have said, it's much harder to find duplicates in advance before they are imported to Wikidata.--GZWDer (talk) 06:37, 16 July 2018 (UTC)
@GZWDer: "much harder"? Even {{CC-BY-SA 4.0}} and {{Cc-by-sa 4.0}}? --Liuxinyu970226 (talk) 10:01, 16 July 2018 (UTC)
Simple facts are not copyrightable.--GZWDer (talk) 10:05, 16 July 2018 (UTC)
This example is not copyright-related. Can you explain why, if a single wiki has both templates, they can't be merged? --Liuxinyu970226 (talk) 10:10, 16 July 2018 (UTC)
It is much easier to find such duplicates if the data is stored in a structured way.--GZWDer (talk) 10:18, 16 July 2018 (UTC)
  • As you seem unable to fix them now anyway, can you avoid creating them until this can be done and you actually plan to do it within a reasonable timeframe (e.g. 1 month)? One could get the impression that you create a bunch of duplicates every 6 months and then expect other people to clean them up.
    Creating these items also makes it difficult to directly import high-quality datasets.
    --- Jura 07:26, 16 July 2018 (UTC)
There are already 33,000+ items which are either a 100% duplicate, or at least a semi-duplicate due to the administrative unit/populated place separation in GeoNames. Nobody seems to care about getting [7] back to normal numbers. And these are only the duplicates which can be noticed due to the identifier; there are even more which need to be checked by name and coordinates. And while it may be easier to check for duplicates after creating the items, if this is not done even for the trivial cases then it is better not to import. Ahoerstemeier (talk) 07:57, 16 July 2018 (UTC)
The creation will be completed in ~20 days and no further items will be created.--GZWDer (talk) 08:38, 16 July 2018 (UTC)
There doesn't seem to be any support for this.
--- Jura 08:39, 16 July 2018 (UTC)
Again, I don't think it's easier to deal with duplicates before they are imported to Wikidata.--GZWDer (talk) 08:40, 16 July 2018 (UTC)
They are already here, so please deal with them.
--- Jura 08:47, 16 July 2018 (UTC)
Comment (Personal view.) I have no problem with cebwiki imports (or at least a smaller one than with research papers). Still, a mass of new items with only one sitelink and only one statement, which is just an external identifier, is too little. Matěj Suchánek (talk) 08:48, 16 July 2018 (UTC)
1. Leaving duplicates unlinked in cebwiki does not encourage anyone to fix them. 2. I plan to add more statements to these items, but I cannot do it currently due to phab:T198396.--GZWDer (talk) 08:49, 16 July 2018 (UTC)
The addition of these low-quality items discourages the addition of high-quality datasets in the same field.
--- Jura 08:55, 16 July 2018 (UTC)
NikkiBot is importing GEOnet Names Server data to many items.--GZWDer (talk) 09:09, 16 July 2018 (UTC)
I asked Nikki to comment here. I'm sure a direct import would be easier for her.
--- Jura 09:50, 16 July 2018 (UTC)
Strong oppose to continuing to create items for cebwiki. GZWDer, you're just polluting Wikidata; just stop that work. You're free to use your bot in other areas. --Liuxinyu970226 (talk) 10:05, 16 July 2018 (UTC)
I have to agree that avoiding duplicates is very tricky. How is a bot supposed to work it out? The GeoNames IDs we already had are full of mistakes. The old-style interwiki links Lsjbot added to the pages were also full of mistakes. There are lots of very similarly named things which are near each other. While the duplicate items are annoying, I think merging items which are the same is much easier than splitting items where the sitelinks were added to the wrong item. So in my opinion, creating new items is the best of a bad set of options.
As was pointed out, I'm currently running a bot which tries to improve these items, although my priorities are to reduce the number of missing statement violations and add references for the original source GeoNames used. For me, the easiest items to work with are the items with only a cebwiki sitelink and no statements other than the GeoNames ID, because then I know the sitelink hasn't been added to the wrong item and I don't have to worry about conflicting statements, so I don't have a problem with what GZWDer is currently doing.
- Nikki (talk) 10:47, 16 July 2018 (UTC)
Cleaning up the mess takes time, but it's easier to do it at Wikidata than on cebwiki as there are more tools to help (constraint violation report, projectmerge, SPARQL, etc.)--GZWDer (talk) 10:55, 16 July 2018 (UTC)



Proposal A

Here are my two proposals:
  1. Remove Phase 1 and Phase 2 support from cebwiki, unless and until they actually do something to stop Lsj-isms; and/or
  2. Add an exception to WD:N/EC so that we don't allow creating place items with only a GeoNames property.

--Liuxinyu970226 (talk) 10:05, 16 July 2018 (UTC)

The items will not only have a GeoNames property, but also country (P17), instance of (P31), located in the administrative territorial entity (P131) and coordinate location (P625) etc. Also, Lsjbot does not create any new articles anymore, so no more issues will appear.--GZWDer (talk) 10:20, 16 July 2018 (UTC)

@GZWDer: In support of what many people have said above, saying that such duplicates are much easier to deal with once they are on Wikidata is plainly not true, if you have no method of dealing with them.

In my experience, dealing with duplicates is a very slow, tiresome, manual process because even having done a merge the resulting merged item then always seems to need further manual investigation and manual clean-up. I am therefore very strongly of the view that creation of duplicate items should be avoided if at all possible.

At the very least, will you not run them through OpenRefine first, to try to first find an existing matching item, rather than just bulldozing ahead with this item creation? Jheald (talk) 10:56, 16 July 2018 (UTC)

If you have no method of dealing with them on Wikidata, you have no method of dealing with them on cebwiki either. Information in cebwiki is not structured.--GZWDer (talk) 10:59, 16 July 2018 (UTC)
Here's another reason that you should really stop: your actions result in "The SPARQL query resulted in an error." appearing on image (P18) of every item, *at least five times per item*. --Liuxinyu970226 (talk) 11:27, 16 July 2018 (UTC)
Evidence please. Matěj Suchánek (talk) 11:31, 16 July 2018 (UTC)
@Matěj Suchánek, GZWDer: #Why these format contraint violations in references?. --Liuxinyu970226 (talk) 13:55, 17 July 2018 (UTC)
I wanted evidence of direct influence. If your accusations prove invalid, I won't let it fade away. Matěj Suchánek (talk) 14:24, 17 July 2018 (UTC)
Support as nom. --Liuxinyu970226 (talk) 11:29, 16 July 2018 (UTC)
Strong oppose. Nonsense; it goes against the goals of Wikidata. Matěj Suchánek (talk) 11:31, 16 July 2018 (UTC)
@Matěj Suchánek: So you support spam by Lsj? --Liuxinyu970226 (talk) 11:36, 16 July 2018 (UTC)
I don't (and actually I see nothing like that). What happens on cebwiki isn't our business, is it? Matěj Suchánek (talk) 11:38, 16 July 2018 (UTC)
  • Comment There seems to be some activity at ceb:Espesyal:Bag-ongGiusab. Not sure if all users are doing maintenance for other projects. If yes, maybe a question for Meta.
    --- Jura 12:23, 16 July 2018 (UTC)



Proposal B: moratorium for new items

  • Proposal B: Based on the planned activity by Nikki, I suggest we only create new items when they are referenced with an identifier other than GeoNames (at item creation).
    If there is interest in dealing with the current backlog and this can actually be done, we might want to review if more items for the 600,000 bot pages should be created.
    Items for manual articles at cebwiki can always be created or articles connected to existing items (if there are any contributors there).
    --- Jura 12:23, 16 July 2018 (UTC)

@Jura1, GZWDer, Liuxinyu970226, Ahoerstemeier, Matěj Suchánek: @Nikki: See my comments on WD:AN. TL;DR Blocked for three days pending a thorough, contiguous explanation of further actions. Mahir256 (talk) 13:36, 16 July 2018 (UTC)

I have filed a new request at Wikidata:Requests for permissions/Bot/GZWDer (flood) 2. Please comment on this.--GZWDer (talk) 13:58, 16 July 2018 (UTC)
Neutral. While this matches my second point above (excluding any creation of items that only have a GeoNames property), I wouldn't readily support this kind of solution, as there are still no good potential sanctions for cebwiki, which are required to prevent more wikis from such behavior. --Liuxinyu970226 (talk) 14:38, 16 July 2018 (UTC)
I am getting concerned about your mental health: "Sanctions to cebwiki." There's nothing like sanctions to a wiki, no. They are not our business. If you don't like it, don't care. Matěj Suchánek (talk) 17:23, 16 July 2018 (UTC)
You should copy-paste the same comment to @ArthurPSmith: to explain why that user marked some Victoria Park items as possibly invalid entry requiring further references (Q35779580). --Liuxinyu970226 (talk) 23:44, 16 July 2018 (UTC)
I don't understand how it is related. Maybe because he believes it is a possibly invalid entry requiring further references? Matěj Suchánek (talk) 07:06, 17 July 2018 (UTC)
I'm not sure I used the right solution there, but those were cases where I was trying to find a location in Australia and Wikidata had a huge number of them, almost all imported from cebwiki; when I looked up (a few of?) the coordinates they specified in Google Maps, there seemed to be no park and nothing matching that label at the given locations. So I marked the ones that looked wrong to me, as I recall. ArthurPSmith (talk) 13:53, 17 July 2018 (UTC)

Property to declare the language of the reference url?

Do we have a property to state the language of the article/page/URL of a reference? I know one could create an item for the reference, but that is not always very straightforward. Could language used (P2936) be a candidate for this? Could you ping me if you answer directly to me on this thread? --35.192.51.123 16:01, 16 July 2018 (UTC)

@35.192.51.123: I think language of work or name (P407) is what you're looking for. Bovlb (talk) 17:24, 16 July 2018 (UTC)
@35.192.51.123:, @Bovlb: so I'm adding "language of the reference item" to labels. --Ogoorcs (talk) 19:23, 16 July 2018 (UTC)

Wikidata weekly summary #321

Country: Northern Ireland vs United Kingdom

  1. The country of Derry [8] is Northern Ireland.
  2. For example, the country of Belfast [9] is United Kingdom.
  3. I thought the right value of the country field is UK, and I tried to change the country of Derry to UK, but an anonymous user reverted it to NI again and again. I think Northern Ireland's cities must have identical values for the sake of standardization (and Northern Ireland is a part of the UK). Dhārmikatva (talk) 00:19, 17 July 2018 (UTC)
  • This anonymous user [10] is doing something wrong, I think. Dhārmikatva (talk) 00:23, 17 July 2018 (UTC)
Interpreting a property in different ways is not how Wikidata works.--GZWDer (talk) 01:17, 17 July 2018 (UTC)
If you take a liberal interpretation of what countries can be used with country (P17), then some places are parts of more than one country. There are also 250 items with country (P17) Aruba (Q21203), so it's not just a UK issue. Ghouston (talk) 01:39, 17 July 2018 (UTC)
Sadly, Flanders (Q234) doesn't seem to count as a country. Otherwise, Baarle (Q797512) could be in four different countries. Ghouston (talk) 02:16, 17 July 2018 (UTC)

Items about Wikimedians

I have witnessed two instances of a group of items about Wikimedians being nominated for deletion due to lack of notability. The first was in September, in which Jarek argued for lack of notability but MisterSynergy opted to keep due to the presence of Commons creator templates. (At the time—when I wasn't yet a sysop—I tended to agree with deleting all but those of @Moheen Reeyad, Emijrp: since they seemed to stand alone better than the others.) The second was several days ago, in which Bodhisattwa argued for lack of notability and which Maarten deleted on such a basis—not the only major difference between the two acting admins in these instances. As I had begun taking the liberty of pinging frequent, accustomed editors whose items were up for deletion, I didn't think it appropriate to take any action in this recent set of requests until the affected users were able to opine about them. Now there is a new, isolated instance which is only prompting yet another long notability discussion to occur (@Mazuritz, Ash Crow, Infovarius:, as those involved in this instance, plus @Esh77: as I ran across your item some time ago).

It is clear that there is something unresolved about WD:N as it pertains to Wikimedians making items about themselves. Of course there may be something I don't recall in P4174's proposal, among all which related to privacy issues and whatnot, on the topic of how that property applied to notability criteria, but what logically comes to mind about this is as follows. On Wikipedia we routinely RfD clear self-promo articles by amateur rappers, actors, SEO folks, and the like, all of whom very likely think they are sufficiently notable for Wikidata, about themselves—and often do the same to the appropriate Wikidata items much to their chagrin—so ideally items by Wikimedians who may very likely think they are sufficiently notable for Wikidata about Wikimedians should nearly equally share the same fate. A strict interpretation of WD:N under the second and third points tends to preclude their deletion, however, but I can certainly understand the mindset of those who find it odd that contributors to this knowledge base find it in them to add themselves to it (I seem to recall @Liuxinyu970226: complaining about @Shizhao: on this front). Some may find it unfair that those most familiar with those workings of Wikidata available to them, who otherwise would most likely fail the notability criteria had others tried making items about them, can "weasel" their way to meeting said criteria by the aforementioned strict interpretation, whereas others who try to do the same—and who could very well meet the notability criteria had someone more thoroughly familiar with Wikidata created their item—get blocked and their items nuked. Let me make clear that I am not condoning the presence of self-promo/spam with this last statement.

Thoughts on the presence of items created by Wikimedians about themselves? Mahir256 (talk) 04:12, 17 July 2018 (UTC)

I think the items should be permitted for consistency, since authors of notable works can have items, and Wikimedians are authors of Wikipedia. Ghouston (talk) 04:30, 17 July 2018 (UTC)
When I notice Wikimedians creating items about themselves I delete them on sight as non-notable. If I notice items being created about other Wikimedians, I check if they're really notable and usually delete them. Far too many privacy and conflict-of-interest issues. I deleted items about some people more than once and I even deleted an item about myself. Multichill (talk) 08:43, 17 July 2018 (UTC)
Then it's not so much a notability problem but a biography of living persons issue, with information that isn't properly referenced. There are some items like Jimmy Wales (Q181) and Magnus Manske (Q13520818) for notable Wikimedians, although even these have unreferenced statements, e.g., Jimmy Wales' spouses and Magnus Manske's date of birth and education. Ghouston (talk) 09:37, 17 July 2018 (UTC)
  • If we don't have an item, we're not going to be able to write a depicts (P180) statement on CommonsData for any of the photos of this person. Is that a problem? Jheald (talk) 10:02, 17 July 2018 (UTC)
There was a proposal to have some hybrid data type for author fields on Commons data that allows us to specify people either by item ID, username or URL. I assume we will use something similar for depicts (P180). For me the issue with items for not-so-notable Wikimedians is that in such items most or all of the information comes from the subject with no proper references. If there are external identifiers they are self-created. On the other hand I do not want to punish otherwise notable people who might want to edit Wikipedia, so the threshold should be: is there enough non-self-referencing information out there about the individual? We should delete non-referenced and self-referenced info from such items and see how much is left. --Jarekt (talk) 12:13, 17 July 2018 (UTC)
  • In a previous discussion, it was suggested that we can create items for authors even if practically nothing is known about them, e.g., the thousands of authors of the CERN Higgs Boson article. These would fulfil a "structural need" in case there was another article with the same author. The same could just as well apply to Wikimedians. Ghouston (talk) 12:29, 17 July 2018 (UTC)
Maybe introduce Template:Connected contributor (Q7646869) here? --Liuxinyu970226 (talk) 14:00, 17 July 2018 (UTC)
When asking for this discussion, I was thinking of a loophole in WD:N: many items about French Wikimedians were created last year by @Deansfa: (@Envlh: warned me that one had been created about me) on the grounds that there is a Commons category for these contributors. I wanted to delete all those items but found that I could not, as they are neither excluded by the current wording of the 1st criterion of WD:N, nor listed in WD:Notability/Exclusion criteria. If we are to delete all those items, we should explicitly exclude them in one of those two pages to avoid their being recreated afterwards, either by someone acting in good faith like Deansfa, or by someone with bad intentions (to be clear, I fear that such items can be used for targeted harassment far more than I fear self-promotion from contributors.) -Ash Crow (talk) 15:12, 17 July 2018 (UTC)
  • It's useful to have the items where there are categories of photos of the specific Wikimedian on Commons - although I agree that they shouldn't be created by the Wikimedian themselves. Thanks. Mike Peel (talk) 15:13, 17 July 2018 (UTC)
  • Many long-term Wikimedians have a category on Commons and some even have Creator pages, but that does not make them notable. The fact that such pages exist might make it harder to clean up but should not be a reason to justify creation of items on Wikidata. Maybe we need to clarify this on WD:N. As for authors of articles, we have author name string (P2093) for that, with 17M uses. Maybe we should replace some items with no other info than a name with those. --Jarekt (talk) 17:58, 17 July 2018 (UTC)
  • @Multichill, Ash Crow, Infovarius:, The item Q40676142 Benoit Soubeyran was deleted today. For me, it's not about self-promotion; it's just a good way to use the tools and templates linked with Wikidata on Commons and on my Wikipedia user page. It's not fair, because there are a lot of items about Wikimedians: Benoît Prieur, Moheen Reyad,... @Agamitsudo, Moheen Reeyad: Of course, I'm not against these and I think they clearly have a right to exist. But if you want to delete the Wikidata items of non-notable Wikimedians, you must delete all of these items and not only one! I would prefer a broad discussion before deciding whether we must keep or delete all these items, and what the Wikimedians themselves have the right to edit or not. --Mazuritz (talk) 20:14, 17 July 2018 (UTC)
  • I found an interesting issue with Wikimedia username (P4174) in its proposal, Wikidata:Property proposal/Wikimedia user name. User:Innocent bystander opposed it, and felt that, if it was created, a reliable source should be required. I think I'm qualified for a Wikidata item because I am a low-level elected government official, and this can be verified on state and town web sites. But I do not have a famous blog, and none of the handful of reliable sources about me mention any of my Wikimedia user names. Therefore, if an item ever got created about me, no one would be able to add Wikimedia username (P4174) to my item. If I were to make any on-wiki statements about my item, they could only be used to help decide whether to delete it, not to decide that the Wikimedia user ID belongs to the person described by the item. Indeed, it is rare for an interview or third-party biographical piece about a person to mention a Wikimedia user ID, so virtually the only people eligible to have Wikimedia username (P4174) in their item are those who are reliably associated with a website or blog, and self-publish it there. Gerry Ashton (talk) 21:13, 17 July 2018 (UTC)

Why these format contraint violations in references?

Last year, I inserted some economists via QuickStatements. While, e.g., in Casey Mulligan (Q41805010) the references look fine, in Thomas Crossley (Q41799970) (inserted in the same batch) the reference for instance-of shows an error near the source ID ("Issues - format constraint - The SPARQL query resulted in an error."). Similarly, the reference for the occupation shows a warning symbol near the title ("Potential issues - format constraint - The SPARQL query resulted in an error."). I cannot recognize a violation of the linked format constraints and have no idea what triggers the messages, but I would be happy to adapt my statement generation if somebody could explain to me what's wrong. Jneubert (talk) 11:27, 17 July 2018 (UTC)

As far as I know, this is a problem with the Query Service or the constraint gadget, and likely a temporary one. There is nothing wrong with your data. @Lucas Werkmeister (WMDE) can probably explain this in more detail. —MisterSynergy (talk) 11:43, 17 July 2018 (UTC)
I’m not sure what’s going on, but I filed phabricator:T199787. --Lucas Werkmeister (WMDE) (talk) 12:20, 17 July 2018 (UTC)
In any event, just to clarify – yes, “the SPARQL query resulted in an error” is never an error with your data (I think). I suppose we shouldn’t show it like a regular constraint violation :/ (edit: filed phabricator:T199788 for that) --Lucas Werkmeister (WMDE) (talk) 12:20, 17 July 2018 (UTC)
@Lucas Werkmeister (WMDE): A second issue is: why are so many of these errors happening now? I'm getting them all the time. Presumably, they indicate WDQS queries failing in some way. Over the weekend I was also seeing a lot of queries that typically run in 30 to 40 seconds now for some reason timing out. Is this related? Is it possible that e.g. some of the memory management has got really messed up on one or more of the WDQS servers? Is there a dashboard that would make it visible if one or more of the WDQS servers were suddenly underperforming? Jheald (talk) 13:07, 17 July 2018 (UTC)
@Jheald: there is a Grafana dashboard for WDQS, but note that WikibaseQualityConstraints uses an internal endpoint, so problems on the public query service shouldn’t directly affect it. --Lucas Werkmeister (WMDE) (talk) 13:38, 17 July 2018 (UTC)
I don't believe that this happened before @GZWDer:'s cebwiki importing, so GZWDer, this is really one of the reasons that you should stop it! --Liuxinyu970226 (talk) 13:57, 17 July 2018 (UTC)
In July I have created ~20000 items. This is much less than the ~500000 items in February.--GZWDer (talk) 14:08, 17 July 2018 (UTC)
This happened on all items, just all items. --Liuxinyu970226 (talk) 05:37, 18 July 2018 (UTC)
Thank you, MisterSynergy and Lucas Werkmeister (WMDE). Perhaps it would make sense to prefix messages like these with "Internal system error: ", or even filter out constraint violation messages completely when a system error occurred. Since the end user can't do anything but puzzle about them, they would probably be better kept in some log file. Jneubert (talk) 15:46, 17 July 2018 (UTC)

"20. century" ≠ 2000–2099[edit]

For those who are affected by this, I have proposed at Wikidata:Bot requests#Normalize dates with low precision for a bot to automatically convert all dates with low precision to the range that Wikibase interprets them as being (i.e. to convert a date which currently says "20. century", but is actually +2000-00-00T00:00:00/7, to +1951-00-00T00:00:00/7). This would fix interpretation of date values by external tools (e.g. people stated as having a date of birth from 2000–2099 and date of death in 1996). I believe it is technically feasible with pywikibot and similar tools, although I would not be able to do it myself and I wouldn't do it without consensus for it. Jc86035 (talk) 11:39, 17 July 2018 (UTC)
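A minimal sketch of the conversion described above, in Python, limited to century precision (7). It assumes that a century-precision value is displayed as the century obtained by rounding the year up (so 2000 shows as "20. century"), and it uses the `time/precision` notation from the comment above rather than any real Wikibase or pywikibot API; the function names are made up for illustration.

```python
def displayed_century(year: int) -> int:
    """Century label assumed to be shown for a century-precision year.

    Assumption: the year is rounded up, so 2000 -> 20 and 2001 -> 21.
    """
    return (year + 99) // 100


def normalize_century_year(year: int) -> int:
    """Move the year to the middle of the century it is displayed as."""
    century = displayed_century(year)
    return (century - 1) * 100 + 51


def normalize_low_precision(value: str) -> str:
    """Normalize a 'time/precision' string such as '+2000-00-00T00:00:00/7'.

    Only positive years with century precision (7) are changed; every
    other value is returned untouched.
    """
    time, _, precision = value.rpartition("/")
    if precision != "7" or not time.startswith("+"):
        return value
    year_text, tail = time[1:].split("-", 1)
    new_year = normalize_century_year(int(year_text))
    return f"+{new_year:04d}-{tail}/{precision}"


if __name__ == "__main__":
    print(normalize_low_precision("+2000-00-00T00:00:00/7"))
```

A tool that reads only the year would then see a value inside the displayed century instead of the first year of the next one. Other low precisions (decade, millennium) would need their own rules and are left out here.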

Property for relation Church parish and administrative parish

I think I have seen a property for connecting a Church parish with a corresponding "administrative" parish but can't find it... Can anyone help? - Salgo60 (talk) 15:43, 17 July 2018 (UTC)

Would it be located in the administrative territorial entity (P131)? --Tagishsimon (talk) 18:23, 17 July 2018 (UTC)
No sorry - Salgo60 (talk) 19:09, 17 July 2018 (UTC)

With regard to religious entities I'm only aware of diocese (P708). strakhov (talk) 18:37, 17 July 2018 (UTC)

A query (tinyurl.com/y7mc5r73) finds only 4 cases of such items being related: two by located in the administrative territorial entity (P131), one by part of (P361) (which probably should be P131 instead), and one by followed by (P156).
In the opposite direction (tinyurl.com/y6wkyvts) there is only one such link, a has part (P527).
separated from (P807) might be conceivable, though it might normally represent a spatial carve-out, rather than a carve-out of roles.
named after (P138) might also be conceivable, though more likely would be considered named after a village or area.
territory overlaps (P3179) might also link the two, if one does not perfectly nest inside the other. Jheald (talk) 18:41, 17 July 2018 (UTC)
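The linked queries are not reproduced in this thread. As a rough sketch of the kind of query involved, the Python helper below builds a SPARQL string that lists every direct statement from an item of one class to an item of another. The helper name and the `Q_CHURCH_PARISH` placeholder are made up for illustration (the placeholder must be replaced by a real item ID before the query can run); only socken (Q1523821) comes from the discussion.

```python
def build_link_query(source_class: str, target_class: str) -> str:
    """Build a SPARQL query listing direct statements between two classes.

    Both arguments are item IDs; prefixes are the ones predefined on the
    Wikidata Query Service (wd:, wdt:, wikibase:).
    """
    return f"""
SELECT ?source ?prop ?target WHERE {{
  ?source wdt:P31 wd:{source_class} .
  ?target wdt:P31 wd:{target_class} .
  ?source ?direct ?target .
  ?prop wikibase:directClaim ?direct .
}}
""".strip()


if __name__ == "__main__":
    # Q_CHURCH_PARISH is a placeholder, not a real item ID.
    print(build_link_query("Q1523821", "Q_CHURCH_PARISH"))
```

Swapping the two arguments gives the query for the opposite direction.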
Thanks for trying - maybe I just saw it in proposed properties.
These are different WD items that should be connected using this property.
The basic idea is that in Sweden we had a civil or administrative parish, socken (Q1523821), and in parallel we had parishes of the Church of Sweden. As those administrative units are related, this property should make this connection...
Example
- Salgo60 (talk) 19:09, 17 July 2018 (UTC)
Such a property could be useful for Norwegian parishes too. Breg Pmt (talk) 21:29, 17 July 2018 (UTC)

Documentation of and history of interactive user interface

If I enter an item number in the search box, like Q1, and push enter, I am taken to an interactive user interface where I can view and edit the item.

  1. Does this user interface have an official name?
  2. Is there documentation for this user interface?
  3. Is there any readable description of the development history, that can be understood by someone who was not intimately involved with the development?

Jc3s5h (talk) 19:12, 17 July 2018 (UTC)

It's the Wikibase Client. Not sure how much documentation there is on the current interface or history, but I'm sure it's findable somewhere! ArthurPSmith (talk) 19:19, 17 July 2018 (UTC)
The description at Wikibase Client makes it sound like it is an extension that allows the use of Lua or parser functions to extract data from a repository. It seems nothing like the interactive user interface. Jc3s5h (talk) 19:29, 17 July 2018 (UTC)
Hmm, I guess you're right. Wikidata is sort of both the repository and the client - there's some more documentation here. I'm sure somebody else knows a lot more. ArthurPSmith (talk) 19:49, 17 July 2018 (UTC)
You may want to look at Wikidata:Development plan/Done and Wikidata:UI redesign input, but those are relatively new; digging in the history and archives may lead to more. ArthurPSmith (talk) 02:46, 18 July 2018 (UTC)

Duplicate item and renaming[edit]

It seems that an unusual sequence of renamings led to the creation of a duplicate, which I thankfully caught early: https://www.wikidata.org/w/index.php?title=Q55597385&redirect=no . It’s about a French scholar who had an article before the Wikidata era. The initial item was created with the sitelink fr:Jean-Paul Delahaye, so far so good. Then the article was renamed fr:Jean-Paul Delahaye (mathématicien) because a footballer has the same name, and the sitelink was moved accordingly. Then the trouble starts: as the footballer appears to be far less well known than the mathematician, the article was renamed back by Thibaut120094 (without leaving a redirect) and the sitelink was not updated. GZWDer then created a duplicate item, as the article had become orphaned. What went wrong? How can we prevent or detect that? It seems to me that there is a redirect with an item pointing to this orphaned article. Is that a good heuristic? author  TomT0m / talk page 20:03, 17 July 2018 (UTC)
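
For illustration only, the heuristic asked about above (a redirect that still carries an item while its target article has none) might be sketched as below. The two dictionaries are hypothetical stand-ins for data that would have to be pulled from the wiki and from Wikidata, the function name is made up, and the item ID is a placeholder rather than the real one.

```python
def find_orphan_candidates(sitelinks, redirects):
    """Return (target, redirect, item) triples worth a manual check.

    sitelinks: dict mapping page title -> item ID (hypothetical input)
    redirects: dict mapping redirect title -> target title (hypothetical input)
    """
    candidates = []
    for source, target in redirects.items():
        item = sitelinks.get(source)
        # The redirect has an item, but the article it points to has none.
        if item is not None and target not in sitelinks:
            candidates.append((target, source, item))
    return sorted(candidates)


# Placeholder data modelled on the case described above; "Q100" is not a real ID.
sitelinks = {"Jean-Paul Delahaye (mathématicien)": "Q100"}
redirects = {"Jean-Paul Delahaye (mathématicien)": "Jean-Paul Delahaye"}
print(find_orphan_candidates(sitelinks, redirects))
```

Each hit would only be a candidate: the sitelink may sit on the redirect on purpose, so the result would need human review before any item is changed.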

Code editor for modules[edit]

When I edit modules in the module namespace at Danish Wikipedia, the edit window will change to a code editor with syntax highlighting and auto indent. That doesn't happen here at Wikidata where module editing is done in a normal text editor window like in all other namespaces. Could the code editor for modules also be enabled here please? --Dipsacus fullonum (talk) 20:14, 17 July 2018 (UTC)

You need to enable "Enable enhanced editing toolbar" in your preferences, possibly globally. Matěj Suchánek (talk) 07:52, 18 July 2018 (UTC)
That helped. Thank you very much, --Dipsacus fullonum (talk) 08:58, 18 July 2018 (UTC)

Dispute resolution: Sigmund Freud and Nobel Prize nominations[edit]

Sigmund Freud was nominated for the Nobel Prize multiple times, according to the website of the Nobel Prize organisation. https://www.nobelprize.org/nomination/archive/show_people.php?id=3209

https://www.wikidata.org/w/index.php?title=Q9215&oldid=707381639 showed these nominations, well-cited, until User:Pierrette13 came along and deleted them all. Asked why, Pierrette13 explains on User talk:Dorades: "Regarding his nominations for the Nobel Prize, please see https://www.nobelprize.org/nomination/archive/show_people.php?id=3209 (the official site of the Nobel Prize) -- being nominated doesn't mean he had to be close to win it. If". The data removed was in nominated for (P1411).

Asked about the Nobel nomination deletions again, Pierrette13 says (according to Google Translate) "As for the "almost Nobel", obviously I will not go back to reinsert, it's great anything in my opinion. Can you tell me the list of others 'almost Nobel' and I will make a group protest, thanks to you".

I'm at a loss to know how to deal with a user who seems, to me, to have a thing for removing data on a whim. It's discouraging to see wanton destruction of people's work. @Pierrette13:. --Tagishsimon (talk) 22:07, 17 July 2018 (UTC)

  • My own opinion is that nominations for the Nobel are awfully easy to come by, and not notable. - Jmabel (talk) 23:52, 17 July 2018 (UTC)
  • Nominations for the Nobel Prize are well-cited and thus can be added to Wikidata. --Pasleim (talk) 00:04, 18 July 2018 (UTC)
    • I'm relatively new here, so perhaps I don't understand Wikidata's criteria. To be on Wikidata, doesn't something have to be citable and notable, not merely citable? - 04:26, 18 July 2018 (UTC)
I agree, nominations for the Nobel Prize are not easy to come by, are notable and useful to have. JerryL2017 (talk) 06:06, 18 July 2018 (UTC)
  • Have a read of en:Nobel_Prize#Nominations - nominations are from quite a restricted pool of people. Given that they can be well-cited, I'd say they're notable enough to be included here. Thanks. Mike Peel (talk) 06:36, 18 July 2018 (UTC)

P27 and UK citizenship[edit]

I've started a thread Multiple UK values - reality check pls on Property talk:P27 asking what the consensus is on people items having multiple Country of Citizenship values, for citizens of the UK who were alive before & after 1927, or before and after 1801 or 1707, when the nature and name of the state changed. Grateful for input. --Tagishsimon (talk) 22:28, 17 July 2018 (UTC)

Wikidata:Technical administrators[edit]

As we will have this user group soon (from July 23), I have drafted a policy about it. Comments welcome. For more information about this user group, see m:Creation of separate user group for editing sitewide CSS/JS.--GZWDer (talk) 22:53, 17 July 2018 (UTC)

Link to how to merge[edit]

Hello! Wikipedia administrator/Wikidata newbie here. When I add a translation to an item (Q55605143) which reveals the item to be a duplicate, I get the message "The link zhwiki:北京首都旅游集团 is already used by item Q55065197. You may remove it from Q55065197 if it does not belong there or merge the items if they are about the exact same topic." I would strongly suggest adding a link to Help:Merge to this message, as finding how to merge two items was not easy. Or, better yet, a link to Special:MergeItems with the two IDs prefilled.

Cheers! Audacity (talk) 02:47, 18 July 2018 (UTC)

Hello Dan! This would be the original template message, changing which requires modifying this list of English messages. I'd modify the message myself to at least link to Help:Merge, but I have yet to actually start keeping a clone of the Wikibase source code handy. I am sure that if you file a bug report on Phabricator, someone there will gladly oblige. Mahir256 (talk) 02:58, 18 July 2018 (UTC)
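
As a rough sketch of the prefilled link suggested above: the snippet below builds a Special:MergeItems URL for two item IDs. The query parameter names `fromid` and `toid` are an assumption about how that special page accepts its input and should be checked before relying on them; the function name and the ID check are purely illustrative.

```python
from urllib.parse import urlencode

# Assumed base URL of the merge special page on Wikidata.
MERGE_PAGE = "https://www.wikidata.org/wiki/Special:MergeItems"


def merge_items_url(from_id: str, to_id: str) -> str:
    """Build a merge link with both item IDs prefilled (parameter names assumed)."""
    for qid in (from_id, to_id):
        # Item IDs look like "Q" followed by digits, e.g. Q55605143.
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not an item ID: {qid!r}")
    return f"{MERGE_PAGE}?{urlencode({'fromid': from_id, 'toid': to_id})}"


# The two items from the message quoted above.
print(merge_items_url("Q55605143", "Q55065197"))
```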

Semi-protection[edit]

Are there any guidelines for when and for how long pages should be protected? I think this should at least be standardized, and making semi-protection a little more common would be quite helpful for editors on other wikis (e.g. the English Wikipedia) who get complaints about how this anonymously transcluded Wikidata content has been vandalized and not fixed for two weeks. Personally I would prefer allowing semi-protection for 3 years or more of any pages that normally get vandalized more than about once every two months on average, because that would at least take care of most of the random innocent vandalism by people who don't know any better. (Should there be an RfC for this?) Jc86035 (talk) 08:17, 18 July 2018 (UTC)