Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata connect
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/11.

Some fundamental questions about modeling properties for statistical data

Discussion

Dear all, before I make more requests for permissions to import more statistical data using User:WDBot, I think there are data modeling issues that need to be discussed. Let me give an example to make this clearer.

Nominal GDP (property P2131):

  • The value is usually listed in US Dollar but also in the local currency.
  • Even for the same currency there are different sources with different values - for example for values in USD there are different estimates from the IMF and the World Bank for the same point in time.

There are actually two ways to model this:

  • Use separate properties for data that are structurally different. Here this means two separate properties for GDP in USD and in local currency. If we want two additional sources, we would need a different property for each issuer.
  • Use only one property for different data.

The first option is easier to browse and is consistent with the ranking concept: the most recent value will be shown in queries. Browsing the data in the interface is also easier because, within one property, the user stays within the same data structure and methodology. The second option is more pragmatic in that we use only one property. On the other hand, it makes using and querying the data more difficult because users do not know which data they are seeing, especially in queries and, I suppose, in Wikipedia. If the user queries the data for 3 dates and on each of these dates a different value is the most recent, and thus preferred (USD from the World Bank, USD from the IMF, EUR from Eurostat, etc.), then the user gets systematically different values each time.

I think this is a fundamental data modelling issue, where we need a good discussion and consensus before we begin to import large amounts of statistical data.

How should we model this sort of statistical data? Maybe we could start a discussion in two directions: short term (using the current structure of Wikidata) and long-term consensus (proposing further changes, probably in the modelling, use and visualization of properties). Cheers! Datawiki30 (talk) 09:48, 27 October 2018 (UTC)

@Datawiki30: Nothing prevents you from building queries with several conditions: data about nominal GDP WITH currency in dollars WITH the World Bank as source, for example. If all the required data (unit, source, ...) are provided, you only need to build the corresponding query for the data you want. But this means you have to know what you want before writing the query. There is no modelling issue, just a query definition issue. Snipre (talk) 05:28, 29 October 2018 (UTC)
@Snipre: Thank you for your comment. Indeed this is possible, but the query builder does not support it. Users who do not have advanced SPARQL skills will have some difficulty doing this. What do you think about these other questions:
1. Is the ranking concept applicable to properties with two or more values for the same point in time (for example nominal GDP for 2017 from IMF, World Bank, CIA Fact Book, Eurostat...)?
2. If I want to use the most recent value for the nominal GDP with reference "stated in" = "World Bank database" on the Wikipedia page of the country (infobox), how do I do this?
3. When we add more data with different sources, we get wildly mixed values in the same property in the Wikidata GUI. What do you think about this usability issue? Datawiki30 (talk) 13:34, 29 October 2018 (UTC)
This never really was a problem. We always went with one property for a specific use case, say "nominal GDP", which is nominal GDP (P2131) in this case. I can’t remember that we ever started to create properties such as "CIA Fact Book nominal GDP". Thus, all claims regardless of their value, source, rank, qualifiers, etc. use this single property. If you want to restrict data retrieval to a particular subset of the claims, you need to add additional criteria to the data retrieval process. Either by elaborating a more specific SPARQL query that for instance only considers one particular source, or by some kind of post-filtering (e.g. in Wikipedia: loop over all P2131 statements and ignore all which do not fulfill the desired criteria). Both approaches are not overly complicated to implement.
Ranks are more a tool to manage multiple value situations (as in this case), but due to the very limited number of ranks and the many scenarios how to use them, there is not always a clear consensus how exactly to do it, and thus the outcome might be varying for different items. I strongly recommend not to model anything crucial around the usage of ranks.
The Wikidata UI is basically a tool for editors, not really an interface to retrieve data, or to "read some data". What you see there is indeed a bit unpolished, disordered, and so on, but this does not really matter. Mind that there is no intrinsic order of statements in an item, or multiple values of a statement in an item defined in the data model. Consider that they just appear in a random order (although they technically aren’t fully randomly displayed). If you want ordered data, this is something you need to do as a data user once again, after data retrieval. You can for instance order your nominal GDP data by date, by source, by nominal GDP, alphabetically by state name, or by whatever.
MisterSynergy (talk) 15:00, 29 October 2018 (UTC)
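The post-filtering approach described above (loop over all P2131 statements, drop those that do not meet the criteria, then order what remains) might look like the following minimal Python sketch. The `Statement` class and its fields are hypothetical simplifications, not the Wikibase data model or any existing module. The "IMF" source label is a placeholder, and the 2016 World Bank amount is made up purely to show the ordering; the other amounts are figures quoted elsewhere in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    """Hypothetical, simplified stand-in for one nominal GDP (P2131) statement."""
    amount: float
    unit: str             # e.g. "USD" or "EUR"
    year: int             # taken from the "point in time" qualifier
    source: str           # taken from the "stated in" reference
    rank: str = "normal"  # "preferred", "normal" or "deprecated"


def select_statements(statements: List[Statement],
                      source: Optional[str] = None,
                      unit: Optional[str] = None) -> List[Statement]:
    """Keep non-deprecated statements matching source and unit, newest first."""
    kept = [
        s for s in statements
        if s.rank != "deprecated"
        and (source is None or s.source == source)
        and (unit is None or s.unit == unit)
    ]
    # Statements have no intrinsic order, so the data user sorts after retrieval.
    return sorted(kept, key=lambda s: s.year, reverse=True)


statements = [
    Statement(2466152225.25, "USD", 2016, "IMF", "preferred"),
    Statement(1.0, "USD", 2016, "World Bank database"),  # made-up placeholder value
    Statement(2291705857.98, "EUR", 2017, "World Bank database", "preferred"),
    Statement(2582501307216.42, "USD", 2017, "World Bank database", "preferred"),
]

world_bank_usd = select_statements(statements, source="World Bank database", unit="USD")
```

Sorting by `year` is only one choice; the same pattern works for ordering by amount or by source.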
  • Multi-valued fields are tricky to use in Wikidata. I don't think it has been done yet, but I think it could be worth considering having separate properties for local currency values. The problem with many of these economics properties is that people kept creating them despite the proposer not using them and the quantity datatype not being entirely functional. It could also be that some additional properties should be created and some existing ones deleted. --- Jura 17:11, 29 October 2018 (UTC)
@Datawiki30: The ranking is not an appropriate tool to distinguish between different sources. The main role of the rank is to define whether the value is valid or not.
For the retrieval of WD data in WP using specific sources, the French WP developed Lua functions (see here in French). But when I looked at the corresponding English module, I didn't find a similar function (see [1] here). A function getValueBySource is missing from the toolbox of Lua functions. Better ask the English community about this possibility. Snipre (talk) 10:14, 30 October 2018 (UTC)
Thanks to all the users for the comments. Before I write a summary, I would like to test whether we can easily retrieve the proper GDP value in Wikipedia. @Snipre: At the Sandbox Q4115189 you can find examples for nominal GDP: 1) values in USD from the World Bank, 2) values in USD from the IMF and 3) values in the local currency (here Euro) from the World Bank (@Jura1: I know that you would prefer a new property for nominal GDP in local currency, but this covers the case that the property does not get support). The data are from France. I've set the preferred rank for the most recent value of each source. This should be done by a bot and should ensure that only the current value is retrieved in Wikipedia, automatically, even when we have new data for a new point in time. In this example we (still) don't have data from the IMF for 2017. @Snipre: Can you please try to retrieve the preferred values from each of the sources in Wikipedia separately (or can you ask someone to do this for us)?
- 2,582,501,307,216.42 United States dollar for 2017 (World Bank)
- 2,466,152,225.25 United States dollar for 2016 (IMF)
- 2,291,705,857.98 euro for 2017 (World Bank) Datawiki30 (talk) 00:03, 31 October 2018 (UTC)
@Laboramus: operates a bot that sets preferred ranks. Maybe it can be fine tuned for this. --- Jura 12:05, 31 October 2018 (UTC)
@Jura1: Thank you for the tip. The PreferentialBot already manages the ranks for the nominal GDP. @Laboramus: If we have different sources, the most recent values from these sources may refer to different points in time (see example above). If the most recent value of each source should be preferred, then we need a slight change in the bot script for the property nominal GDP. Right?
We are now trying to retrieve each value separately according to the different source in Wikipedia (see here). @RexxS: Thank you for the support. If you need some additional information just write us. Datawiki30 (talk) 14:56, 31 October 2018 (UTC)
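The rank rule asked for here (prefer the most recent value of each source, rather than the single most recent value overall) could be sketched as follows. This is only an illustration under stated assumptions: statements are modeled as plain dictionaries with hypothetical keys, grouping by unit as well as source is an added assumption, and none of this reflects how PreferentialBot is actually implemented.

```python
from collections import defaultdict


def set_preferred_per_source(statements):
    """Mark the newest statement of each (source, unit) group as preferred.

    `statements` is a list of dicts with the hypothetical keys
    'source', 'unit', 'year' and 'rank'. Deprecated statements are left alone.
    """
    groups = defaultdict(list)
    for statement in statements:
        if statement["rank"] != "deprecated":
            groups[(statement["source"], statement["unit"])].append(statement)

    for group in groups.values():
        newest_year = max(s["year"] for s in group)
        for s in group:
            s["rank"] = "preferred" if s["year"] == newest_year else "normal"
    return statements


# Placeholder statements; the source labels and years are illustrative only.
sample = [
    {"source": "World Bank database", "unit": "USD", "year": 2016, "rank": "preferred"},
    {"source": "World Bank database", "unit": "USD", "year": 2017, "rank": "normal"},
    {"source": "IMF", "unit": "USD", "year": 2016, "rank": "normal"},
    {"source": "IMF", "unit": "USD", "year": 2015, "rank": "deprecated"},
]
set_preferred_per_source(sample)
```

With this rule the 2016 IMF value stays preferred even though the World Bank already has a 2017 value, which is the situation described above.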
There are some results on display in the en-wiki thread linked above. --RexxS (talk) 18:12, 31 October 2018 (UTC)

Using an extended function we can now retrieve the preferred values from nominal GDP (P2131) using a filter on the source "stated in" and the unit (for example USD, Euro etc.). @RexxS: Thank you for your engagement. This allows the user to retrieve each of the values separately, for example in the infobox of the country. Take a look here. I think this should be sufficient to retrieve the data. There are also other properties that could use this function.

  1. Any concerns about accessing the data (with different sources and units) this way in Wikipedia?
  2. The values are in trillions and this is too long for the infobox. I think that we need some function to truncate the values in Wikipedia. For example, scaling by 10⁹, the value 2,582,501,307,216.42 USD would be truncated to 2,582 billion USD. What do you think about this?
  3. Can we use a SPARQL query to replicate the table here with source World Bank in Wikipedia?

Cheers! --Datawiki30 (talk) 13:34, 3 November 2018 (UTC)

@Datawiki30: When you indicate 2,582 billion USD, did you intend that the value displayed be rounded to an integer? If so, then I have a solution in the sandbox. See en:Module talk:WikidataIB/sandbox/testing #Scaling quantities. --RexxS (talk) 15:38, 3 November 2018 (UTC)
Thank you @RexxS: - that's fantastic! That looks very good to me - I have now updated the sandbox table on Wikipedia. I hesitated to use the automatic scaling, because it would round trillion-values "too much". Maybe the automatic approach could be something like trying to get at least 4 digits:
  • 19,390,604,000,000 -> 19,391 (current value for the United States) instead of 19
  • 3,677,439,129,776.6 -> 3,677 (current value for Germany) instead of 4
@Jura1:, @Snipre:, @MisterSynergy: and all the other users interested in this topic: Do you think that we need some improvement of the automatic approach to get at least for example 4 digits in Wikipedia? What about the other two points mentioned above? Cheers! --Datawiki30 (talk) 22:45, 3 November 2018 (UTC)
Only providing a SPARQL query here: link. It lists 2017 data for nominal GDP (P2131) with a reference stated in (P248): World Bank database (Q21540096) and sorts descending by GDP. Can be tweaked further if necessary, but generally it is not that complicated to collect such results sets with SPARQL. —MisterSynergy (talk) 22:59, 3 November 2018 (UTC)
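For readers without access to the linked query, a query of the kind described (nominal GDP statements for one year, restricted to references stated in World Bank database, sorted descending) could be built roughly as in the Python sketch below. This is a reconstruction under assumptions, not the linked query itself: the function names and the User-Agent string are made up for illustration, while the prefixes are the standard ones predefined on the Wikidata Query Service.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_gdp_query(year: int, source_qid: str = "Q21540096") -> str:
    """Build a SPARQL query for nominal GDP (P2131) values of one year and source."""
    return f"""
SELECT ?country ?countryLabel ?gdp WHERE {{
  ?country p:P2131 ?statement .
  ?statement ps:P2131 ?gdp ;
             pq:P585 ?date ;
             prov:wasDerivedFrom/pr:P248 wd:{source_qid} .
  FILTER(YEAR(?date) = {year})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY DESC(?gdp)
""".strip()


def run_query(query: str) -> list:
    """Send the query to the query service and return the result bindings."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "gdp-example/0.1 (illustrative sketch)"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["results"]["bindings"]


query_2017 = build_gdp_query(2017)
```

Calling `run_query(query_2017)` needs network access; changing `source_qid` or the year gives the other subsets discussed in this thread.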
@Datawiki30:. The automatic facility now retains a minimum of 4 digits. I may need to tweak that, but try it out first. Sadly we can't run SPARQL queries in Wikipedia pages. --RexxS (talk) 23:52, 3 November 2018 (UTC)
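The "minimum of 4 digits" rule can be illustrated with a short Python sketch. It is not the code in Module:WikidataIB; the function name, the list of scales and the choice of rounding rather than truncating are assumptions made for this example.

```python
# Candidate scales, largest first. The names are illustrative.
SCALES = [
    (10**12, "trillion"),
    (10**9, "billion"),
    (10**6, "million"),
    (10**3, "thousand"),
]


def scale_quantity(value, min_digits=4):
    """Pick the largest scale that still leaves at least `min_digits` integer digits."""
    threshold = 10 ** (min_digits - 1)
    for factor, name in SCALES:
        if value / factor >= threshold:
            return round(value / factor), name
    # Values too small for any scale are returned unscaled.
    return round(value), ""


usa = scale_quantity(19390604000000)
germany = scale_quantity(3677439129776.6)
```

Here `usa` is `(19391, "billion")` and `germany` is `(3677, "billion")`, matching the examples above. Because this sketch rounds, 2,582,501,307,216.42 comes out as 2,583 billion; replacing `round` with `int` would truncate to 2,582 instead.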

@RexxS: - thank you for the implementation. You're faster than light :). I think this is now perfect. @Jura1:, @Snipre:, @MisterSynergy:: After the discussion here I would like to start votes on some of the topics to try to get consensus. --Datawiki30 (talk) 18:30, 4 November 2018 (UTC)

Voting for consensus

Property with different sources for the same Topic

There is consensus, that different values from different sources on the same topic should be edited in the same property. For example if there are different estimations for the nominal GDP in US-Dollar from the World Bank and the IMF with different estimates for the same point in time, then the values should be edited in the same Property P2131. Arguments:

  • All the values with the same informative value are consistently in the same property.
  • There is a Lua function to retrieve values depending on the source in Wikipedia (for example the statement with preferred rank from the source World Bank database).
  • SPARQL-queries can handle multiple values from different sources.

Support -- Datawiki30 (talk) 18:30, 4 November 2018 (UTC)

Comment "edited in the same property" isn't really the way Wikidata works. What I think you mean is that if there are different estimates from different sources for a quantity, it is appropriate to add statements for each one. I doubt anyone would contest that -- it was envisaged in the design for Wikidata from the start. Jheald (talk) 18:01, 5 November 2018 (UTC)

Better structure of the statements in the Wikidata UI especially for statistical properties with multiple values

There is consensus, that the random visualisation of statements for statistical properties with multiple values in the Wikidata UI is not adequate and that there is need for further development. For example (according to the implementation in https://tools.wmflabs.org/sqid/#/):

  • deprecated statements should be visualized separately from normal and preferred statements
  • statements with preferred rank should be highlighted more prominently
  • statements with point in time qualifier should be sorted on this qualifier

Arguments for this consensus:

  • statistical properties could have many statements (for example more than 50 entries for different years) with different sources. With the current (unsorted) visualization it is difficult for a Wikipedia user to find and comprehend a value that is retrieved from Wikidata.

Support -- Datawiki30 (talk) 18:30, 4 November 2018 (UTC)

Support Sorted statements would be a dream come true. I consent! Moebeus (talk) 01:15, 5 November 2018 (UTC)

Comment In my view Wikidata is not well-suited to time-series data. If one has a time-series, IMO better to upload it to Commons in the data: namespace, and link it with some appropriate property from here. Jheald (talk) 18:03, 5 November 2018 (UTC)

@Jheald: Thank you for your comment. What is the argument that Wikidata is not well-suited for time-series? Performance?
Can you give me some example for your alternative? For me Commons means media files like pictures, audio and video files... Cheers! --Datawiki30 (talk) 20:30, 5 November 2018 (UTC)
@Datawiki30: See c:Help:Tabular_Data. The Wikidata user interface doesn't scale very well with lots and lots of statements -- multiple individual atomic statements are not the best way to store tabular data such as time-series with lots of data-points. Better to store it as a whole dataset. Jheald (talk) 21:06, 5 November 2018 (UTC)
  • Comment Please bear in mind that non-current values wouldn't have deprecated rank, but normal rank. --- Jura 21:59, 5 November 2018 (UTC)
  • I actually agree that deprecated statements should be much easier to identify in the Wikidata item pages. I wonder if there is an HTML class on those claims. --Izno (talk) 20:30, 6 November 2018 (UTC)
    • Yes, there is. I use .wb-deprecated { background-color: mistyrose } and .wb-preferred { background-color: lavender } in my common.css to make the different ranks easier to see. There's also a ticket at phab:T198907. - Nikki (talk) 09:35, 9 November 2018 (UTC)

Can we merge Q7216841 and Q30747696 on Russian Wiktionary

I don't know what the difference is between these two ruwiktionary links, but both have the same counterparts in Bosnian, Danish, Finnish, Hebrew, Croatian, Hungarian, Japanese, Korean, Georgian, Lithuanian, Macedonian, Norwegian Nynorsk, Romanian, Simple English, Serbian, Swedish, Ukrainian, and Urdu. If they are the same in these 18 languages, why can't the different from (P1889) statement be changed? --36.102.227.22 22:33, 4 November 2018 (UTC)

If you think two items at Russian Wiktionary are the same, then you need to have that discussion at Russian Wiktionary. Wikidata does not control what is separate or merged at the other Wikis. --EncycloPetey (talk) 02:48, 5 November 2018 (UTC)
Note: This pair of "duplications" was reported to WD:IC, and I'm still waiting for an explanation from @Infovarius:. --Liuxinyu970226 (talk) 04:28, 5 November 2018 (UTC)
I downgraded it to said to be the same as (P460), but this was recently undone by @Infovarius:; I'm still waiting for a response from him. --Liuxinyu970226 (talk) 00:29, 11 November 2018 (UTC)
@Liuxinyu970226: These categories are different. Категория:Части речи contains words about part of speech (Q82042) (like "noun", "werkword" or "междометие") while Категория:Слова по частям речи distributes all words of all languages to one of category per part of speech (Q82042) - so it contains categories like "Category:Nouns", "Category:Verbs" and so on. --Infovarius (talk) 20:48, 15 November 2018 (UTC)

Dividing objects into items with example Aichach (Q55133909), Aichach station (Q57174619)

There is a train station called "Bahnhof Aichach", Aichach station (Q57174619), with all its train station related properties. In Bavaria there is a term called "Gemeindeteil" - this term stands for any named place in a municipality like towns, villages, hamlets, hermitages and sometimes castles, sanatoriums, mills, forester's houses, train stations and so on. These "Gemeindeteile" are named by the municipality, which sometimes revokes some of the names too.

Aichach (Q55133909) is a train station too according to the source; it is listed as a train station (Bhf.), for example, in the "Amtliches Ortsverzeichnis von Bayern" (official gazetteer of Bavaria) 1964 page 9 [2] under the municipality "Algertshausen". From the same gazetteer, page 6* [3] I cite "Die Bahnhöfe sind mit ihren bahnamtlichen Bezeichnungen und der Höhenlage über Normalnull in den Gemeindeteilen angegeben, in denen sie liegen. In manchen Gemeinden bilden sie eigene Gemeindeteile." (The train stations are listed with their official railway names and altitude above sea level in the Gemeindeteile in which they lie. In some municipalities they form their own Gemeindeteile). So the train station itself is a Gemeindeteil.

User:MB-one divides this train station into two items. His reason is that the train station and the other item are two different concepts. Is it the idea of Wikidata to create concepts and divide one physically existing object into several items? I can't follow this. He distributes the properties between the two items and ignores the sources. Why is the number of inhabitants a property of his created "Gemeindeteil" but not of his train station? As the 1964 source shows, 11 inhabitants live in the train station (perhaps the station master with his family and so on). Why does located in the administrative territorial entity (P131) with the value Algertshausen municipality appear only in the "Gemeindeteil"? The train station was there too, until 1 January 1974. Why he deleted the alias "Bahnhof Aichach" from the "Gemeindeteil" I don't know - you can see at [4] that the "Gemeindeteil"/train station was named so in 1980.

I have been importing a lot of information from the Bavarian official gazetteers from 1877 to 1987 for a long time, and there are many such objects (train stations, castles, mills, even powerhouses, and so on). I don't think it's a good idea to divide all of them into several items without any need.

Is it really reasonable to create concepts and then divide real objects in more items? Where are the limits? And how you can distribute the properties? Or don't do so and make large quantities of redundancy? What are the concepts of my example: one is "train station"? The other? If you think of a train station as part of a timetable, then I would say, it's another concept. But train station also have properties of a geographical object located in the administrative territorial entity (P131) or coordinate location (P625). Is it then a geographical object too. But what's with the other object Aichach (Q55133909)? What is it then? Only a geographical object? What else? --Balû (talk) 03:22, 6 November 2018 (UTC)

@Balû: I can't comment on MB-one's edits aside from the edits to the Aichach items, but I would think in most places (particularly urban areas) it is standard for every railway station to get its own item and its own Wikipedia articles (and both Aichach and its station have English Wikipedia articles). Usually Wikidata items are only about one distinct concept; perhaps it would even be appropriate to separate the building from the station (e.g. Pennsylvania Station (Q14707174) is the demolished station building of Pennsylvania Station (Q54451)) if it is desirable to indicate that people lived in it, although I'm not sure if this would be necessary. Jc86035 (talk) 17:27, 6 November 2018 (UTC)
@Jc86035: you didn't read it correctly: both items Aichach (Q55133909), Aichach station (Q57174619) are for the railway station - the town Aichach has its own item Aichach (Q251678). I agree with you to separate even the building from the railway station. But what should be the concept behind Aichach (Q55133909)? It's the railway station Aichach station (Q57174619) but mentioned in a source in another way. --Balû (talk) 03:43, 7 November 2018 (UTC)
de:Aichach#Ortsteile seems to list Q55133909 as part of Q251678.--- Jura 03:53, 7 November 2018 (UTC)
@Jura1:: no, it lists the chef-lieu (Q956214) Aichach (Q31872780) - it's the former town Aichach and now one of the Ortsteile of the municipality Aichach (Q251678). The older "Gemeindeteilname" (name of municipality part) "Aichach Bahnhof = Aichach (Q55133909)" was revoked in 1980, but it is still the railway station. But there is also the railway station Aichach station (Q57174619), which is supposed to be another item because of concepts I can't see. --Balû (talk) 18:17, 7 November 2018 (UTC)
You can read it in the "Amtliche Ortsverzeichnisse". All railway stations are listed there, too, namely under the town where the station is located. Look at page 9. There is the town "Aichach" and the text says "railway station Aichach see municipality Algertshausen". There it then says "Aichach, railway station, 11 inhabitants". It is the railway station of Aichach. --Balû (talk) 18:31, 7 November 2018 (UTC)
Complicated. Sounds like @MB-one: might have missed the cebwiki bot item. Not sure if we should also have two "A. station", one for the station and another one for the locality (pre-1980). --- Jura 21:02, 7 November 2018 (UTC)
@Jura1: Aichach (Q31872780) (connected to cebwiki) seems to be different from Aichach (Q55133909) (the item in question here; which probably needs a better label). --MB-one (talk) 09:07, 8 November 2018 (UTC)
@Jura1: a station isn't a locality? What then? What about castles, mills, forester houses, and so on? Aren't they localities, too? And what about farms, solitudes, hamlets? All of them are sometimes "Gemeindeteile" in the gazetteer. --Balû (talk) 05:49, 10 November 2018 (UTC)

Inclusion of redirects

Note that I closed an old RFC, Wikidata:Requests for comment/Allow the creation of links to redirects in Wikidata. The close is lengthy and is posted there, I am not going to repeat it. In short, there is consensus that redirects should be allowed, but we are not yet close to actually starting adding them, and other discussions should happen.--Ymblanter (talk) 16:49, 6 November 2018 (UTC)

Thanks. As I understand, there is need for further discussion to have an actual improvement at some time. Ideally we involve @Lydia Pintscher (WMDE) & team from the beginning this time, in order not to discuss a proposal that lacks consideration of several important aspects of the problem (i.e. unlike the RfC itself). Any idea where we should start a discussion? —MisterSynergy (talk) 16:57, 6 November 2018 (UTC)
I would think we need to identify crucial issues for discussion here (possibly even in this thread) and then see; maybe some of them are obvious, and for others we might need another RfC (which I hope will not stay open for another two years).--Ymblanter (talk) 17:02, 6 November 2018 (UTC)
If we look at the past RfC on a more abstract level (i.e. without considering any specific solution such as “give us redirects”), I think we can conclude that the community requests a more sophisticated sitelink management to improve interwikilinking. The current situation is a consequence of a conflict of goals: Wikidata as a knowledge base vs. Wikidata as a sitelink hub for Wikimedia projects, with the latter goal being somewhat constrained by the former goal. In most situations this works out, but sometimes not and this needs to be improved (Bonnie and Clyde problem). The original proposal of the RfC is pretty dangerous for the knowledge base goal, which is why substantial opposition was raised as well, but I think we have already seen several good ideas in the proposal. What does Lydia think meanwhile? —MisterSynergy (talk) 17:15, 6 November 2018 (UTC)
She posted an extensive comment on that RfC, and I would be surprised if she thinks differently. In any case, to implement the RfC we ultimately need to change the interface, which she must directly approve.--Ymblanter (talk) 17:17, 6 November 2018 (UTC)
Hey :) Yeah it still represents my thinking pretty well I'd say. And I still believe we should explore the option of generating some of the sitelinks from statements further. Don't get me wrong. I understand the current situation sucks for some cases. I am just not convinced that the other option sucks any less. So we need to get creative somehow. --Lydia Pintscher (WMDE) (talk) 12:38, 9 November 2018 (UTC)
@Ymblanter: Thanks for finally closing this! Here are my responses to your queries regarding further issues, I wonder if perhaps we need another RFC to settle these though? Anyway - (1) "only those redirects which help to solve existing problems are welcome" - I think this is simply a matter for the notability policy. A redirect item should only be created if it describes a concept or entity that is clearly distinct from the item the redirect points to, and otherwise notable. In particular, redirects that are simply alternate names or aliases of the primary entity should never have their own items (instead they could be added as alternate labels on the item). (2) "constraint violations are not created" I don't think we are at all talking about automatic creation of redirect items; only editors should be creating redirect items, and they should satisfy constraints just as they do now. In particular when a page is moved on a client wiki, that should NOT automatically create a redirect item. (3) "must be clearly visible and discriminated" - sure we should discuss how to do this, but I agree it's useful. User interface design can be proposed and hashed out as part of implementation, I don't think it needs to be specified in detail up front. (4) "other items" - well, we already had a very long discussion about it, do you really think there's more that could be an issue? ArthurPSmith (talk) 19:01, 6 November 2018 (UTC)
My experience shows that if something is not in the policy, there will always be users pushing it through. If the policy does not say that redirects cannot be added en masse and does not specify when they can be added, then next week there is a user adding thousands per day and claiming it is allowed per policy.--Ymblanter (talk) 19:13, 6 November 2018 (UTC)
I agree we should have a policy - do you think we would need a whole section on redirects in Wikidata:Notability, or something else? I was thinking just a sentence or two there would be sufficient. ArthurPSmith (talk) 19:51, 6 November 2018 (UTC)
I do not particularly mind, as soon as it is clear enough. But I am not really an interested party. I merely summarized what I read on the RfC.--Ymblanter (talk) 20:14, 6 November 2018 (UTC)
  • Oddly, editors at Wikipedia aren't actually interested in implementing already existing solutions to the problem. Sometimes, it seems they just come to Wikidata as they haven't bothered solving the problem at Wikipedia. --- Jura 17:09, 7 November 2018 (UTC)
    Some editors at Wikidata are unable to accept that there are those of us who are interested in both projects. Quite often we find that bashing editors who are active on Wikipedia is considered acceptable here. It should never be. Sitelinks on Wikidata are designed for use on Wikipedia projects and when you are told that artificial restrictions on Wikidata – such as the inability to programmatically read a sitelink to en:Archeologist from archaeologist (Q3621491) – are causing unnecessary problems, you ought to be taking those issues seriously, not blaming other projects for deficiencies here. --RexxS (talk) 20:08, 7 November 2018 (UTC)
    I don't understand your inability to link archeologist to w:archeology programmatically. Where do you want to do it and what approaches have you tried? --- Jura 20:52, 7 November 2018 (UTC)
    Jura are you being deliberately obtuse? The problem is linking from all the other wikipedias to an enwiki page; there is nothing enwiki can do to allow that, and so for all those languages that have a page for archaeologist (Q3621491), it looks like there is nothing relevant in enwiki. Creating a redirect link is what solves the problem. What else do YOU propose to do such a thing? ArthurPSmith (talk) 21:41, 7 November 2018 (UTC)
    Can you provide a sample use case and explain how you attempted to solve it? --- Jura 22:01, 7 November 2018 (UTC)
    @Jura1: It seems to me he just did. He's suggesting that archaeologist (Q3621491) ought to be able to sitelink en:Archeologist, even though the latter is a redirect. - Jmabel (talk) 00:30, 8 November 2018 (UTC)
    I've manually added the sitelink by temporarily blanking the page. Jc86035 (talk) 16:35, 8 November 2018 (UTC)
    @Jmabel: I don't see how this would be a usecase of enwiki one tries to solve, but maybe it's just theoretical one. --- Jura 04:06, 8 November 2018 (UTC)
    I don't want to ventriloquize User:ArthurPSmith, so he should probably step back in here, but from what he wrote, he seems to be saying that it is an existing case, and proposing this as the solution. How is that "theoretical"? - Jmabel (talk) 16:35, 8 November 2018 (UTC)
    Thanks Jmabel. I think Jura was asking in what way implementing a redirect link would actually help an enwiki user in a case such as this one. To me the important use case here is for users of other language wikis: if somebody who speaks both languages happens to go to the archaeologist (Q3621491) page in that language, the interlanguage links do not indicate that there is a closely related English-language article. Allowing a redirect link in Wikidata would enable that interlanguage link to appear, and so the bilingual user could easily look at both and perhaps understand the topic better than from just looking at one language page. However, the interlanguage link to those other languages does NOT appear on the enwiki page right now, and it would presumably require some development work to allow that in one form or another. It would be nice to do that in the long run, but this first step does at least fulfill a real purpose. ArthurPSmith (talk) 16:58, 8 November 2018 (UTC)
    @Jmabel: I think it would still be good to see an actual use case for enwiki. Something like "I'm an English Wikipedia user and I have built an infobox for biographical articles about archeologists and want the word "archeologist" in the infobox to link to "archeology" (if there is no article for archeologist)". --- Jura 16:03, 9 November 2018 (UTC)
    @Jura1: I'm an English Wikipedian (as well as being active on other Wikimedia projects) and I have built an internationalised module used on 50+ wikis that contains functions to read Wikidata into infobox fields. If I want to use it to create a Wikidata-aware infobox for en:Howard Carter, I want each of his occupations to link to the relevant English article. On Wikidata he has three occupations: anthropologist (Q4773904), archaeologist (Q3621491), egyptologist (Q1350189). The first one has a sitelink to the English article en:Anthropologist and the other two need to have sitelinks to the English redirects en:Archaeologist and en:Egyptologist. Now multiply that by hundreds of occupations and by dozens of other properties (such as educated at (P69)) that may point to Wikidata entities that correspond to redirects; and then multiply that by the number of wikis that use the module or some similar module. I hope you're not going to suggest that I blank the redirects, then create the sitelink, then put the redirect back for thousands of items on Wikidata. Why not simply allow egyptologist (Q1350189) to link to en:Egyptologist? There's a user case – and it's not the first time I've expounded it here – now it's time for you to do some work as well and explain why forbidding those sitelinks makes any sense to folks re-using Wikidata in the other projects. --RexxS (talk) 23:38, 10 November 2018 (UTC)
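The infobox use case above can be illustrated with a minimal sketch. It assumes a heavily simplified, hypothetical data layout (plain dictionaries rather than the real Wikibase entity format), and the function name `occupation_links` is invented for illustration; only the item IDs and page titles come from the comment above.

```python
# Hypothetical, trimmed-down data: item ID -> {wiki: page title}.
# Only the anthropologist item has an enwiki sitelink, because the other
# two English titles are redirects.
SITELINKS = {
    "Q4773904": {"enwiki": "Anthropologist"},
    "Q3621491": {},
    "Q1350189": {},
}

LABELS = {
    "Q4773904": "anthropologist",
    "Q3621491": "archaeologist",
    "Q1350189": "egyptologist",
}


def occupation_links(occupation_ids, sitelinks, labels, wiki="enwiki"):
    """Render each occupation as a wikilink if a sitelink exists, else as plain text."""
    rendered = []
    for qid in occupation_ids:
        title = sitelinks.get(qid, {}).get(wiki)
        label = labels[qid]
        if title is not None:
            rendered.append(f"[[{title}|{label}]]")
        else:
            # No sitelink recorded: the infobox cannot link reliably.
            rendered.append(label)
    return rendered


occupations = ["Q4773904", "Q3621491", "Q1350189"]
before = occupation_links(occupations, SITELINKS, LABELS)

# If sitelinks to the redirects were recorded, every occupation could be linked.
with_redirects = {
    "Q4773904": {"enwiki": "Anthropologist"},
    "Q3621491": {"enwiki": "Archaeologist"},
    "Q1350189": {"enwiki": "Egyptologist"},
}
after = occupation_links(occupations, with_redirects, LABELS)
```

In this sketch `before` links only the first occupation, while `after` links all three without any guessing based on labels.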
  • Thank you so much! Implementing that decision will enable enwiki to remove lots of existing errors, and should enable Wikipedias to use Wikidata in places where they currently avoid using Wikidata because of the errors caused by the lack of redirects. There is a one-to-one correspondence here. For the field: archaeology (Q23498) = en:Archaeology. For the profession: archaeologist (Q3621491) = en:Archaeologist. The problem has been that Wikidata did not permit the second link to be recorded because en:Archaeologist is a redirect. (enwiki has no article on the profession; it's covered in the article on the field.) In many cases, such as infobox templates, enwiki editors have coded workarounds which link to the enwiki page whose title matches the Wikidata label in English, and hope that that page is an article on a relevant topic. This works in the case of archaeology. However, it regularly causes problems elsewhere in enwiki. The title may match an article on a different topic with a similar name (en:Michael Jackson when the link intended en:Michael Jackson (writer)) or a disambiguation page (a list of en:John Smiths). It will be wonderful to be able to start recording accurate links instead of guessing and hoping. Certes (talk) 16:51, 8 November 2018 (UTC)
  • I reread Lydia's post and want to say that it confuses the interests of data consumers. Say a data consumer, for example, wants to host data about mental illnesses and identifies those internally with ICD codes. We have Wikipedia pages that discuss multiple ICD codes. Saying that it's bad for the data consumer to be linked to the page that discusses the illness corresponding to the ICD code, because that page also discusses other ICD codes, misidentifies the interests of that data consumer.
For all data consumers who want to be able to link to information whenever it is available in Wikidata, linking to redirects is good. It's only bad if a data consumer doesn't want the link because other subjects are also discussed on the linked page.
@MisterSynergy: the suggestion that the RfC is a result of conflicting goals between Wikidata as a knowledge base and Wikidata as a sitelink hub is misleading. As the author of the RfC, I think it's useful for both, and for the knowledge base in particular because it allows better description of entities that don't have their own Wikipedia page but are described in a paragraph on a page that otherwise couldn't be linked. The conflict is more between inclusionism and exclusionism.
As far as the points by ArthurPSmith go:
(1) I agree that it makes sense to change the notability policy so that it says explicitly that redirects don't produce notability via #1 of the notability policy. Given that this is a clarification of existing policy, I don't see a need for an RfC to do that.
(2) Redirect creation should be done with attention to detail, and as such I support limiting it to human editors and forbidding bots from doing so. I don't think that anybody specifically argued for the ability of bots to create redirects, and as a result I don't think we need to start an RfC for that.
As far as redirect creation through moving goes, it's already the status quo that moving creates redirect links today. There are good arguments that we might want to handle that differently, but they are a separate issue from this RfC.
(3) Having a user interface where the redirect is highlighted is a good idea. I also think that the details shouldn't be decided via a discussion but should be hashed out by a WikimediaDE UI person. ChristianKl❫ 18:55, 8 November 2018 (UTC)
  • @Ymblanter: I very much welcome your close, and your decision which seems to me to make a lot of sense. However, the following line is not correct: Given that we currently have no mechanism of redirect addition (except for twice moving articles on the projects, which is a clear disruption and I am sure will be perceived as such on the projects).
All that is required to sitelink to a redirect is to temporarily comment out the #REDIRECT on the wikipedia page, then create the sitelink, then restore the #REDIRECT. I don't think any wikipedias would see this as disruption; I doubt that many would even notice.
Alternatively, when a redirect page does not yet exist, it is easy to create it with some holding text, sitelink it, then finally add the #REDIRECT that turns it into a fully functioning redirect.
It would be nice to have a better interface for doing this; but the lack of such an interface is not a blocker, neither for humans nor for bots.
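As a rough illustration of the three-step workaround described above, here is a toy simulation. It does not call any real MediaWiki or Wikibase API; `pages` and `sitelinks` are plain in-memory dictionaries, and all function names are hypothetical. The refusal rule in `add_sitelink` is an assumption that stands in for the behaviour discussed in this thread.

```python
def is_redirect(wikitext):
    """Return True if the page text starts with a #REDIRECT directive."""
    return wikitext.lstrip().upper().startswith("#REDIRECT")


def add_sitelink(pages, sitelinks, qid, title):
    """Toy stand-in for the rule discussed here: refuse redirect targets."""
    if is_redirect(pages[title]):
        raise ValueError(f"{title} is a redirect")
    sitelinks[qid] = title


def sitelink_redirect_workaround(pages, sitelinks, qid, title):
    """Comment out the redirect, add the sitelink, then restore the redirect."""
    original = pages[title]
    pages[title] = "<!-- " + original + " -->"  # step 1: temporarily disable
    try:
        add_sitelink(pages, sitelinks, qid, title)  # step 2: create the sitelink
    finally:
        pages[title] = original  # step 3: restore the #REDIRECT line


pages = {"Archaeologist": "#REDIRECT [[Archaeology]]"}
sitelinks = {}
sitelink_redirect_workaround(pages, sitelinks, "Q3621491", "Archaeologist")
```

The `try`/`finally` is a design choice of the sketch: the redirect is restored even if adding the sitelink fails, so the page is never left commented out.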
Given the community decision, two further priorities stand out (at least IMO): Firstly, to distinguish sitelinks connecting to redirects that are good targets for sitelinks (ie en:archaeologist -> en:archaeology) from sitelinks connecting to bad targets for sitelinks (eg most redirects caused by page moves). On en-wiki and many others, "good" redirects for sitelinks can/should be marked after sitelinking by Template:Wikidata redirect (Q16956589). This template should be spread to more wikipedias; existing uses should be reviewed and confirmed; and sitelinks to redirects that don't have the template should be investigated. These are all things the community/communities can do.
The second thing is to mark on Wikidata and in the sidebar of Wikipedia articles when a sitelink goes to a redirect. This is something we need the developers to do; but now the community has decided that sitelinks to redirects can be okay, and we are going to start creating them for certain types of cases more systematically, the developers should understand that the community now want such redirect badges as a priority. However this is a desideratum not a requirement; there is IMO no need for the community to hold off and wait for this before creating redirects, now that the RFC has been closed and the community has actualised its decision. Jheald (talk) 12:40, 9 November 2018 (UTC)
#2 now suggested at m:Community Wishlist Survey 2019/Wikidata/Indicate when Wikidata sitelink is to a redirect Jheald (talk) 16:36, 9 November 2018 (UTC)
Indeed, technically there are also other means to insert redirects, but I am sure none of them would be really accepted on the projects. A user mass-moving pages twice solely to create Wikidata redirects, or a user mass-removing and readding redirects solely to add them to Wikidata items, on English Wikipedia will likely be first dragged to ANI and warned, and eventually blocked.--Ymblanter (talk)
Sorry @Ymblanter:, but I simply don't think that's true. As noted above, enwiki even has a template, en:Template:Wikidata redirect, to mark redirects created to accommodate incoming wikidata sitelinks. It currently has 25,000 uses. A user running a mass scheme of removing #REDIRECT lines, adding a Wikidata sitelink, then restoring the #REDIRECT line, each done in perhaps less than a minute, wouldn't be warned, blocked, or dragged to ANI. They would simply be seen as going through the hoops that need to be gone through to do something useful.
As for mass-moving pages twice, why would anyone do that, when they could simply edit the page? Jheald (talk) 20:57, 9 November 2018 (UTC)
Well, we obviously have different perception and different experience concerning the relation of the English Wikipedia and Wikidata.--Ymblanter (talk) 21:01, 9 November 2018 (UTC)
If one was adding wikidata-based content to an actual reading page then that might ruffle some feathers.
But adding a sitelink to a redirecting page that no-one on en-wiki is going to see anyway? No, unlikely that anyone would even notice; still less likely that they would care. Jheald (talk) 21:57, 9 November 2018 (UTC)

Rename official website (P856) to "homepage" or "official URL"[edit]

ChristianKl
ArthurPSmith
d1g
JakobVoss
Jura
Jsamwrites
MisterSynergy
Salgo60
Micru
Pintoch
Harshrathod50


Notified participants of WikiProject Properties

This property says to refer to the "URL of the official website of an item". To me, this is the domain name of the website.

However, this property seems to point to the "official homepage" rather than the "official website":

Another question is that "http://schema.org/url" and "http://www.w3.org/2006/vcard/ns#url" are listed among the equivalent properties, but they seem to be equivalent to URL (P2699) rather than official website (P856).--Malore (talk) 18:30, 6 November 2018 (UTC)

I don't mind if it's called official website or homepage, the meaning is the same. About your other question, http://schema.org/url is exemplified with a "home page" and vcard urls are also used for home pages. -- JakobVoss (talk) 21:50, 6 November 2018 (UTC)
@JakobVoss: Currently, http://schema.org/url is stated as equivalent of official website (P856) and URL (P2699). Since they are two different properties, one of the two equivalences is wrong. Given that http://schema.org/url is described as "URL of the item", I think it's "more equivalent" to URL (P2699).
As regards the naming, I think the difference between homepage and website is as follows:
  • Support I've thought that for quite some time already, was just too lazy to bring it up. --Marsupium (talk) 16:34, 9 November 2018 (UTC)

The use of "Structural reasons"[edit]

I am raising a concern about adding persons for structural reasons, where the item may not be deleted because it is linked as the spouse of a notable person, with references, and mentioned in multiple Wikipedia articles. I am afraid that this will lead to adding children and fathers/mothers as well. Are there any opinions about this?
As an example, I have about 800 notable persons from the Norwegian Wikipedia with references to censuses. In these censuses I can also find the spouses, children and parents of the notable person. I do not find those spouses, children or parents notable. I have proposed deleting Ekaterina Protopopova (Q57975541), the wife of Alexander Borodin (Q164004). Please have a look at that proposal too. Pmt (talk) 19:59, 6 November 2018 (UTC)

Items are allowed to be just containers for labels. It's information on the notable person, the subjects of the individual items don't need to be notable in their own right. (I think we should rename Wikidata:Notability to something like "Inclusion criteria", really.) --Yair rand (talk) 20:33, 6 November 2018 (UTC)
@Yair rand: By this you mean that I can create items for a notable person's children, parents and spouses? Pmt (talk) 21:24, 6 November 2018 (UTC)
@Pmt: You can create items for almost any dead person if you provide scholarly sources as references. These are missing for Alexander Borodin (Q164004) at the moment! -- JakobVoss (talk) 21:57, 6 November 2018 (UTC)
@JakobVoss: and @Yair rand: I am still a bit confused and would like some further advice. As a scholarly source I have the Norwegian 1801 census. Can I then create an item for a person with described by source (P1343) set to no label (Q11969384)? Example: no label (Q58333479) (not completed, but it has all references). I am focusing on persons, but what about, say, individual ships found in different (Norwegian) archives or in Lloyd's Register? And again, sorry for insisting, but I find it useful to have this discussed before doing the wrong things. Pmt (talk) 19:24, 7 November 2018 (UTC)
Generally yes. If you however want to enter all people in the whole Norwegian 1801 census you should first create a bot approval request. ChristianKl❫ 21:09, 7 November 2018 (UTC)
@ChristianKL: Are you kidding? We are all described by some census, so are you going to allow adding all humans? --Infovarius (talk) 20:56, 15 November 2018 (UTC)
  • Is there a way to indicate that somebody was married / divorced etc., on a particular date without naming their spouse? I omit such information because I don't want to create another item just to hold a spouse name. I suppose significant event (P793) could be used. Death of a spouse would also be a significant event. Ghouston (talk) 21:39, 7 November 2018 (UTC)
    • I think there are cases where one should make an item for the spouse, even if one doesn't know their name: wife of St. Peter (Q22340337). --- Jura 05:40, 8 November 2018 (UTC)
  • Often I do know the name of the spouse, but if they don't seem to be notable for anything it doesn't seem worth creating an item. Ghouston (talk) 08:43, 8 November 2018 (UTC)
  • Part of the way Wikidata is designed is that if the information about the spouse is worth recording, then that's grounds for the person being notable under our rules. This way the data structure is much easier to work with for people who work automatically with it. This way querying for the name of the wife of a given person works the same in every case. ChristianKl❫ 20:03, 8 November 2018 (UTC)
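To make the "works the same in every case" point concrete, here is a small sketch under stated assumptions: the item store is a hypothetical in-memory dictionary, not the real Wikibase format, and `spouse_labels` is an invented helper. The property spouse (P26) and the two item IDs are real; the reciprocal statement on the second item is assumed for illustration.

```python
# Hypothetical, simplified store: item ID -> label and claims (property -> item IDs).
ITEMS = {
    "Q164004": {
        "label": "Alexander Borodin",
        "claims": {"P26": ["Q57975541"]},  # P26 = spouse
    },
    "Q57975541": {
        "label": "Ekaterina Protopopova",
        "claims": {"P26": ["Q164004"]},  # assumed reciprocal statement
    },
}


def spouse_labels(item_id, items):
    """Follow every spouse (P26) statement and return the labels of the targets."""
    spouse_ids = items[item_id]["claims"].get("P26", [])
    return [items[spouse_id]["label"] for spouse_id in spouse_ids]


names = spouse_labels("Q164004", ITEMS)
```

Because the spouse is always an item, the same lookup works whether or not the spouse is independently notable; no special case for "name stored as a string" is needed.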

How to connect occupation and employer?[edit]

Q58325707

She is a columnist for the Evening Standard. I entered both of those things, but how can I connect them? She has more than one occupation and she isn't a chef for the Evening Standard. Alexis Jazz (talk) 11:15, 7 November 2018 (UTC)

I thought of the way the Lee Manjai (Q21263395) item does this, but that example is missing a property (a date for each employer), and maybe something else I missed. — regards, Revi 13:12, 7 November 2018 (UTC)
"Employer" can be qualified with "position held", which may help. - PKM (talk) 22:31, 9 November 2018 (UTC)
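A minimal sketch of that suggestion, assuming a simplified statement layout (not the full Wikibase JSON). The property IDs occupation (P106), employer (P108) and position held (P39) are real; the item IDs are deliberately left as labelled placeholders, and `positions_at` is a hypothetical helper.

```python
# Placeholders, not real item IDs.
EVENING_STANDARD = "Q_EVENING_STANDARD"
COLUMNIST = "Q_COLUMNIST"
CHEF = "Q_CHEF"

person = {
    "P106": [  # occupation: both occupations are listed, unconnected to any employer
        {"value": COLUMNIST, "qualifiers": {}},
        {"value": CHEF, "qualifiers": {}},
    ],
    "P108": [  # employer, qualified with position held (P39)
        {"value": EVENING_STANDARD, "qualifiers": {"P39": [COLUMNIST]}},
    ],
}


def positions_at(statements, employer):
    """Return the positions recorded as qualifiers on statements for one employer."""
    positions = []
    for statement in statements.get("P108", []):
        if statement["value"] == employer:
            positions.extend(statement["qualifiers"].get("P39", []))
    return positions


roles = positions_at(person, EVENING_STANDARD)
```

Here `roles` contains only the columnist placeholder, so the chef occupation is never attributed to the Evening Standard.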

Add data relating to the GDPR about companies[edit]

The General Data Protection Regulation obliges companies to provide some information publicly, such as a privacy policy and a contact person. Different organisations have started mapping out this data, but there is little joint work organising this data into a coherent whole, or even initial organisational effort into making sure this gets mapped the right way. My organisation, PersonalData.IO, would like to work on this. What is the best way to get started on this effort? This is the core of the data that would need to be mapped: organisation name, organisation id elsewhere (Open Corporates), organisation website, organisation privacy policy, contact information. An example data source would be here.

A side question: the UK Information Commissioner's Office has a register of fee payers, or rather it had a publicly downloadable one that is now only searchable. Assuming it is under the [5], is this something Wikidata would want? Pdehaye (talk) 14:45, 7 November 2018 (UTC)

The Open Government Licence is not directly compatible with CC0. ChristianKl❫ 20:54, 7 November 2018 (UTC)
Thanks! This link says "The other Wikimedia hosted projects including Wikipedia, Wikibooks and Wikiquote may also find new material they can incorporate into their projects". Shouldn't it be changed then? Pdehaye (talk) 22:02, 7 November 2018 (UTC)
If that is the case then yes it sounds like that page should be updated ·addshore· talk to me! 12:30, 8 November 2018 (UTC)
I don't see any mention of Wikidata in that quote. Wikidata is licensed under a more permissive license than Wikipedia, and as such not everything that can be used in Wikipedia can also be used in Wikidata. ChristianKl❫ 18:41, 8 November 2018 (UTC)
It says "The other Wikimedia hosted projects including...", and there was also the prior quote "It is considered a free content licence and compatible with the requirements of all Wikimedia projects including Wikipedia, Wikisource and Wikimedia Commons". This is a clear contradiction to what you were claiming. I dug a bit deeper, and indeed you seem to be correct. I made some changes over there, please do have a look. Pdehaye (talk) 08:35, 9 November 2018 (UTC)

Anyone able to return to the core of the question? Thanks! Pdehaye (talk) 01:43, 16 November 2018 (UTC)

Differentiate between primary sources and other sources in references[edit]

Is there a way to point out that a particular reference is the primary source of a statement (e.g. "https://blogs.windows.com/windowsexperience/2014/09/30/announcing-windows-10/" is the primary source for Windows 10 (Q18168774) inception (P571) 1 October 2014)? Maybe something like a "reference type" property is sufficient, or maybe something more integrated into the Wikibase software, like ranks or badges, would be better.--Malore (talk) 15:44, 7 November 2018 (UTC)

@Malore: There is type of reference (P3865).--Micru (talk) 15:50, 7 November 2018 (UTC)
Wikidata:Property proposal/reference has role could also be relevant. —MisterSynergy (talk) 15:54, 7 November 2018 (UTC)
Thank you, they are both interesting.--Malore (talk) 16:57, 7 November 2018 (UTC)
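For illustration, a reference tagged with type of reference (P3865) might be modelled roughly as below. This is a sketch under assumptions: the layout is simplified, the value item for "primary source" is a placeholder rather than a real ID, the second reference URL is invented, and `references_of_type` is a hypothetical helper. reference URL (P854) and inception (P571) are real properties.

```python
PRIMARY_SOURCE = "Q_PRIMARY_SOURCE"  # placeholder, not a real item ID

statement = {
    "subject": "Q18168774",  # Windows 10
    "property": "P571",      # inception
    "value": "2014-10-01",
    "references": [
        {
            "P854": "https://blogs.windows.com/windowsexperience/2014/09/30/announcing-windows-10/",
            "P3865": PRIMARY_SOURCE,  # type of reference
        },
        {
            # Hypothetical second reference with no type recorded.
            "P854": "https://example.org/secondary-report",
        },
    ],
}


def references_of_type(stmt, reference_type):
    """Return the reference URLs whose type of reference (P3865) matches."""
    return [
        ref["P854"]
        for ref in stmt["references"]
        if ref.get("P3865") == reference_type
    ]


primary_urls = references_of_type(statement, PRIMARY_SOURCE)
```

A consumer looking for the original source could then filter references by type instead of inspecting every URL.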
I think the wording of this discussion is very confusing. When discussing references, "primary" is most often used in contrast with "secondary". A primary source is a source written by someone or some organization directly involved in the subject matter, or written close to the time of an event. "Secondary" is a source written by someone who was not directly involved in an event, not the proposer of some idea, etc., and who consulted a number of sources to come up with a description of the matter. The Windows 10 example would be a primary source because it was written by Microsoft.
But this discussion is about identifying the reference that actually contains or directly supports the statement. We should find some other word than "primary" to describe this source.
Perhaps a better example would be geographic coordinates where the reference gives the coordinates in a system not supported by Wikidata. So the containing reference would be the one that contains the coordinates, and an explanatory reference could be a website that can convert from the coordinate system in the containing reference to Wikidata's coordinate reference system. Jc3s5h (talk) 17:53, 7 November 2018 (UTC)
@Jc3s5h: No, maybe I didn't explain it right but I was referring to "primary" sources as opposed to "secondary".--Malore (talk) 04:04, 8 November 2018 (UTC)
I don't understand what you mean by primary and secondary. I understand what w:Wikipedia:Identifying reliable sources means by those terms, but I don't understand your meaning. Jc3s5h (talk) 12:56, 8 November 2018 (UTC)
@Malore:, what Jc3s5h wrote above is certainly the normal distinction of a primary and secondary source in English. (Wikipedia and other encyclopedias would be tertiary, because they are, at least in principle, drawn from a survey of secondary sources.) If you mean something else, what exactly do you mean? - Jmabel (talk) 16:38, 8 November 2018 (UTC)
@Jc3s5h, Jmabel: I mean exactly what Jc3s5h means for primary, secondary and tertiary sources: in particular, what I mean by primary source is "the original source, the source the information comes from" and what I mean by other sources is secondary and tertiary sources--Malore (talk) 18:58, 8 November 2018 (UTC)

In that case, I don't think we should seek to designate the source the information comes from. Secondary sources are normally considered the best sources to use in English Wikipedia, and I think the same should normally apply to Wikidata. I think the circumstances where it would be better to use a primary source over a secondary source would be too complex to convey with any of the usual Wikidata properties, qualifiers, etc. The only place I can imagine where you could explain such a situation would be the item's talk page. Jc3s5h (talk) 19:07, 8 November 2018 (UTC)

"Principal" source might be a useful phrase in English to indicate the key source supporting a particular fact. A principal source might be a primary source, secondary source, or tertiary source. Which of the latter groups a source belongs to can often be inferred from what it is -- for example if the cited source is a general history, that's likely to be a secondary source; if it's a letter or a personal memoir, that's quite likely to be a primary source. So in many cases the instance of (P31) statement on the item for the source, or a genre (P136) statement if available, should give quite a strong clue. Jheald (talk) 12:06, 9 November 2018 (UTC)
I think Wikidata should contain as many sources as possible (primary, secondary and tertiary). IMO, pointing out which are the primary sources is useful if someone is looking for the original source of the information.--Malore (talk) 13:43, 10 November 2018 (UTC)
The problem with the word "primary" as used in this discussion and w:Wikipedia:No original research#Primary, secondary and tertiary sources is that it does not correlate well with being the original source of the information. A source could have been written close to an event, and perhaps involved, but not viewed as the definitive source on the matter. For example, a newspaper published on or just after the day of a death might quote a police chief, who stated the death occurred on a certain date. But the definitive source for a date of a death in a modern first-world country would be the death certificate. If it's available to the public, later authors of secondary sources would probably examine the death certificate to say when the death occurred. Jc3s5h (talk) 17:37, 10 November 2018 (UTC)
@Jc3s5h: A primary source doesn't have to be reliable or official. Often, secondary and tertiary sources are more reliable than primary ones. As regards your examples, the death certificate is both a primary source and an official source, while the newspaper article quoting the police chief is a secondary source. The unedited video interview of the police officer would be a primary source, but not an official one. Your example made me think it would be good if we also identified "official sources" of statements (e.g. the death certificate in the case of the date of a person's death).
@Jheald: It's not automatic that a personal memoir is a primary source and a general history is a secondary source.
A personal memoir is a primary source only where it talks about the personal experience of the author, but it may also report other information about events that the author didn't experience. Likewise, a general history can be a primary source for the author's opinion or for the book itself.
Regarding the "key source supporting a particular fact" there is already statement supported by (P3680).--Malore (talk) 20:03, 12 November 2018 (UTC)
I disagree that a newspaper article written within a few days of an event can ever be a secondary source for the event; it is too close in time to be a secondary source, even if the reporter did not witness the event. It's certainly true that primary sources are not always reliable or official, but ones that are not reliable have no place in Wikidata. Primary sources that are unofficial but reliable could be cited. Jc3s5h (talk) 21:00, 12 November 2018 (UTC)

The idea that secondary sources are superior to primary sources is a concept peculiar to Wikipedia. Other Wikimedia projects, such as Wikiquote, Wiktionary, Wikisource, and Wikispecies place emphasis upon primary sources. We want the first place a quote was published, or original text demonstrating the definition of a word, or the first publication of a text. On Wikispecies, the primary literature, where a new scientific name or new combination was first published, and the supporting scientific for the name, is desirable over a secondary source interpreting or reviewing previous studies. --EncycloPetey (talk) 21:25, 12 November 2018 (UTC)

As regards the newspaper article, even an edited video of an event is considered a secondary source because it somehow modifies a primary source (the video of the event).--Malore (talk) 05:47, 13 November 2018 (UTC)

Consensus is needed for Q48928408[edit]

Some people misunderstand the definition of located in the administrative territorial entity (P131). The present "protected" version of Q48928408 is swearing black is white.

The description of location (P276) states: "In case of an administrative entity use P131". It is clear that the Mainland Port Area is an administrative entity, and it is common knowledge that the Mainland Port Area is located in the Hong Kong SAR. 210.3.92.210 09:46, 8 November 2018 (UTC)

@Deryck Chan, Jc86035: if they know how to handle this. --Liuxinyu970226 (talk) 10:48, 8 November 2018 (UTC)
@Liuxinyu970226: I think the property doesn't really work for this particular item. The port area is administratively part of Futian District but is physically located within Yau Tsim Mong District. The English description of the property refers specifically to the "territory of the following administrative entity", which would imply the value should be Yau Tsim Mong if the area is not truly an enclave in every respect (Shenzhen Bay Port (Q5972370) is in Nanshan District). Jc86035 (talk) 10:55, 8 November 2018 (UTC)
That's why I changed its Chinese translations back to "位于行政领土实体/位於行政領土實體", as "行政领土实体/行政領土實體" (administrative territory entity) is not necessarily the same as "行政区/行政區" (administrative area). --Liuxinyu970226 (talk) 10:58, 8 November 2018 (UTC)
Could we not have two statements with qualifiers (e.g. applies to part (P518))? - Nikki (talk) 12:23, 8 November 2018 (UTC)

This is a problem Wikidata isn't equipped to solve. Normally located in the administrative territorial entity (P131) works because a geographical location or a physical artefact is both located inside and administered by the same territorial administrative entity. But this breaks down when the item itself is about an enclave (Q171441) - by definition an enclave (Q171441) is "located in" and "administered by" two different entities. This isn't a translation problem, but rather that located in the administrative territorial entity (P131) can't cope with this edge case unambiguously. We can list both values (Yau Tsim Mong and Futian) and qualify each with either object has role (P3831) or nature of statement (P5102) = geographic location (Q2221906) and administrative territorial entity (Q56061); or we can avoid using located in the administrative territorial entity (P131) altogether. At any rate we need a bespoke solution for this bespoke problem. Deryck Chan (talk) 13:12, 8 November 2018 (UTC)
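The first option in the comment above (two values, each qualified with a role) could look roughly like this. It is only a sketch: the statement layout is simplified, the two district IDs are placeholders, and `value_with_role` is a hypothetical helper. The role items geographic location (Q2221906) and administrative territorial entity (Q56061) and the qualifier object has role (P3831) are the ones named in the comment.

```python
# Placeholders, not real item IDs.
YAU_TSIM_MONG = "Q_YAU_TSIM_MONG"
FUTIAN = "Q_FUTIAN"

GEOGRAPHIC_LOCATION = "Q2221906"
ADMINISTRATIVE_ENTITY = "Q56061"

# Two P131 statements on the port area item, told apart by object has role (P3831).
p131_statements = [
    {"value": YAU_TSIM_MONG, "qualifiers": {"P3831": GEOGRAPHIC_LOCATION}},
    {"value": FUTIAN, "qualifiers": {"P3831": ADMINISTRATIVE_ENTITY}},
]


def value_with_role(statements, role):
    """Return the first P131 value whose role qualifier matches, or None."""
    for statement in statements:
        if statement["qualifiers"].get("P3831") == role:
            return statement["value"]
    return None


physical = value_with_role(p131_statements, GEOGRAPHIC_LOCATION)
administrative = value_with_role(p131_statements, ADMINISTRATIVE_ENTITY)
```

The cost of this design is that every consumer of P131 for this item has to check the qualifier; a consumer that ignores qualifiers would see two conflicting values.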

The different ways subdivisions can be related to countries and each other are very complicated. See Wikidata:Project_chat/Archive/2018/10#Countries_and_their_subdivisions_and_territory, where I tried to outline some of the issues involved. Also note the existence of the properties exclave of (P500) and enclave within (P501). --Yair rand (talk) 19:21, 8 November 2018 (UTC)

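According to this government notice, "《国务院关于同意广东广深港高铁西九龙站口岸对外开放的批复》(国函〔2018〕33号)明确规定,广深港高铁西九龙站内地口岸区由深圳市人民政府负责管理" (in English: "The State Council's Reply on Approving the Opening of the Guangdong Guangzhou-Shenzhen-Hong Kong High-Speed Railway West Kowloon Station Port (Guo Han [2018] No. 33) expressly provides that the Mainland Port Area of the Guangzhou-Shenzhen-Hong Kong High-Speed Railway West Kowloon Station is managed by the Shenzhen Municipal People's Government"). That means the Mainland Port Area is managed by the government of Shenzhen City, not Futian District. It is solely managed by the city level government. It is not a part of Futian District. 210.3.92.210 03:03, 9 November 2018 (UTC)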

In this respect, it seems that Shenzhen is more accurate than Futian. Also, I think Mainland jurisdiction is what needs to be emphasized here. CommInt'l (talk) 11:29, 12 November 2018 (UTC)

[Consultation] Using Q-items for senses[edit]

To link between equivalent senses of different language lexemes we have translation (P5972). This works quite well when the number of linked senses is small; however, as the number grows, it becomes harder to maintain. For an example of this you can check Lexeme:L12912#S1, where each one of the linked senses with translation (P5972) is linked to the others; if I were to add a new translation I would have to update at least 20 other lexemes. To make it simpler, in this case it is possible to link from each equivalent sense to the Q-item Tuesday (Q127) using item for this sense (P5137); then it is superfluous to add translation (P5972). In this case it is very straightforward because the item already exists; however, we don't have an item to represent the senses of other words like but (L1387) or important (L4147). On en-wiktionary you can see the translations of "but" and "important". If we created items for those senses, then it would be possible to link all lexemes to them. Even if the senses could be considered clearly identifiable conceptual entities that can be described using serious and publicly available references (and as such notable), this is a novel application of the Q-items, which is why I would like to invite more colleagues to express their view on this. In the previous discussion about this topic, ArthurPSmith estimated that we would need under 100,000 new Q-items. Another possible way to estimate the number of items could be to count the transclusions of Template:trans-top, considering that we might already have items that represent the senses of the nouns.--Micru (talk) 13:41, 8 November 2018 (UTC)
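The maintenance argument above can be checked with some simple counting. The sketch below assumes that "fully linked" means every sense carries one translation (P5972) statement to every other equivalent sense, and that the hub model needs one item for this sense (P5137) statement per sense; the function names are invented for illustration.

```python
def pairwise_statements(n_senses):
    """Directed translation statements when every sense links to every other sense."""
    return n_senses * (n_senses - 1)


def hub_statements(n_senses):
    """Statements when every sense links once to a shared Q-item."""
    return n_senses


def statements_to_add_one_sense(n_existing, model):
    """New statements needed when one more equivalent sense is added."""
    if model == "pairwise":
        # n_existing outgoing links on the new sense plus one incoming
        # link on each existing sense (so n_existing other lexemes are edited).
        return 2 * n_existing
    if model == "hub":
        return 1
    raise ValueError(f"unknown model: {model}")


existing = 20  # "at least 20 other lexemes" in the example above
pairwise_total = pairwise_statements(existing)
hub_total = hub_statements(existing)
pairwise_growth = statements_to_add_one_sense(existing, "pairwise")
hub_growth = statements_to_add_one_sense(existing, "hub")
```

Under these assumptions, 20 mutually linked senses already carry 380 translation statements versus 20 hub statements, and adding a 21st sense costs 40 new statements in the pairwise model but a single one in the hub model.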

@Micru: To me it seems a bit of an oversight to not allow items to share glosses from a centralized item (since many words can be translated exactly), although the situation is somewhat similar for labels and descriptions. I think it would be sensible to use items to indicate translations, although perhaps it would be more suitable to use a new property to indicate items about senses, separate from the items about concepts which the senses describe. Jc86035 (talk) 14:09, 8 November 2018 (UTC)
I think "sense" == "concept", no? I'm not clear on the distinction you're trying to make here anyway. Certainly the current arrangement works well for nouns (for almost all of which we already have Wikidata items representing at least some of the different concepts associated with those nouns). The problem is for verbs, adjectives, adverbs, and other parts of speech, which aren't "entities" in the usual sense, but their conceptual meanings are actions or qualities or modifiers of some sort. ArthurPSmith (talk) 16:28, 8 November 2018 (UTC)
@ArthurPSmith: Somewhere up the page, white (Q23444) is used as an example of an item linked to from both noun lexemes and adjective lexemes. I thought it would be potentially messy if multiple groups of senses would be linked to the same item. Jc86035 (talk) 17:00, 8 November 2018 (UTC)
@Jc86035: I think it's ok. Many nouns are used as adjectives with the same conceptual meaning, but of course a differing syntax and meaning within the sentence. "This white" vs "white table" seem to me little different from "This garden" vs "garden table". I think it would be fine to link both noun and adjective senses to the same Q item, I don't think that would hurt translation. But descriptive adjectives like "important", "long", "deep", etc. don't have that close relation to nouns and I think would need their own items. If we proceed with this approach... ArthurPSmith (talk) 18:33, 8 November 2018 (UTC)
@ArthurPSmith, Jc86035: It all depends what we want to achieve. If we want to be able to support wiktionaries in a way that when a translation is updated it shows up in all wiktionaries, then the most pragmatic approach is to follow their practices and create as many items for senses as needed.--Micru (talk) 20:35, 8 November 2018 (UTC)
Not all words have senses referring to some concept. There is no concept that "but" refers to, its meaning is functional rather than conceptual. —Rua (mew) 18:13, 8 November 2018 (UTC)
@Rua: The conceptual meaning of conjunctions is not something I've really thought about, but maybe one could argue for it? There aren't so many of them, anyway. ArthurPSmith (talk) 18:33, 8 November 2018 (UTC)
For "and" there is logical conjunction (Q191081) I suppose. Probably we'd want to add a conceptual form that wasn't purely focused on logic though. ArthurPSmith (talk) 18:39, 8 November 2018 (UTC)
@Rua: Even if the meaning of a lexeme is functional, it still has meaning which perhaps some day we will be able to capture with statements. On a more practical level, it reduces complexity to link all translations to a central Q-item instead of linking them between them.--Micru (talk) 20:35, 8 November 2018 (UTC)
FWIW, "but" has the same literal meaning as "and", with the additional connotation that the fact that both conjoined statements are true might go counter to expectations (or other similar connotations of contrast). - Jmabel (talk) 21:10, 8 November 2018 (UTC)
I wouldn't worry so much about conjunctions and more functional lexical categories. The real question is whether verb, adjective and adverb senses should be added as Q Items to the Main namespace of Wikidata, since these categories undoubtedly contain conceptual entities, each with many synonyms and translations. Once we have decided on that, perhaps there could then be more discussion about taking it further with more lexical categories. Liamjamesperritt (talk) 01:02, 9 November 2018 (UTC)
Another point to consider: many languages have things like frequentatives, passives and other verb-to-verb derivations that don't change the overall conceptual meaning of the verb, but add a semantic nuance or change how the verb behaves grammatically. For nouns, languages might have diminutives or augmentatives. Handling such variations will need some thought: do we consider them all to have the same sense item, or different ones? The former would allow terms to be more easily found, since finding the basic meaning is enough, but would make translations less specific because potentially many verbs in such a set could translate a single English (or other language) verb. The latter would be more specific, but then you end up with a lot more sense items, which are harder to search through. —Rua (mew) 18:41, 10 November 2018 (UTC)
@Rua: The wiktionaries have faced that problem long ago, and they have found consensus around translation lists that we can import. There is no need to overthink this, we'll do our best in our capacity, and we can discuss individual cases. For now I think it is just enough to follow the steps of what others have done in the past, and create the senses where there is a consensus. For the rest there is no need to create items, the senses in the lexemes might be enough for those.--Micru (talk) 23:13, 11 November 2018 (UTC)
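To put rough numbers on the maintenance argument made at the top of this thread: if every equivalent sense links to every other one with translation (P5972), the number of statements grows quadratically, whereas linking each sense once to a shared Q-item with item for this sense (P5137) grows linearly. The following is only a counting sketch in Python; the function names are made up for illustration and nothing here touches Wikidata itself.

```python
def pairwise_links(n: int) -> int:
    """Directed translation statements when each of n senses links to all the others."""
    return n * (n - 1)


def hub_links(n: int) -> int:
    """Statements when each of n senses links once to a single shared Q-item."""
    return n


def extra_statements_for_new_sense(n_existing: int) -> int:
    """Statements to add when one more sense joins a fully cross-linked group."""
    return pairwise_links(n_existing + 1) - pairwise_links(n_existing)


if __name__ == "__main__":
    for n in (5, 21, 100):
        print(n, pairwise_links(n), hub_links(n))
```

With 20 existing senses, adding a 21st means one new statement on each of the 20 existing lexemes plus 20 on the new one, which matches the "at least 20 other lexemes" estimate; with a shared item it is a single statement on the new sense.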

Instances of term

Is instance of (P31)music term (Q20202269) correct on items? It seems unnecessary and possibly incorrect to me, since it's usually used on items which describe concepts which aren't words (e.g. human voice (Q7390), tonality (Q192822), big band (Q207378)). Jc86035 (talk) 13:57, 8 November 2018 (UTC)

  • In English, at least, a "term" is not necessarily a single word. I can't speak for how the cognates of "term" are used in other languages. - Jmabel (talk) 16:42, 8 November 2018 (UTC)
    @Jmabel: Notwithstanding terms often not being single words, it seems to me that if the item describes a concept rather than the term itself, then it shouldn't be classified as a term (but perhaps the relevant senses of lexemes should be; e.g. tonality (L36186)). Jc86035 (talk) 16:54, 8 November 2018 (UTC)
  • I agree that the "term" should be avoided in the Q-namespace and terms should be modeled in the lexeme namespace. ChristianKl❫ 18:04, 8 November 2018 (UTC)
  • It is permitted for items to have instance of (P31) → term, but only where the term is the actual topic. (See for example, Q18352004.) None of the three examples mentioned above are about terms, they are about the concepts and should not have the statement. --Yair rand (talk) 21:22, 8 November 2018 (UTC)
There are lots of these for legal concepts whose EN wiki articles start off like "xxxx is a legal term for yyyy". They should be fixed. - PKM (talk) 22:41, 9 November 2018 (UTC)

Family / Ancestor Data

I want to create a family history database for my own family and ancestors, and also for others. I believe that I can get access to a data source that holds this information and has not been digitized yet. Please advise whether this would be a good project and what kind of technical challenges could come up. For example, sharing someone's personal information can be objectionable, but I think this will help people connect to their families in their family tree, and after a few decades and centuries this data will be very valuable.

Such data can be very useful for any type of research in the future. Please advise.  – The preceding unsigned comment was added by Dalbirmaan (talk • contribs) at 2018-11-09 05:41 (UTC).

  • Are you saying you want to do this within Wikidata (I'd oppose that) or just in general (lots of such things already exist, such as at ancestry.com). - Jmabel (talk) 06:00, 9 November 2018 (UTC)
  • @Dalbirmaan: Wikidata items have to be based on public sources. In this case this might mean publishing the data source that isn't yet digitized. I'm not sure whether or not Wikisource is open to publishing the kind of document you want to publish, but otherwise you would need to find another space online to publish it.
After that step is done, I think it's great to add the data to Wikidata. Having sourced data on Wikidata that's not available elsewhere is great. ChristianKl❫ 16:00, 12 November 2018 (UTC)
  • @Dalbirmaan: -- you shouldn't reinvent the wheel; there are existing genealogy database formats (most famously/notoriously GEDCOM (Q667761)) and genealogy websites that would be glad to incorporate your data... AnonMoos (talk) 07:55, 13 November 2018 (UTC)
    @AnonMoos: Reinventing the wheel isn't required to add genealogy to Wikidata and there's already some of it inside Wikidata. There are even two tools to display genealogical trees that already exist. We also have Wikidata:WikiProject_Genealogy. The ability to link people to other items like employers is valuable to capture information that won't be displayed in traditional genealogical websites. Apart from that, none of the popular genealogical websites has a free license. ChristianKl❫ 18:47, 14 November 2018 (UTC)

Category or article?

I have a category of pictures on Commons. On Wikidata, should I link it to the category on Wikipedia or to the article itself? --Guérin Nicolas (talk) 07:50, 9 November 2018 (UTC)

  • @Guérin Nicolas: If Wikidata already has an item for the category, then link to that. Otherwise, link to the item connected to wikipedia articles, unless that is already linked to a Commons gallery. Jheald (talk) 11:32, 9 November 2018 (UTC)

Reminder: Community Wishlist Survey is open until November 11th

Hello all,

Just a reminder that the Community Wishlist Survey is open until November 11th. You can add your ideas about Wikidata here. This is a great opportunity to let us know your ideas and priorities in terms of improving the software. Feel free to add your input to other discussions as well.

Cheers, Lea Lacroix (WMDE) (talk) 14:46, 9 November 2018 (UTC)

Railway - platform height, floor height - people with disabilities

How to store

  • platform height of railway station
  • floor height of trains

? Both are important for many people with certain disabilities if they want to enter or leave a train. 95.116.22.185 14:58, 9 November 2018 (UTC)

What do you mean by "store"? I've searched items with these titles but I did not find results. Esteban16 (talk) 23:19, 9 November 2018 (UTC)
I think that person means which properties to use to store that information. I don't think we have any way at the moment. Where could we find the sources of that data? The only property we have for people with disabilities is wheelchair accessibility (P2846).--Micru (talk) 23:48, 9 November 2018 (UTC)
@Micru: I think this would need several new properties (for rolling stock, probably from the parameter list of w:en:Template:Infobox train); and might require the creation of items for individual platform edges, especially for stations where platforms have had their height changed, or where platforms serving a single train type have different heights. Jc86035 (talk) 07:07, 10 November 2018 (UTC)

Presentation of property pages

I strongly suggest changing the presentation of Wikidata pages to make it immediately clear that we are on a "Property" page (Pnnn), whose creation is restricted (and where there should be no other wiki article associated with it), and not on an entity page (Qnnn), whose creation is free (and which can list wiki articles about roughly the same topic as the item described in Wikidata).

There are frequent errors where properties of properties are added/modified/deleted that should instead be done in properties of the associated item.

A "Property" page (Pnnn) should have a distinguishing background color (e.g. light blue instead of white), and its "associated Wikidata item" property should be at top of the list of properties (it should be ranked highest within the set of properties we should give to any well-defined property "Pnnn"), just below the translated labels, then followed by constraints (that should be ranked second within the set of properties we can give to a property).

Ranking properties of defined "Pnnn" properties can be based on a specific Wikidata item "Qxxx" (labelled "Wikidata property") describing how Wikidata properties can be specified (i.e. the list of properties each property must, should, or may have). "Qxxx" should have itself the "nature of item"="property" (qualified with "of"="Wikidata") where "property" is also another Wikidata item (not a property), or "Qxxx" can be a "subclass of"="property" (another Wikidata item whose "nature of item"="entity" or one of the subclasses of "entity" which is more specific)

All the other properties of properties "Pnnn" are just informative (but may be checked) and serve to subclassify the set of all properties (i.e. for handling metaclasses as entities, whose properties refer to the "Pnnn" properties that indicate which "Qnnn" items can use that property and which values can be assigned to them).

Verdy p (talk) 15:20, 9 November 2018 (UTC)

@Verdy p: Distinguishing property pages might be easily done by adding something like this CSS to MediaWiki:Common.css, although I'm not sure whether it would help or whether there would be consensus for it. Jc86035 (talk) 10:14, 11 November 2018 (UTC)
.ns-120 .mw-body {
	background-color: #e8f2ff;
}

This info should also be made visible in the search engine when it gives results we can select from, when entering new declarations, or in the interface presenting the data (notably in properties). Some icons or coloring would also help: we should be able to distinguish properties that are imperative, and must be respected, from those that are indicative. Likewise, we should be able to distinguish constraints that are just "suggestions" (best available choices) from those that are exhaustive (allowing no exception to the constraint without a formal discussion to extend it or add new cases, as an exception can severely break the rest of the inferred semantics).

Presentation of results in Wikidata is the only thing that can help users avoid creating new contradictions (or solving them by adding new redundancies that will be maintained separately and will later introduce contradictions/aberrations in inferences), because Wikidata is extremely permissive and in fact allows storing any declaration in the dataset. It can also make false inferences itself: there are too many "automatic guesses" when entering data, where some entered data is automatically replaced by something else. A famous example, in the "easiest" part of Wikidata: enter a valid unique language code, press TAB to enter the name of an article on a wiki, and the code is immediately replaced by another language whose name in the current user's language contains that code, so "es" may be instantly replaced by a language other than Spanish. The same thing happens everywhere you enter a label name to select an entity: when there are several choices, Wikidata arbitrarily chooses the first one it finds, without asking the user, and without even presenting the choice list clearly enough to allow the user to choose correctly. Verdy p (talk) 11:51, 11 November 2018 (UTC)

Imperial State of Iran

Can someone with the proper expertise disentangle Imperial State of Iran (Q207991)? The country and the Pahlavi dynasty should be two separate items. Thanks. - PKM (talk) 22:38, 9 November 2018 (UTC)

Donating data

A question before starting a process: the National Archival Services of Norway (Q6516420) has an Excel spreadsheet listing all archive actors/creators of the National Archives of Norway, with the name of the creator, ID and URI. The list contains about 27000 actors/creators. The property Archiveportal archive ID (P5888) already exists as an identifier. As an example, the archive of the Norwegian Garrison on South Georgia has Archiveportal archive ID (P5888) no-a1450-01000001366069 and http://live.arkivportalen.no/entity/no-a1450-01000001366069 as URI. Should this data be donated as a whole to Wikidata? Pmt (talk) 23:03, 9 November 2018 (UTC)

@Pmt: I would add it to Mix'n'Match. That way it can easily be linked up. You can do that yourself at https://tools.wmflabs.org/mix-n-match/import.php . For this kind of set we generally don't import just everything, because you'll just end up with a ton of orphaned items. Multichill (talk) 12:12, 10 November 2018 (UTC)
@Multichill: @Jeblad: Of course! Thank you. Seems reasonable. Pmt (talk) 12:18, 10 November 2018 (UTC)
@Multichill, Pmt: I'm not sure Mix'n'Match will work in this case, as the names can diverge quite much from what you expect. It is often not the formal name of the creator, but a short description of the creator. Often the creator has a relation to the archived material, and not an equivalence. Jeblad (talk) 14:32, 10 November 2018 (UTC)

Merge problem

I have been attempting to merge Q1196915 and Q18393795, which are clearly about the same play. No matter what I try, there is always some kind of error. The cause seems to be that Q1196915 is linked to "The Son" in the English Wikipedia, which is incorrect ("The Son" is a redirect to a disambiguation page), but I get an error when trying to remove the link. Q18393795 is linked to the correct page. How can this be resolved? Kevinsam2 (talk) 13:02, 10 November 2018 (UTC)

✓ Done. I removed the enwiki sitelink in Q1196915 first before merging. --Bluemask (talk) 13:11, 10 November 2018 (UTC)
@Bluemask OK, thanks. Did I not have permission to do that for some reason? That's exactly what I tried to do. Kevinsam2 (talk) 13:15, 10 November 2018 (UTC)

Adding coordinates

Elementary question; Commons has the easy Locator-tool in the editor, or I can paste coordinates in DMS or DD format into the Location template (with a little more editing). I don't see a way to do either here, so what are the usual or best ways, and is there a page about this? Jim.henderson (talk) 19:01, 10 November 2018 (UTC)

@Jim.henderson: You can use coordinate location (P625). Pasting coordinates in as they are displayed should work; the software will try to autoformat them (even "11.2222 -33.4444" will work). Jc86035 (talk) 10:02, 11 November 2018 (UTC)
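For anyone who wants to see how the two notations mentioned in the question relate, here is a minimal Python sketch that converts a DMS string to signed decimal degrees. It is not the parser Wikidata uses (the software does its own autoformatting when you paste into coordinate location (P625)); the function name and the accepted input pattern are assumptions chosen for illustration.

```python
import re

# Accepts strings such as 40°26'46"N or 79°58'56.5"W (degrees, minutes, seconds, hemisphere).
_DMS = re.compile(r"""\s*(\d+)°\s*(\d+)['′]\s*(\d+(?:\.\d+)?)["″]\s*([NSEW])\s*""")


def dms_to_decimal(text: str) -> float:
    """Convert one DMS coordinate component to signed decimal degrees."""
    match = _DMS.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised DMS value: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative in decimal-degree notation.
    return -value if hemisphere in "SW" else value


if __name__ == "__main__":
    print(dms_to_decimal("40°26'46\"N"), dms_to_decimal("79°58'56\"W"))
```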

First World war causes

(About this claim)

Am I the only one to think that a formulation like is weird and that would be far better ? Does anyone really understand the rationale of the first formulation ?

Imho we should focus on how to represent the fact that causes of World War I (Q310802) is a composite item, which consists of several events. The fact that it’s a compound item already implies that the causes are multiple, and makes the claim more direct.

author  TomT0m / talk page 12:52, 11 November 2018 (UTC)

single-value constraint with title property

Hello. Some months ago I asked "Why title (P1476) should only contain a single value?" (Wikidata:Project chat/Archive/2018/04#Title). I removed the single-value constraint (maybe I was wrong). Today a user (@Jura1:) undid that. What should we do? Have the single-value constraint or not? Xaris333 (talk) 15:16, 11 November 2018 (UTC)

  • Thanks for the link to the discussion. There are 20 million uses of this property. Do we know how many may have several titles?
In any case, the constraint isn't and wasn't mandatory (having several statements is a potential issue, but isn't an issue as such). --- Jura 17:11, 11 November 2018 (UTC)

Incorrect interpretation of dates in Q1165538 leading to persons still alive getting P570 by Reinheitsgebot invoked by Mix'n'match

For a number of items related to still living people, Reinheitsgebot has added a date of death (P570). It seems various dates in the source Nationalencyklopedin (Q1165538) have been incorrectly used as date of death (P570). I have undone these 19 edits.

Is there anything else that needs to be done in order to prevent these faulty Mix'n'match statements/changes to be reinserted? --Larske (talk) 22:53, 11 November 2018 (UTC)

@Larske: Best is probably to notify @Magnus Manske: about the parsing issue with this code, mentioning examples like this one, so that he can fix it. I'm not very familiar with entries for that encyclopedia or with the Swedish language, but it may be worth swapping the logic to give precedence to dates actually labelled as birth/death rather than unspecified date ranges that may be anything other than a life span.--Nono314 (talk) 18:03, 12 November 2018 (UTC)
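The precedence idea suggested above could look roughly like the following Python sketch. This is not the Mix'n'match code, and the entry format is an assumption: it supposes a short description in which a birth year follows the Swedish word "född" and a death year follows "död", and it deliberately ignores unlabelled year ranges such as "1994–2006". The function and pattern names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed labels: "född" (born) and "död" (died) followed by a four-digit year.
BORN = re.compile(r"\bfödd\s+(\d{4})")
DIED = re.compile(r"\bdöd\s+(\d{4})")


def life_years(description: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (birth year, death year), using only explicitly labelled dates."""
    born = BORN.search(description)
    died = DIED.search(description)
    return (
        int(born.group(1)) if born else None,
        int(died.group(1)) if died else None,
    )


if __name__ == "__main__":
    # The unlabelled range is a term of office, not a life span, so no death year is returned.
    print(life_years("svensk politiker, född 1950, statsråd 1994–2006"))
```

Returning None for a missing death date, rather than falling back to the end of an unlabelled range, is the design choice that would keep living people from receiving date of death (P570).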

Wikidata weekly summary #338

Amanda Vickery: different versions?

When I look at the data item Amanda Vickery (Q4739786) for w:Amanda Vickery on an iPad and on a laptop I see two different occupations in the statements. On the iPad it shows up as Sociologist, but on the laptop it is historian.

I have been known to miss the obvious, so I hope I am not wasting anybody’s time.

Thanks in advance, Ottawahitech (talk) 16:11, 12 November 2018 (UTC)

Someone changed the label for "historian" to "Sociologist". The edit was undone, but pages are cached for a while, so edits to linked entities don't always show up immediately. - Nikki (talk) 10:27, 13 November 2018 (UTC)

Mix-n-Match failures

I'm not sure what's causing this particular failure, but twice now (1 & 2) a Swedish translation of Euripides' play Orestes has been added to Orestes (Q663886).

All translations of works must have their own separate data items, like this: Orestes (Q58487976). They are never placed on the data item for the work as a whole. --EncycloPetey (talk) 19:15, 12 November 2018 (UTC)

It's very frustrating when bots are used to make edits like these, and the bot user says there was no way to know that it was wrong. This kind of edit is never right, so it should be possible in many ways to identify the edit as wrong and not make that edit. --EncycloPetey (talk) 14:51, 14 November 2018 (UTC)

Hebrew speaker needed

New and absent User:Brandy256 added an interwiki link for https://he.wikipedia.org/wiki/פייראנג'לו_גרניאני to Pierangelo Metrangolo (Q20684796). However, the Wikipedia article is about someone who lived 1930-2011; the Wikidata item is about a different person born in 1972. I've removed the link, but a new Wikidata item is needed, and I can't read the Wikipedia article or some of its sources. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:09, 13 November 2018 (UTC)

Link it to Pierangelo Garegnani (Q1746797) -- AnonMoos (talk) 21:09, 13 November 2018 (UTC)

Change coming to how certain templates will appear on the mobile web

CKoerner (WMF) (talk) 19:35, 13 November 2018 (UTC)

Military rank

In Stan Lee (Q181900) his military rank (P410) is given as playwright (Q214917) which reflects the (marvelous) fact that he was a playwright for the United States Army during WWII. However, this was not his rank, it was his military occupational specialty, with many examples such as Gunnery officer (Q55613109) or drill instructor (Q5307556). This is my long-winded way of saying Wikidata needs a military occupational specialty Property (or Item). Anybody more experienced than me know the best way to go about creating and populating this property/item? Abductive (talk) 22:25, 13 November 2018 (UTC)

Maybe use an Employer statement with employer US Army and position held playwright? Or the same on a military branch (P241) statement. Ghouston (talk) 22:31, 13 November 2018 (UTC)
playwright (Q214917) isn't a valid position (Q4164871), either. Well, I did what I could and it's more plausible than a rank. Ghouston (talk) 03:17, 14 November 2018 (UTC)
"Field of work" as a qualifier? Andrew Gray (talk) 22:03, 15 November 2018 (UTC)

Shareen Blair Brysac

This person is showing (in Google) as born in 1900. She has just e-mailed in OTRS Ticket:2018111310011801 to say she is still alive, and not born in 1900! - see https://www.wikidata.org/w/index.php?title=Q7489571&diff=789236222&oldid=735004332. I didn't want to edit it out, in case the bot puts it back in. I cannot see a birth date on the internet, and she has not said, so I'm guessing from her profile, it's around 1950, not the 1900 shown in http://snaccooperative.org/ark:/99166/w6p98t8p. Could someone please fix the data?  – The preceding unsigned comment was added by Ronhjones (talk • contribs).

@Josve05a: Why did you remove the statement instead of marking it as deprecated? The reference given does appear to claim that the date of birth is 1900. - Nikki (talk) 03:37, 14 November 2018 (UTC)
Hmm, not sure... it must have completely slipped my mind. I believe I thought the source was about another person entirely. Feel free to revert me. (t) Josve05a (c) 11:52, 14 November 2018 (UTC)
I added her correct birth date from the BGMI as 15 January 1939. Cheers. --RAN (talk) 17:07, 14 November 2018 (UTC)

Scholarly articles that are book reviews

If a scholarly article is a book review, should the “main subject” be the edition of the book, or the subject of the book, or both? - PKM (talk) 02:32, 14 November 2018 (UTC)

@PKM: In the absence of a single value constraint on P921, I see no harm in using both. Mahir256 (talk) 03:10, 14 November 2018 (UTC)
@PKM: This is a typical problem when mixing different classifications: "scholarly article" is a text format, while "book review" is about the content of a text. WikiProject Books should at some point fix the classification by analyzing in detail the characteristics of a book. Snipre (talk) 07:37, 14 November 2018 (UTC)
@Snipre: While a "scholarly article" as used in WD is clearly more like an "edition" than a work (based on its properties), currently, "scholarly article" is <subclass of> "article", which is <subclass of> "work" (while also being a subclass of "publication"). If you have an idea of a way to separate "publications" from "works", please lay it out so we can discuss - I agree this area is fuzzy. But I would say that a "book review" is a type of article either way. In any case, I would not recommend or support making separate "work" and "edition" items for every scholarly article in WD. That way lies madness. - PKM (talk) 20:47, 14 November 2018 (UTC)

Quantity with range

Currently there are several approaches to enter these:

  • Add quantity as 1 +/- 0.5 (property with quantity datatype, the precision isn't necessarily meant for this).
  • Use two quantity datatype properties: e.g. quantity max = 1.5, quantity min = 0.5 (simple to enter, but could get messy if there are several values, especially if a maximum is associated with a minimum)
  • Use some arbitrary property as the main statement and add two qualifiers to it: one for the maximum, one for the minimum. I think we only have this with item datatype, but I guess it could be with any datatype.

The last approach seems the cleanest. The reference can be attached to the entire statement. I wonder if we should have a datatype that caters directly to this approach. Contrary to all other datatypes, it needn't have a main value. --- Jura 05:24, 14 November 2018 (UTC)

In Aachen (Q1017) I used another approach for elevation above sea level (P2044): two statements, each using the qualifier applies to part (P518) to indicate whether the statement is the maximum or the minimum value. This allows additional qualifiers to be given for each value, in this case the coordinates where that value applies. Ahoerstemeier (talk) 10:02, 14 November 2018 (UTC)
  • There's a big difference between a range, and uncertainty values. Abductive (talk) 10:35, 14 November 2018 (UTC)
  • Using the first option would require some qualifier like uncertainty corresponds to (P2571) to be added with an item describing not an uncertainty but a range. Without this qualifier the uncertainty will be probably interpreted incorrectly. Wostr (talk) 12:22, 14 November 2018 (UTC)
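The concern raised in the last two comments can be made concrete with a small Python sketch. It models a quantity as an amount with a lower and an upper bound, which is roughly how a quantity value with "+/-" is stored; the class and function names are hypothetical and this is not Wikibase code. The point is that a range entered as "1 +/- 0.5" and a measurement of 1 with an uncertainty of 0.5 produce identical stored values, so a consumer cannot tell them apart without an extra qualifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    amount: float
    lower_bound: float
    upper_bound: float


def from_plus_minus(amount: float, delta: float) -> Quantity:
    """A measured value with symmetric uncertainty, e.g. 1 +/- 0.5."""
    return Quantity(amount, amount - delta, amount + delta)


def from_range(minimum: float, maximum: float) -> Quantity:
    """A range squeezed into the same shape: midpoint plus/minus half the width."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return Quantity((minimum + maximum) / 2, minimum, maximum)


if __name__ == "__main__":
    # Both calls yield the same stored value, which is exactly the ambiguity discussed above.
    print(from_range(0.5, 1.5) == from_plus_minus(1.0, 0.5))
```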

Editing statements with the keyboard works better now

For over a year now, editing statements with the keyboard has been partially broken: when you tried to add or remove qualifiers or references by tabbing to the corresponding link and pressing Enter, the whole statement would be saved, making it impossible to create a statement with qualifiers and/or references in a single edit without using the mouse. This bug has finally been fixed: you can hopefully edit more efficiently now :) --Lucas Werkmeister (WMDE) (talk) 23:39, 14 November 2018 (UTC)

File:T154869.webm is a screencast demonstrating the bug and how it’s now fixed, in case you want to see it in action. --Lucas Werkmeister (WMDE) (talk) 23:44, 14 November 2018 (UTC)
Excellent. This should make editing much more efficient. Well done. --Yair rand (talk) 08:01, 15 November 2018 (UTC)
Thank you! This makes a real difference, my mouse will be happy Moebeus (talk) 12:10, 16 November 2018 (UTC)

Does Wikidata recycle redirected items?

I refer to Nicole Wong (Q24213300). User 202.40.137.199's action on 11 June 2016‎ makes me wonder if recycling or reusing is appropriate.--Roy17 (talk) 11:21, 15 November 2018 (UTC)

It's highly inappropriate but also quite untraceable. Sjoerd de Bruin (talk) 11:30, 15 November 2018 (UTC)
"quite untraceable"? I don't follow. "if recycling or reusing is appropriate", reusing or recycling items is a very bad thing to do. ·addshore· talk to me! 14:34, 15 November 2018 (UTC)
I think Sjoerd means that recycling items is not wanted, and that it's also hard to track and find which items may have been recycled. Lea Lacroix (WMDE) (talk) 14:38, 15 November 2018 (UTC)
It should be possible to come up with a list with possible items looking at items that were redirects at one point in history but are no longer. Could be useful! ·addshore· talk to me! 16:20, 15 November 2018 (UTC)
I wondered about this - but wouldn't 99% of it be merges that were undone later on? I suppose you could look for redirects which have completely different labels now to the pre-redirect versions, or something. Andrew Gray (talk) 20:25, 15 November 2018 (UTC)
  • Q24213300 wasn't actually a redirect, but a blank item, as for some reason the merger didn't go through. In any case, please move the statements to a new item and delete this.
    BTW we could attempt to track items that were de-redirected with an edit filter, if that isn't already being done. --- Jura 06:10, 16 November 2018 (UTC)
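The list proposed above (items that were redirects at some point but no longer are), combined with the label comparison suggested to filter out undone merges, could be sketched in Python as follows. The `Revision` structure is a hypothetical simplification of an item's history, not an actual MediaWiki or Wikibase API object, and a real check would need to look at labels in several languages.

```python
from typing import List, NamedTuple, Optional


class Revision(NamedTuple):
    is_redirect: bool
    label: Optional[str]  # label in one chosen language; None if absent


def looks_recycled(history: List[Revision]) -> bool:
    """Flag an item whose history (oldest first) suggests reuse for a new topic."""
    if not history or history[-1].is_redirect:
        return False
    redirect_positions = [i for i, rev in enumerate(history) if rev.is_redirect]
    if not redirect_positions:
        return False
    # Labels the item carried before it first became a redirect.
    labels_before = {rev.label for rev in history[: redirect_positions[0]] if rev.label}
    current = history[-1].label
    # An undone merge usually restores an old label; a recycled item gets a new one.
    return bool(labels_before) and current is not None and current not in labels_before


if __name__ == "__main__":
    undone_merge = [Revision(False, "A"), Revision(True, None), Revision(False, "A")]
    recycled = [Revision(False, "A"), Revision(True, None), Revision(False, "B")]
    print(looks_recycled(undone_merge), looks_recycled(recycled))
```

Note that this heuristic would not have caught Q24213300, since that item was blanked rather than redirected.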

creating item record from WorldCat/OCLC record

Is there currently a way to create an entry for a book by ingesting bibliographic data from WorldCat? Rather than entering several data fields manually, it would be nice to be able to enter an ISBN and to semi-automatically ingest the relevant fields. Is there a way to do that? - Kenirwin (talk) 15:26, 15 November 2018 (UTC)

I've asked about this as well for the importing of Wikisource editions. The answer I've gotten is "no", nothing like that exists and it's not a priority. --EncycloPetey (talk) 05:59, 16 November 2018 (UTC)

Comment for requesting a bot for creating an item for all archive-actors from the norwegian national archives

I want to have a bot create items for the archive creators listed by the Norwegian National Archives. The creators will have the Arkivportalen agent ID (P5887). The creators are persons and governmental institutions. There will be about 27,000 creators. May I have the community's opinion on this matter before requesting a bot?

(the above was written by Pmt (talk) 10:50, 16 November 2018 (UTC) )

Yes, much the same, but I have done further reading and checking on the sources and am therefore asking once again. Pmt (talk) 10:50, 16 November 2018 (UTC)
See Topic:Uokgrg225uvr49eu. Multichill (talk) 12:44, 16 November 2018 (UTC)

Q58689252

Please delete Q58689252, it was just used to spam Wikidata and redirect the url for Google Knowledge Graph. --RAN (talk) 20:00, 15 November 2018 (UTC)

✓ Done by Ymblanter. Mahir256 (talk) 20:50, 15 November 2018 (UTC)

Preparing a data import for companies from the first half of the 20th century

After checking for existing items with Mix-n-match, I'm preparing a data import of companies from 20th century press archives (Q36948990). It would be great if you could have a look at the results of the input script, which is creating input for QuickStatements. Example items are Ruberoidwerke (Q58712849) or Banque d'Anvers (Q58718259). A few questions on which feedback would be particularly welcome:

  • In the archive, there is only one name for the company, normally the official name, derived from German, French, English or wherever the company stems from. Currently, I insert that (same) name as German and English label. I wonder if I should do so for more languages in Latin script. Does that make sense, even when parts of the company name could be and perhaps commonly are translated in other languages?
  • Currently, I uniformly insert a German and English description ("Unternehmen"/"business"), which might be quite helpful in some cases (particularly if a company is named after the founder), but on the other hand may prevent the later automated insertion of some more precise description. Thoughts?
  • I generate a quite excessive reference for all statements (besides the PM20 folder ID (P4293), where it would be circular). Too much?

Cheers, Jneubert (talk) 21:33, 15 November 2018 (UTC)

Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev User:Pintoch Parikan User:Cardinha00 User:zuphilip MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30

Notified participants of WikiProject Companies


  • nl label for Banque d'Anvers (Q58718259) seems to be "Bank van Antwerpen". If you can determine what sector a business is in, that would be helpful. How many should this import? --- Jura 04:22, 16 November 2018 (UTC)
@Jura1: Your example nails the point - "Bank van Antwerpen" probably is the official name in nl, "Banque d'Anvers" would be plainly wrong in that case. So this is about trade-offs - adding a label which is generated wrong in some cases, or better no label for now?
The business sector is known (controlled vocabulary), but still has to be mapped to WD items.
For now, about 20, which I plan to use in a demo in a Wikidata workshop at SWIB. Potentially thousands - more info about the dataset here. The big bottleneck is checking against existing companies, which is much harder than with persons.
-- Jneubert (talk) 07:25, 16 November 2018 (UTC)
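
For illustration only, the labelling approach described in this thread could be sketched roughly as follows. This is not the actual input script: the function name `quickstatements_for_company`, its parameters and the folder ID value `co/000000` are hypothetical. It assumes the tab-separated QuickStatements V1 command syntax (`CREATE`, `LAST`, `Lxx` for labels, `Dxx` for descriptions) and omits the references discussed above.

```python
def quickstatements_for_company(name, folder_id, label_langs=("de", "en")):
    """Build QuickStatements V1 commands (tab-separated) for one new company item.

    Hypothetical helper: the same archive name is used as the label in every
    language listed in label_langs, as described in the thread.
    """
    # Uniform descriptions mentioned in the thread.
    descriptions = {"de": "Unternehmen", "en": "business"}

    lines = ["CREATE"]
    for lang in label_langs:
        lines.append(f'LAST\tL{lang}\t"{name}"')
    for lang, text in descriptions.items():
        lines.append(f'LAST\tD{lang}\t"{text}"')
    # PM20 folder ID (P4293); no reference is attached, since it would be circular.
    lines.append(f'LAST\tP4293\t"{folder_id}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # "co/000000" is a placeholder, not a real folder ID.
    print(quickstatements_for_company("Ruberoidwerke", "co/000000"))
```

Restricting `label_langs` to languages where the archive name is known to be correct is one way to avoid cases like "Banque d'Anvers" being inserted as the nl label.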

P460 Said to be the same as

Property:P460 is "said to be the same as". Wouldn't it be less confusing if the primary wording were "similar to"? I took a quick look at a tranche of 50, and in almost every case I would say that the concept is similar to the other concept. Currently it reads: "said to be the same as": "this item is said to be the same as that item, but the statement is disputed", which reads to me as "disputed synonym". Look at 50 yourself and see if "similar to" best describes the relationship between the two. --RAN (talk) 23:11, 15 November 2018 (UTC)

"Similar to" won't always work. This property is also used in cases where the two items are not similar at all, yet are confused with each other because of labelling. There are also situations where the two items are sometimes considered the same as each other, but sometimes not. --EncycloPetey (talk) 05:57, 16 November 2018 (UTC)
@EncycloPetey: The case you described is what different from (P1889) was created for. said to be the same as (P460) should not be used for this. --MB-one (talk) 13:59, 16 November 2018 (UTC)
A good example is . In some senses they are the same stock character, but the latter is specific to Romanian culture. - Jmabel (talk) 06:33, 16 November 2018 (UTC)

Gutenberg books not all linked

Through a number of manual searches, I've discovered a number of books that don't include links to their Project Gutenberg editions. (Some of the original Project Gutenberg books are not linked -- it's not just new ones.) Is anyone working on importing these? I noticed in adding a few by hand that there's supposed to be one per entry; should some of these have their own edition entry made? I notice there are actually quite a few that violate this that don't seem to be in the list of violations (or the list of exceptions). Even if I did have the motivation to work on an importer, there are a lot of issues I don't currently know how to resolve... --Ssd (talk) 05:35, 16 November 2018 (UTC)

Most (all?) of the Gutenberg editions are editions, and do need a separate data item created for them. They should never be added to the item about a literary work, but should be treated as editions. This is the subject of the thread "Mix-n-Match failures" further up this talk page. --EncycloPetey (talk) 05:55, 16 November 2018 (UTC)

Same brand used by different entities; brand hierarchies

Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev User:Pintoch Parikan User:Cardinha00 User:zuphilip MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30

Notified participants of WikiProject Companies

If the same brand (Q431289) is used by different entities (over time, or even at the same time), should we create items for each user or just keep one item? An example: Mercedes-Benz (Q36008) has historically been used by Daimler-Benz (Q541477) and is currently used by different divisions (division (Q334453)) of Daimler AG (Q27530). Thus we have the specific items Mercedes-Benz cars (Q1921162), Mercedes-Benz (Q55383469), Mercedes-Benz buses (Q1427345) and Mercedes-Benz Trucks (Q24928813). It gets even more complicated when the same brand is used by different independent legal entities (e.g. Renault Trucks (Q840045) and Renault (Q6686)).

Maybe this is very automotive specific, but nevertheless we should agree on a unified approach.

--MB-one (talk) 13:56, 16 November 2018 (UTC)

Anything special need done?

I am not a novice editor, I am not looking for a "how to". But I have never edited this situation, so I am wondering where I begin for this situation.

https://www.wikidata.org/wiki/Q58497098

I am knowledgeable about the author, not so much about the paper. Do I create a new item for the author and then populate it? Or do I populate this item? Or, since this is a temporary item (according to the editing history), is there something else that needs done entirely? 2601:983:8100:CE48:ED22:488C:3B45:EE1A