Help talk:Dates

BC section

Just curious .. do we have any references or actually items that contradict the section? --
--- Jura 11:07, 10 August 2016 (UTC)

(edit conflict) I dispute the contents of the "Years BC" section. All the information I can find about the RDF output and the JSON output suggests they should produce the same output for the same item, except that if the item is in the Julian calendar, RDF output will convert it to Gregorian and JSON won't. If we take any BC date that isn't in January or December, they produce years that are different by one. The present advice presumes that the user interface is correct, and therefore the JSON must be wrong. But I've read elsewhere that developers regard the JSON as canonical. I don't think this section should exist until we obtain an ironclad answer about whether the user interface, the JSON, or the RDF is correct.

See further discussion at WD:Project chat#458 BC. Jc3s5h (talk) 11:12, 10 August 2016 (UTC)

There are three statements in the section. There is a reference for the 2nd, so I take it you agree with that. It also applies to the 3rd. If you enter a date as 457 BCE, it won't work for these. JSON isn't mentioned.
--- Jura 11:18, 10 August 2016 (UTC)

(edit conflict) Jura1, there are a bunch of Phabricator tickets on this topic, and for all the ones I've found, the discussion is leaning toward recognizing the year zero, which makes the user interface wrong. But none of these tickets have been finally resolved, so I can't find anything definitive. Jc3s5h (talk) 11:20, 10 August 2016 (UTC)

Answering Jura1's post of 10 August 2016 (UTC), I'd agree the RDF seems consistent with the user interface. But if the user interface is wrong, then it was an error to write the RDF converter to agree with it. Jc3s5h (talk) 11:24, 10 August 2016 (UTC)

There is https://phabricator.wikimedia.org/T94539 for RDF.
What would be your advice to users who want to add BC years? As a sample, take a user who just wants to enter a date correctly.
--- Jura 11:26, 10 August 2016 (UTC)
Unfortunately, the ticket you mention claims to have been resolved, but there is no explanation of how it was resolved, unless you read the code, which is next to impossible unless you understand the whole system. All I can say is if this patch has been installed on the live Wikidata system, everything I complained about is still true. There is one spot in the code comments that claims that RDF can do date arithmetic accurately according to XSD 1.1 (recognizing year zero). But the user interface does not store "1 BCE" as year zero, so an obvious conflict exists. Jc3s5h (talk) 11:45, 10 August 2016 (UTC)
I don't see how this helps users entering dates. The question is rather basic: Do we type "458 BC" or "457 BCE"? Maybe we should ask a person who recently added some.
--- Jura 11:53, 10 August 2016 (UTC)
My view is that it's so broken, we shouldn't enter any dates, or read any dates, before AD 1, until this is fixed (unless the precision is a decade or worse). Jc3s5h (talk) 13:34, 10 August 2016 (UTC)
I have changed my view based on this edit to the JSON data model, made today. I believe the different formats of the JSON and RDF are adequately explained. Users should be entering dates into the user interface using the CE/BCE notation or the AD/BC notation, just as it is found in the source. The ability to input a date in the form "0000-01-01" should not be exercised, and any dates displaying the year 0 in the user interface are invalid. Ideally the user would look up the correct date, edit the date, and add the source. Jc3s5h (talk) 20:00, 11 August 2016 (UTC)
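The off-by-one discussed in this thread can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the comments here: the JSON form has no year zero (1 BCE is written as -0001), while the RDF form follows XSD 1.1 and uses year 0 for 1 BCE. The function names are hypothetical and are not part of any Wikibase API.

```python
def bce_to_json_year(bce_year: int) -> int:
    """Numbering without a year zero: 1 BCE -> -1, 458 BCE -> -458."""
    if bce_year < 1:
        raise ValueError("BCE years start at 1")
    return -bce_year


def bce_to_xsd11_year(bce_year: int) -> int:
    """Numbering with a year zero (assumed for RDF): 1 BCE -> 0, 458 BCE -> -457."""
    if bce_year < 1:
        raise ValueError("BCE years start at 1")
    return 1 - bce_year


def json_year_to_xsd11_year(json_year: int) -> int:
    """Shift negative years by one; positive years are unchanged."""
    if json_year == 0:
        raise ValueError("year 0 is undefined in the no-year-zero numbering")
    return json_year if json_year > 0 else json_year + 1


if __name__ == "__main__":
    # The same historical year, 458 BCE, in both numberings.
    print(bce_to_json_year(458), bce_to_xsd11_year(458))
```

Under these assumptions, "458 BC" entered in the interface and "-457" seen in XSD 1.1 style output would refer to the same year, which is the difference of one noted at the top of the thread.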

Days are in Universal Time

I have undone this edit because it removes a statement indicating the time zone that applies to a date with precision day.

The relevant data model pages make it clear that date/time representations use Universal Time. The most precise time that can be stored for the time being is 1 day. One could argue that when dealing with precisions of a month, year, decade, etc., the distinction between Universal Time and some other time zone is rarely important. But since the offset between Universal Time and a particular time zone can be as much as 14 hours, this distinction cannot be treated as negligible.

An example where this comes into play is Hale Telescope (Q2471197) property service entry (P729), 27 January 1949, precision day. The cited sources allow us to determine that the actual date and time of service entry was 10:06 pm Pacific Standard Time, January 26, 1949. When this time is converted from Pacific Standard to Universal Time the day of the month changes from 26 to 27.
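The day shift in the Hale Telescope example can be checked with Python's standard library. This is a minimal sketch: the fixed offset of 8 hours behind Universal Time for Pacific Standard Time and the helper name `utc_date_of` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Standard Time modelled as a fixed offset of UTC-8.
PST = timezone(timedelta(hours=-8), "PST")


def utc_date_of(local: datetime):
    """Return the calendar date in Universal Time for an aware local datetime."""
    return local.astimezone(timezone.utc).date()


# 10:06 pm Pacific Standard Time, January 26, 1949
service_entry = datetime(1949, 1, 26, 22, 6, tzinfo=PST)

if __name__ == "__main__":
    print(utc_date_of(service_entry))  # 1949-01-27
```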

Let me point out a further use case: ever since the wide availability of mechanical clocks and watches, medical facilities have made a reasonable effort to identify the correct date and time of birth in the local time of the place where the birth or death occurred. The sources available to Wikidata editors will usually omit the time of day, but the dates will be based on the local time dates given in birth and death certificates (even if the path from the certificate to the source is indirect, such as doctor --> certificate --> mother --> son --> reporter interviewing son, who has become important --> book author who read interview).

I have proposed we fix this by regarding dates as being in local time, with the time zone being determined by context, at mediawikiwiki:Talk:Wikibase/Indexing/RDF Dump Format#Time revision and Phabricator task T146499. These proposals have not been acted upon.

For the time being, "Help:Dates" should reflect the status quo of the Wikidata documentation, that all dates and times are Universal Time, and should make this widely-violated statement stand out like a sore thumb so everyone will know that the majority of the dates in Wikidata are wrong. If and when changes are made to the documentation to specify that dates are in local time, "Help:Dates" can be revised to reflect the change. Jc3s5h (talk) 14:26, 30 June 2017 (UTC)

  • So you are trying to say that, e.g., the Japanese should use the day before because Greenwich hasn't had midnight yet ?
    --- Jura 17:56, 30 June 2017 (UTC)
To elaborate on what I said above, I believe the bedrock is what is stored in the database. The general meaning of the data in the database is given by mediawikiwiki:Wikibase/DataModel, but that contains ambiguous generalities, so actionable information is contained in mediawikiwiki:Wikibase/DataModel/JSON and mediawikiwiki:Wikibase/Indexing/RDF Dump Format. The "RDF Dump Format" specifically says the "xsd:dateTime dates follow XSD 1.1 standard", which in turn states that dateTimes are optionally marked with a time zone. However, the Wikidata dateTime model does not allow one to mark the time zone as absent, unspecified, or the like; it's always there, and it's always 0.
The JSON data model document does not have as clear a statement about time zones. It says the hour, minute, second, and time zone are currently unused and must be set to zero. Looking at the first version, times and time zones were described as supported and no provision was mentioned for indicating they were unspecified or absent. My view is that given that history, a huge announcement with bright loud fireworks is required to demonstrate that time zones have been abandoned and we are now treating dateTimes as local times. In the absence of such a fireworks display, I interpret the document as meaning that since the time zone is always 0, the dates are always in Universal Time.
The most common situation is that a source gives a date of an event, such as a birth, but no time of day. If, for example, a source says an event occurred in the eastern time zone of the United States on June 28, 2017, the correct representation would be 2017-06-28T00:00:00Z, precision 10 (months), with qualifiers earliest date 2017-06-28T00:00:00Z (precision 11, days) and latest date 2017-06-29T00:00:00Z (also precision 11, days). Obviously that is unpleasant, and it would be much more pleasant to abandon all hope of ever representing times or time zones with this datatype and declare that all dateTimes are local times.
Finally, I would not attach any particular importance to the mapping from user interface input to what is stored in the database, or what is stored in the database to what is displayed in the interface, since the user interface is only one of several interfaces, and has a long history of errors. Jc3s5h (talk) 18:45, 30 June 2017 (UTC)
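The earliest/latest dates in the eastern United States example above can be derived mechanically. The sketch below assumes a fixed offset of UTC-4 (daylight time in June) and uses a hypothetical helper, `utc_dates_for_local_day`, to return the first and last Universal Time dates that a single local calendar day can touch when the time of day is unknown.

```python
from datetime import date, datetime, time, timedelta, timezone


def utc_dates_for_local_day(local_day: date, utc_offset_hours: float):
    """Return (earliest, latest) UTC dates overlapped by one local calendar day."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start = datetime.combine(local_day, time.min, tzinfo=tz)
    # Last representable instant of the same local day.
    end = start + timedelta(days=1) - timedelta(microseconds=1)
    return (
        start.astimezone(timezone.utc).date(),
        end.astimezone(timezone.utc).date(),
    )


if __name__ == "__main__":
    # Assumption: UTC-4 for the eastern United States on June 28, 2017.
    earliest, latest = utc_dates_for_local_day(date(2017, 6, 28), -4)
    print(earliest, latest)
```

For any offset other than zero the two dates differ, which is why a day-precision value read as Universal Time cannot pin down a local date given without a time of day.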
  • Could you just answer my question with yes or no and point me to a place where this is explicitly agreed or specified?
    --- Jura 07:05, 1 July 2017 (UTC)
Your question was "So you are trying to say that, e.g., the Japanese should use the day before because Greenwich hasn't had midnight yet ?" Your question is not specific enough. Do you mean, how should a reader in Japan interpret the birth date contained in Wikidata for Hirohito (Q34479)? Do you mean if a Japanese editor reads in a Japanese book that a person was born in Tokyo on January 1, 2000, at 7 AM, should the Japanese editor enter the birth date for that person in Wikidata as January 1, 2000, or December 31, 1999?
I would answer the first interpretation of your question that if the reader believed the documentation was being faithfully followed, the birth date and time of Hirohito should be interpreted as between 9 AM April 29, 1901, and 9 AM April 30, 1901, local time.
I would answer the second interpretation of your question that the Japanese editor should enter the birth date as December 31, 1999.
The JSON and RDF documents mentioned previously are the clearest documents I'm aware of, but I admit they are not as clear as I would like. Jc3s5h (talk) 12:29, 1 July 2017 (UTC)
Maybe Q4565560 can help answer it. Should it be 13 January 1945 or 12 January 1945 ? (for point in time (P585) as value with precision 11).
--- Jura 13:43, 1 July 2017 (UTC)
The source provided is the English Wikipedia, which states the date and time were 03:38 AM on January 13, 1945, JST, which is 9 hours ahead of Universal Time year-round. Therefore the time would be converted to 6:38 PM January 12, 1945 UT and should be entered into Wikidata as January 12, 1945 (originally typed as 13, later struck), precision 11. Jc3s5h (talk) 14:06, 1 July 2017 (UTC) modified 09:25 2 July 2017 UT
13? Ok, at least we agree on that. I'm not sure if the addition to Help:Dates really makes this clear.
--- Jura 04:16, 2 July 2017 (UTC)
Sorry, I typed wrong. The date of the month should be 12 because that is the day in Universal Time. Jc3s5h (talk) 13:26, 2 July 2017 (UTC)
You seem to feel the right day of the month is 13. If so, can you find an authoritative statement that days are given in local time? Jc3s5h (talk) 13:32, 2 July 2017 (UTC)
If nothing is specified, people assume it's the local date, not the date somewhere else. IDL partially serves to specify that. Specifically at Wikidata, mw:Wikibase/DataModel/JSON#time mentions that timezone is unused and precision 11 excludes the time part of the time/date-field.
--- Jura 15:03, 2 July 2017 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────
I don't know what "IDL" means so I can't address that point.

One concern I have is your phrase "people assume". I feel there are two classes of people, those who directly add data from sources by looking it up in a book and adding it with the user interface, who I'll call editors. Then there are programmers, who either develop the Wikidata system, or add vast amounts of data by creating bots. I agree that editors will assume the times are local times, but I'm concerned that the programmers will write the Wikidata system, and tools that send or receive vast amounts of data to or from Wikidata, according to the specs, without ever ploughing through the existing data and finding out what editors are doing in these edge cases.

I feel my impression is supported by the general tenor of the tasks listed in the "Bugs related to time datatype (tracking)" Phabricator task, which generally take the attitude that the lack of support for hours, minutes, seconds, and time zone is a temporary state of affairs. The task "Allow time more precise than day", and the fact that it is open rather than rejected, supports this view.

Although time zone is described as unused, it is also stated that it must be 0 and there is no method available to say that the time zone is unspecified. Also, the RDF Dump Format follows the XSD 1.1 standard (unless the date is Julian and can't be converted to Gregorian). In that document, we find that the time zone may be absent, but if it is present, it should follow this lexical mapping fragment:

timezoneFrag ::= 'Z' | ('+' | '-') (('0' digit | '1' [0-3]) ':' minuteFrag | '14:00')

This indicates that "Z" is a legitimate time zone representation.

If we look at the RDF for the Hale Telescope (Q2471197) example and use the browser search facility to search for 1949-01-27, we discover the service entry date is represented as "1949-01-27T00:00:00Z". Thus, the RDF is explicitly declaring the time zone as Universal Time, when it could have been absent if the intent was an unspecified local time that the reader was expected to figure out from context. Jc3s5h (talk) 15:51, 2 July 2017 (UTC)
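The timezoneFrag production quoted above translates directly into a regular expression. This is a minimal sketch rather than a full XSD 1.1 dateTime validator; it assumes that minuteFrag means a digit from 0 to 5 followed by any digit, as in the XSD 1.1 grammar, and the function name is hypothetical.

```python
import re

# timezoneFrag ::= 'Z' | ('+' | '-') (('0' digit | '1' [0-3]) ':' minuteFrag | '14:00')
# Assumption: minuteFrag ::= [0-5] digit
TIMEZONE_FRAG = re.compile(r"Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00)")


def is_valid_timezone_frag(text: str) -> bool:
    """True if the whole string matches the timezoneFrag production."""
    return TIMEZONE_FRAG.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["Z", "+09:00", "-08:00", "+14:00", "+14:30", ""]:
        print(repr(candidate), is_valid_timezone_frag(candidate))
```

An empty string does not match, which mirrors the distinction drawn above: an absent time zone is a different thing from an explicit "Z".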

Looks like we don't agree on this, so let's remove the text from Help.
--- Jura 16:14, 2 July 2017 (UTC)
No. I want to apply pressure on this issue to get somebody to resolve it, so if you remove the text, I will pursue every dispute resolution avenue I can find.
If you want, you could add a passage indicating there is a dispute about whether dates are Universal Time or local time, and therefore every date with day precision in Wikidata is suspect. Jc3s5h (talk)
You could try a Request for Comment. It seems a fairly simple question to sort out. Maybe a more positive event than Q4565560 could serve as sample.
--- Jura 16:49, 2 July 2017 (UTC)
By "more positive", I take it you mean a happy event rather than a calamity. How about 2016 World Series (Q24906838)? The item indicates the end time was 2 November 2016. The final game was held in Cleveland beginning at 8 PM. Eastern daylight time was in effect, so the corresponding UT time and date was 0 hours, 0 minutes, 3 November 2016.
The game was notable because the Chicago Cubs had not won the World Series since 1908. I tried to think of a less country specific happy event, but the event would have to happen in the morning in Asia, and events scheduled for the morning tend to be political. Happy evening or late afternoon events are harder to find because the west coasts of Canada and the US are the western-most heavily-populated area before you get to the International Date Line. (I hope I didn't miss any heavily-populated islands.) Jc3s5h (talk) 18:11, 2 July 2017 (UTC)
Sounds good, though I'm not sure if times at midnight are ideal. Maybe Q30637192? We could also ask User:Daniel_Kinzler_(WMDE) if the "Z" and "timezone=0" in the dateformat are meant to be meaningful. He had revised mw:Wikibase/DataModel/JSON#time last year.
--- Jura 04:56, 3 July 2017 (UTC)

How to get raw date/time value

This will return the publication date (P577) from A higher level classification of all living organisms (Q19858624) as '29 April 2015':

{{#property:P577|from=Q19858624}} → 29 April 2015

But, it is filtered through the interface language settings. If I change the interface language settings to something else, perhaps русский, then the date renders as: '29 апреля 2015'.

Is there a way to get the raw, unfiltered timestamp string (+2015-04-29T00:00:00Z)?

Trappist the monk (talk) 13:43, 23 March 2018 (UTC)

I don't know, but I would suggest getting the precision too, because some parts of the raw string may not be meaningful if the precision is set looser than "day". Jc3s5h (talk) 18:07, 23 March 2018 (UTC)
Dates can be very complicated if you add calendars, BC dates, and qualifiers. I wrote c:Module:Wikidata date which relies on c:Module:Complex date and few other modules to deal with all of those. In that module you will find timestamp function. Actually the module is also present on Wikidata so you can call {{#invoke:Wikidata date|timestamp|item=Q19858624|property=P577}} which will give you "+2015-04-29T00:00:00Z/11" timestamp string and precision, in the format used by Help:QuickStatements. --Jarekt (talk) 18:21, 23 March 2018 (UTC)
Thank you, that should work.
Trappist the monk (talk) 09:19, 24 March 2018 (UTC)
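For anyone consuming such a string outside the wiki, here is a minimal Python sketch that splits a value of the form "+2015-04-29T00:00:00Z/11" into its parts and keeps only the fields that are meaningful at the stated precision, as suggested above. The names `TimeValue`, `parse_time_value` and `significant_part` are hypothetical; the precision codes used (11 for day, 10 for month) are the ones mentioned elsewhere on this page.

```python
import re
from typing import NamedTuple


class TimeValue(NamedTuple):
    year: int
    month: int
    day: int
    precision: int


_PATTERN = re.compile(r"([+-]\d+)-(\d{2})-(\d{2})T\d{2}:\d{2}:\d{2}Z/(\d+)")


def parse_time_value(text: str) -> TimeValue:
    """Parse a 'timestamp/precision' string such as '+2015-04-29T00:00:00Z/11'."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    year, month, day, precision = (int(group) for group in match.groups())
    return TimeValue(year, month, day, precision)


def significant_part(value: TimeValue) -> tuple:
    """Drop the fields that carry no meaning at the given precision."""
    if value.precision >= 11:   # day
        return (value.year, value.month, value.day)
    if value.precision == 10:   # month
        return (value.year, value.month)
    # Year or coarser: only the year field is kept; for decades, centuries
    # and so on, even some of its trailing digits are not meaningful.
    return (value.year,)


if __name__ == "__main__":
    print(significant_part(parse_time_value("+2015-04-29T00:00:00Z/11")))
```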

Century?

@VorontsovIE: thanks for trying to help improve this documentation. We seem to have some inconsistencies. In this edit +1900-00-00T00:00:00Z with Precision 100 years renders to 19. century, in this edit 1800 renders to 19th century. Multiple systems probably get mixed up. As the English Wikipedia says: 1800 is the last year of the 18th century, and the 1st year of the 1800s decade. Multichill (talk) 14:19, 7 April 2018 (UTC)

Thank you for cleaning up my edit formatting. Unfortunately there are different conventions about what century is. Wikidata implemented one of these conventions but it's inconsistent through different precisions. Actually, any year between 1801 and 1900 will be rendered to 19th century. I am not sure that we should stick to wikidata default which reports the last year of a century instead of the first year of the century which is more human-readable. BTW, millennium precision had the same problem and Complex date template also reported text inconsistent with wikidata. -- VorontsovIE (talk) 15:53, 7 April 2018 (UTC)
Added a related discussion to a talk page of Module:Complex_date. -- VorontsovIE (talk) 16:20, 7 April 2018 (UTC)
I have reverted the recent changes because the documentation MUST reflect the models which are referenced in the help page. That is what code is, or should be, written to.
It doesn't matter what correct English usage is. If the user interface accepts ways of writing dates, or outputs written forms of dates, that conflicts with the model and the fundamental coding, the user interface is wrong and must be rewritten.
Bear in mind that the date model is derived from ISO 8601 and XSD specifications (which were also based on ISO 8601). Unfortunately these specs didn't handle year 0 until the most recent revisions, which has contaminated the computer industry through and through.
Leaving that aside, in ISO 8601, you indicate precision by omitting digits. Want to indicate a sort-of-century, say, the one that is currently passing? You write "20". It means the years 2000 through 2099. You ask, is that called the 20th or 21st century? ISO 8601 has nothing to say about that. It makes no comment about names of centuries that include the letters "st", "nd" "rd", or "th" after the number. It makes no mention of the span 2001 through 2100. We should not pretend in this help page that spans like 2001 through 2100 have any meaning to the Wikidata model, because they don't. Jc3s5h (talk) 18:32, 7 April 2018 (UTC)
@Multichill:, I will discuss the first post you made in this thread in detail.
First, you mention an edit to the sandbox in which the +1900-00-00T00:00:00Z with Precision 100 years renders to "19. century". The fact that it renders to "19. century" should be regarded as a fault of the user interface. This fault cannot justify editing the part of the help page that describes how Wikidata models dates.
Then you mention an edit in which 1800 renders to 19th century. The code in the edit that deals with rendering a date is {{#invoke:Complex date|complex_date|date=1800|precision=century}}. Sorry, but this page is not about #invoke nor is it about "Complex date". Making mention of that edit requires readers of this thread to spend hours learning about topics that seemingly have no bearing on this page. I feel justified in just ignoring #invoke and "Complex date".
Please tell me where the English Wikipedia says "1800 is the last year of the 18th century, and the 1st year of the 1800s decade." There is no consensus for that. The lack of consensus is discussed at w:Century. Jc3s5h (talk) 18:55, 7 April 2018 (UTC)
I seem to have stumbled onto a battleground. I really don't care enough what it is as long it's consistent. I'm just going to leave now and wait until we have a clear winner. The English article about 1800 is at en:1800. Multichill (talk) 19:23, 7 April 2018 (UTC)
I sympathize. I personally have given up editing birth and death dates precise to the exact day, unless the person lives/lived near 0° longitude, because the data model can't get time zones right. As for English Wikipedia year articles like "1800", they're so bad, and there are so many of them, that I just give up on all of them. Jc3s5h (talk) 19:38, 7 April 2018 (UTC)
@Jc3s5h: I think your approach is wrong. Lots of users already fill dates using that buggy model. When they specify 19th century, wikidata stores this as the year 1900. If you change the UI in such a way that 1900/century will be rendered as 20th century, then lots of data will be treated wrong. Those edits were written to reflect reality, not your thoughts about what the model should be. Also I don't understand why you suppose that the wikidata model is ISO-8601 compatible. It's partially based on it but doesn't coincide. I find your reversion of edits counterproductive as you force users to read a manual incompatible with the current model/interface and to make a choice each time: whether to adhere to the manual (which doesn't have official status, btw) or to rely on how imported dates are rendered. -- VorontsovIE (talk) 13:23, 8 April 2018 (UTC)
I suggest VorontsovIE begins by reading the help page table of contents. It contains a heading "Model" and a heading "Interface". I maintain that the "Model" section does a reasonable job of describing the interface, in accord with the documents mw:Wikibase/DataModel/JSON#time and RDF. The foibles of the interactive user interface should be described in the "Interface" section, not the "Model" section.
For support of my contention that the model is based on ISO 8601, often by way of XSD 1.0 and XSD 1.1, read the Phabricator tickets on the subject, which are listed in the tracking ticket phab:T87764. (Warning: the list of tickets fills 6 screens; you have quite a bit of reading to do.)
As for what users entered and what they expected, that's a problem that has existed with computers and centuries since the beginning of computers, and no solution is in sight, and is exacerbated by Wikidata being a multi-language project. There must be one way to model a century. Using the terminology in "w:Interval (mathematics)", we can model the current century as either [2000–2099] or [2001–2100] but not both. The English phrase "21st century" could mean either. In various languages, there may not even be popular phrases for one or the other of these intervals. A user interface that uses the words "decade", "century", or "millennium" is certain to confuse some people. Jc3s5h (talk) 21:34, 8 April 2018 (UTC)
VorontsovIE, you wrote above "When they specify 19th century, wikidata stores this as 1900 year." If you mean the user tried to enter a date in the interactive user interface by writing the characters "19th century", this is not true. When you try that, an error message appears: "The time value is malformed." Jc3s5h (talk) 21:43, 8 April 2018 (UTC)
Jc3s5h, does it actually matter for the problem understanding that the exact phrase user should write is "19. century" instead of "19th century"? The problem is that you just can't fix model and not to change meaning of lots of dates which are already in wikidata. It's not possible to change an engine of a moving car, so it's better to describe an engine in use but not an engine which should be in use (but probably never will be). -- VorontsovIE (talk) 22:44, 8 April 2018 (UTC)
It is wrong to only pay attention to how dates are (incorrectly) presented in the interactive user interface. There are other methods to retrieve data, such as Wikidata Query Service. There are other ways to put data into the database, such as QuickStatements. If you read the user manual for Wikidata Query Service, it explains the dates follow RDF 1.1. If you read the documentation for QuickStatements it leads you to List of all data types available which states the time data type is "represented as a timestamp resembling ISO 8601".
So if you extract a timestamp of +1700-01-01T00:00:00Z with a precision of 7, you can't just assume the editor who entered it read in a book that so-and-so was born in the 17th century, decided that typing "17.century" into the user interface was a good way to put that in Wikidata, and it means so-and-so was born in the 17th century. It could equally have been entered by a bot that was able to infer from a database that so-and-so was born in the 1700s (which is the 18th century, except for arguments about the first and last year). So the bot author programmed the bot to follow the documentation and entered as a timestamp of +1700-01-01T00:00:00Z with a precision of 7. Since most data is entered by bots rather than the interactive user interface, the latter guess is probably better. Jc3s5h (talk) 23:59, 8 April 2018 (UTC)
If someone wrote 1700 with precision 7 he/she mentioned not a 1700 year but some century - either 17th or 18th. And normally before bot is run, it is tested to be sure that statements are displayed correct. If someone imports 1700/century and gets unintended century in wikidata user interface, he will rather fix bot than insist that his approach is right - so he doesn't care whether user see right century or not. For example when I prepared a bunch of data for QuickStatements, I prefer to set 1601/1701 to store 17/18 centuries - it will be correct both in terms of model and interface. And to understand how user will set century-level date in wikidata UI, try to put 1800 in UI and choose century-precision. You will immediately get 18th century in "will be displayed as" and most probably that you will choose some different year if you originally intended 1800-1899. -- VorontsovIE (talk) 08:09, 9 April 2018 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────
You wrote "If someone wrote 1700 with precision 7 he/she mentioned not a 1700 year but some century". Sorry, but I can't get past that statement. It's not really true. What the user typed in the screen is not preserved. We don't have access to what is really stored in the database, but one of the better ways to judge what was stored in the database is to look at the diff of the edit. For example, we can look at this edit you made to the Wikidata Sandbox. If it weren't for this conversation, I would have no way to know if you used the user interface, quickstatements, a bot, or whatever. I can see from the diff that the date is +1600-00-00T00:00:00Z and the precision is 100 years. Following the documentation, I ignore all the digits representing decades and finer and conclude the date is +16dd-dd-ddTdd:dd:dddd where "d" means "don't care". I infer from this conversation you typed "16. century" in the user interface, but that is not preserved anywhere, so it doesn't count.

As for how bot operators test bots, and whether they test the user interface as well as other methods of examining the data, I don't know what they do. I do know that the user interface has changed from time to time. Maybe it worked differently when some bot was run in the past. I only understand English, and enough French to find something to eat in a restaurant, but I have read in discussions that some languages express centuries differently. Maybe a bot operator tested the user interface in a language other than English, and it worked OK in that language. Maybe a bot operator notices the discrepancy between the model documentation and the user interface, and decides the model documentation supersedes the user interface and puts in 1600 precision 7 to designate the period [1600–1699]. Or maybe a programmer who extracts data wrote a program that just follows the documentation, and never thought to test centuries.

I admire your decision to set the year to xxx1 so it's clear no matter how you look at it.

I think the best way forward is to add to the "Interface section" a description of this problem together with a recommendation to enter centuries with a year number of the form xx01. Jc3s5h (talk) 11:43, 9 April 2018 (UTC)
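The two readings of a century-precision value discussed in this thread can be compared side by side. The sketch below is only an illustration for positive years: `model_interval` follows the data model reading described above (ignore the last two digits), and `interface_interval` follows the user interface behaviour reported in this thread (years 1801 through 1900 shown as one century). Both function names are hypothetical.

```python
def model_interval(year: int) -> tuple:
    """Data model reading: last two digits ignored, e.g. 1600 -> (1600, 1699)."""
    first = year // 100 * 100
    return (first, first + 99)


def interface_interval(year: int) -> tuple:
    """Interface reading reported in this thread, e.g. 1900 -> (1801, 1900)."""
    century = (year + 99) // 100
    return (100 * (century - 1) + 1, 100 * century)


def overlap_in_years(year: int) -> int:
    """Number of years on which the two readings agree."""
    m_first, m_last = model_interval(year)
    i_first, i_last = interface_interval(year)
    return max(0, min(m_last, i_last) - max(m_first, i_first) + 1)


if __name__ == "__main__":
    for y in (1600, 1601):
        print(y, model_interval(y), interface_interval(y), overlap_in_years(y))
```

With a stored year ending in 00 the two readings share a single year, while a stored year ending in 01 makes them agree on 99 of the 100 years, which is the practical argument for the xx01 recommendation.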

I took another look at the diff, and I see the edit summary includes "16. century". I know there is a Phabricator ticket open on this, so the meaning of those characters may change if/when the ticket is acted upon. Jc3s5h (talk) 11:48, 9 April 2018 (UTC)

I tried another experiment with the sandbox. I entered the date as "1400" and manually set the precision to "century". The result was exactly the same as if I had entered "14. century". But these two methods would probably have different meanings to editors. An editor who enters "1400" and sets the precision to century probably is thinking of the interval [1400–1499], not [1301–1400]. Jc3s5h (talk) 14:12, 9 April 2018 (UTC)

The bug linked above (T95553) has absolutely nothing to do with the issue that you're discussing here. I tried to remove the "Tracked in Phabricator" template, but Jc3s5h reverted me. ¯\_(ツ)_/¯ Kaldari (talk) 03:11, 6 June 2018 (UTC)
The bug you want is T73459. I would change it in the template, but apparently, I'm not allowed to. Kaldari (talk) 03:14, 6 June 2018 (UTC)
Both bugs are relevant. Both bugs center around the user interface allowing input of the form "20. century" or a year of, for example, 1950, with a precision of century. T95553 discusses both input and output.
The user interface is fundamentally flawed because it allows ambiguous expressions such as "20. century". Even if this were changed to proper English, like "20th century", it would be ambiguous because some would interpret it as 1900 through 1999 and others would interpret it as 1901 through 2000. The data model demands that periods with precision century begin with years with the last 2 digits "00" and end in years with the last 2 digits "99". (That's for periods beginning 100 or later; for earlier periods, it depends on whether you're using JSON or RDF.) Jc3s5h (talk) 16:04, 6 June 2018 (UTC)
Just because one bug depends on another bug does not make them the same bug. And regardless of whether it is "fundamentally flawed" or not, there is no chance that Wikidata will stop allowing input of "X century" as this is required for interoperability with Commons and the Structured Data on Commons project (plus it is the most intuitive way for editors to input a 100-year range). But since you believe this is a problem, I've created a specific bug for discussion of this: https://phabricator.wikimedia.org/T196674. Kaldari (talk) 18:21, 7 June 2018 (UTC)
Please provide links for "Commons and the Structured Data on Commons project". Jc3s5h (talk) 21:06, 7 June 2018 (UTC)

I switched descriptions from what should be the first and last year of a century or millennium to how the dates are currently interpreted by the wikibase software. So currently if you type any number between "1801" and "1900" with a precision of "century", then the wikibase software will interpret it as "19th century", and that is what is documented. Those dates are in synch with the en:19th_century or de:19. Jahrhundert articles. If the wikibase software interpretation changes in the future, then we should alter the documentation. Use Wikidata Sandbox (Q4115189) to try things out. The time periods used by Module:Complex_date are synched with wikibase software time periods. --Jarekt (talk) 14:42, 17 July 2018 (UTC)

I have reverted the change for several reasons.
First, because of Jarekt's use of the phrase "wikibase software" to describe the faulty user interface. This suggests an incomplete understanding of the situation, and should cause deep suspicion about the edits.
The claim "The time periods used by Module:Complex_date are synched with wikibase software time periods" is false. The Wikidata data model clearly states that dates with precision 7 cover positive years where the first year of the period has the last 2 digits 00 and the last year of the period has the last 2 digits 99. The user interface is a faulty part of the Wikidata software and its behaviour should not be described as representative of the behaviour of Wikidata.
en:19th_century is not about any of the standards, such as ISO 8601, that Wikidata dates are derived from, and is thus inapplicable.
It is essential not to think of the faulty user interface as a closed system, which can create its own divergent meaning for precision 7. Just because a date is entered with the user interface does not mean some person or application who needs the data won't obtain it with some other interface, interpret the value according to the published data model, and get the wrong result. Jc3s5h (talk) 18:13, 17 July 2018 (UTC)
The Wikidata interface to the wikibase software is in perfect synch with the English understanding of the term century and with Module:Complex_date. Just repeating that it is a bug because it does not match your understanding does not make it so. The software documentation should reflect the current implementation, not what you would like it to be. --Jarekt (talk) 18:27, 17 July 2018 (UTC)
There are at least two official interfaces. One is the graphical user interface you get if you put an item number in the search box and push enter. In this page and this discussion, that's generally been called the user interface. Due to its nature, it's only suitable to read or write small amounts of data. Since much of the data has been input by bots, it's reasonable to infer that most of the data did not come in through the user interface. I suspect that most of the data is output through interfaces other than this interactive user interface as well.
Another official interface is the application programming interface (API). That API has at least two models for inputting and outputting information, mediawikiwiki:Wikibase/DataModel/JSON and mediawikiwiki:Wikibase/Indexing/RDF Dump Format. Inspection of these shows that both the JSON and RDF Dump Format documents call for the period during which a precision 7 date could occur (for positive years) to begin in a year with the last digits 00 and end in a year with the last digits 99. (JSON treats year 0 as undefined and represents 1 BCE as -0001, 2 BCE as -0002, etc., but dates input with JSON and read out with RDF will be properly converted between the two representations.) Jc3s5h (talk) 19:02, 17 July 2018 (UTC)
Jc3s5h Help:Dates is not an "official interface"; it is a help page to explain the wikidata interface, and the only reason Help:Dates is not in synch with the wikidata interface is that you keep reverting people trying to fix those documentation errors. Most of my 200k edits were not done with the GUI but with a variety of tools, but that does not change my understanding of how to interpret the underlying data. I do not know much about JSON and RDF, but if some documentation there differs from the English definition of the term century, then it should be fixed. --Jarekt (talk) 19:36, 17 July 2018 (UTC)
I have clearly stated my positions. If you alter the page to indicate the interactive user interface is correct and the other interfaces are wrong I will bring this to the attention of administrators for dispute resolution. Jc3s5h (talk) 19:59, 17 July 2018 (UTC)

I have added pointers to this discussion at mediawikiwiki:Talk:Wikibase/Indexing/RDF Dump Format#Disputed interpretation and mediawikiwiki:Talk:Wikibase/DataModel/JSON#Disputed interpretation. Jc3s5h (talk) 22:33, 17 July 2018 (UTC)

How to query with a requirement of at least year-precision[edit]

Please can someone add an example query here of how to filter by date precision in a query. --99of9 (talk) 23:39, 29 August 2018 (UTC)
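
One possible answer, as a hedged sketch: on a SPARQL endpoint such as the Wikidata Query Service, the precision of a time value is exposed on the full value node (reached through the `psv:` prefix) as `wikibase:timePrecision`, so a filter on that number keeps only values with at least year precision (9). The Python helper below only builds the query text; the function name `build_precision_query` and the default property P569 are illustrative assumptions, not part of any existing tool.

```python
def build_precision_query(property_id: str = "P569",
                          min_precision: int = 9,
                          limit: int = 10) -> str:
    """Return SPARQL text selecting values of `property_id` whose
    precision is at least `min_precision`.

    Precision codes follow the Wikibase data model:
    7 = century, 8 = decade, 9 = year, 10 = month, 11 = day.
    """
    # Doubled braces are literal braces inside an f-string.
    return f"""
SELECT ?item ?time ?precision WHERE {{
  ?item p:{property_id}/psv:{property_id} ?valueNode .
  ?valueNode wikibase:timeValue ?time ;
             wikibase:timePrecision ?precision .
  FILTER(?precision >= {min_precision})
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Paste the printed text into the query service to try it out.
    print(build_precision_query())
```

Raising `min_precision` to 11 would restrict the results to day-precision values in the same way.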

Clarifying “century”- and “millennium”-precision dates[edit]

There are two possible ways to interpret a timestamp of +1800-00-00T00:00:00Z in combination with a precision of “century” (7):

  1. The value refers to the 18th century, which starts in the year 1701 and ends in year 1800. 1800 is the last year of its century.
  2. The value refers to the 19th century, which starts in the year 1800 and ends in the year 1899. 1800 is the first year of its century.

Though Wikibase’ documentation has not always been clear on this, its user interface has almost always used the first interpretation (“earlier”): Gerrit change Ia1692d1241, merged four years ago, fixed the counting of centuries in general and added a test that +1600-01-01T01:01:01Z is formatted as 16. century. (Prior to that change, centuries were taken to end and begin on years *49/*50, due to rounding.)

Abián and I also surveyed several samples of “ambiguous” dates on Wikidata (that is, time values with precision “century” and a year ending in two zeroes) and looked at the corresponding items, references, Wikipedia articles, etc., to determine which interpretation was actually the correct one. We found that, in the cases where it was possible to decide which interpretation had been meant, 80% of values (57 out of 71) were correct when using the “earlier” interpretation, i. e. the one used by the Wikibase UI.

Whatever the original intentions in the data model for centuries may have been, the current situation is clear: the Wikibase UI for entering and viewing values favors the first interpretation, and most existing data, regardless of how it may have been entered, matches the same interpretation. Changing Wikibase’ presentation of ambiguous values would display this data incorrectly, and we are not going to do that.

I have updated the documentation on MediaWiki.org to clearly and unambiguously define when a century begins and ends. A future change by Abián will also further encode this in the Wikibase software: where currently the precise boundaries of a century matter only when parsing and rendering values, soon they will also be used to correctly check constraints (T168379). I recommend that this documentation be updated accordingly, perhaps with a note on the unfortunate historical ambiguity (for a while, this page exclusively documented the second interpretation, though I still do not understand on what basis). Pinging some previous discussion participants: Multichill, VorontsovIE, Jc3s5h, Kaldari, Jarekt

Apologies that it’s taken us so long to respond to this, but hopefully the situation is cleared up for now. (Final note: all of this also applies to millennia.) --Lucas Werkmeister (WMDE) (talk) 17:35, 17 September 2018 (UTC)
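
For readers who prefer code, the "earlier" interpretation described above can be sketched as follows. This is only an illustration for positive years, not the Wikibase implementation; the function name `period_bounds` and its `span` parameter are made up for this example (100 for centuries, 1000 for millennia).

```python
def period_bounds(year: int, span: int = 100) -> tuple[int, int, int]:
    """Return (ordinal, first_year, last_year) of the century or
    millennium containing `year` under the "earlier" interpretation,
    where a year ending in zeroes is the LAST year of its period.

    Sketch for positive years only.
    """
    if year < 1:
        raise ValueError("this sketch covers positive years only")
    # Round up: years 1701..1800 all map to ordinal 18.
    ordinal = (year + span - 1) // span
    first_year = (ordinal - 1) * span + 1
    last_year = ordinal * span
    return ordinal, first_year, last_year


if __name__ == "__main__":
    print(period_bounds(1800))        # 18th century, 1701 to 1800
    print(period_bounds(1801))        # 19th century, 1801 to 1900
    print(period_bounds(2000, 1000))  # 2nd millennium, 1001 to 2000
```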

Thanks for picking this up Lucas. I don't have an answer, but I did notice this:
  • +1600-01-01T01:01:01Z precision 11 (day) means somewhere on +1600-01-01 (+1600-01-01T00:00:00Z to +1600-01-01T23:59:59Z)
  • +1600-01-01T01:01:01Z precision 10 (month) means somewhere on +1600-01 (+1600-01-01T00:00:00Z to +1600-01-31T23:59:59Z)
  • +1600-01-01T01:01:01Z precision 9 (year) means somewhere on +1600 (+1600-01-01T00:00:00Z to +1600-12-31T23:59:59Z)
So basically just ignore everything after the precision. Based on that logic I would expect:
  • +1600-01-01T01:01:01Z precision 8 (decade) means somewhere on +160x (+1600-01-01T00:00:00Z to +1609-12-31T23:59:59Z)
  • +1600-01-01T01:01:01Z precision 7 (century) means somewhere on +16xx (+1600-01-01T00:00:00Z to +1699-12-31T23:59:59Z)
  • +1600-01-01T01:01:01Z precision 6 (millennium) means somewhere on +1xxx (+1000-01-01T00:00:00Z to +1999-12-31T23:59:59Z)
The first meaning breaks this logic. If we would want to keep this logic intact, the second option would be better. The second option is very confusing for humans so I don't really like this one.
The best thing to do here is to properly document this logic, describe that we have exceptions and why we have these exceptions to the logic, and give a bunch of examples. I assume you are referring to mw:Wikibase/DataModel#Dates_and_times by the way. Multichill (talk) 18:04, 17 September 2018 (UTC)
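
The "ignore everything after the precision" logic from the list above can be sketched in a few lines. This is a hypothetical helper (the name `truncation_years` is made up here) covering positive years and precisions 6 to 9 only; it shows the ranges that pure truncation would give, which for centuries and millennia differ from what the user interface displays.

```python
# Number of years covered by each precision code (6 = millennium,
# 7 = century, 8 = decade, 9 = year), per the Wikibase data model.
SPAN_BY_PRECISION = {9: 1, 8: 10, 7: 100, 6: 1000}


def truncation_years(year: int, precision: int) -> tuple[int, int]:
    """Return (first_year, last_year) obtained by dropping the digits
    of `year` that are finer than `precision`.

    Sketch for positive years only.
    """
    if year < 1:
        raise ValueError("this sketch covers positive years only")
    span = SPAN_BY_PRECISION[precision]
    first_year = year - year % span
    return first_year, first_year + span - 1


if __name__ == "__main__":
    for precision in (9, 8, 7, 6):
        print(precision, truncation_years(1600, precision))
```

Under this logic 1600 with precision 7 gives 1600 to 1699, whereas the user interface shows the same value as "16. century", which is the exception that would need documenting.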
Multichill, you are right if we assume that such definitions follow logic. For better or worse they follow norms established a long time ago, which are not always the most logical. If you look at en:19th century the first sentence is "The 19th century was a century that began on January 1, 1801, and ended on December 31, 1900." and that is what Lucas' first definition follows. That way, if you find an old source stating that a work was created in the 19th century, that is what they meant. It is not up to us to redefine such terms, even if it would be more logical. Calendars are full of unfortunate illogical exceptions, like the lack of a year 0 (so the number of years between 1999 and 2001 is 2001-1999 = 2, while the number of years between 1 BC and 1 AD is 1 and not 1-(-1) = 2). --Jarekt (talk) 18:24, 17 September 2018 (UTC)
The "1900s" are the years 1900 through 1999. "19th", as an ordinal, is 1-indexed (for example, the second number in the list of numbers 0-10 is 1). Why not just say "1900s (century)"? --Yair rand (talk) 18:36, 17 September 2018 (UTC)
(Also, I thought the Proleptic Gregorian calendar does have a year 0. Are we not using ISO 8601?) --Yair rand (talk) 18:41, 17 September 2018 (UTC)
Yair rand, the issue is that the sources do not use "1900s (century)" notation, but mostly "century" notation, so if we want to quote a source that uses standard "century" notation, then we need to use the same definition they did. I am a bit confused about the year-0 issue. The current system saves 1 BC as -0001-00-00T00:00:00Z and 1 AD as +0001-00-00T00:00:00Z, and is consistent with mw:Wikibase/DataModel/JSON#time, but not with mw:Wikibase/DataModel#Dates_and_times which states that "There is a year number 0 that refers to the year that is commonly called 1 BC(E)." The Proleptic Gregorian calendar is only supposed to be used in RDF dumps. --Jarekt (talk) 19:49, 17 September 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── A few issues with what is stated above:

  • "+1600-01-01T01:01:01Z precision 11 (day) means somewhere on +1600-01-01 (+1600-01-01T00:00:00Z to +1600-01-01T23:59:59Z)". That's what some of the documentation says, but it is not consistent with what users are really doing. Users are treating it as local time. This is also the case with looser precision such as month or year, but people might not care as much about looser precision. Treating it as local time would be consistent with the logic of identifying the least significant field, such as day or month, and ignoring everything to the right of the least significant field. In particular, the Z at the end would be ignored. This would also be consistent with ISO 8601 (which inspired our notation, but our notation is different). ISO 8601 is incapable of including a time zone, including Z, unless the precision is hour or tighter.
  • Arguing about the meaning of the phrase "xth century" is fruitless, because a source may not use the phrase "xth century", it may use the phrase (x-1)00's. But we only have one way to represent century precision. Unless we develop two different representations, we cannot precisely convey what is stated in source A that says "the 1900's" and source B which says "the 20th century". Jc3s5h (talk) 20:42, 17 September 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Sorry for the late reply, everyone.
@Multichill: yes, that was more or less the point of my message – since we can’t change the interpretation (we’re tied to the current one by the many values which would otherwise be interpreted incorrectly), I want to ask the Wikidata community to update this documentation to bring it in line with both the software and the existing data. I already adjusted the MediaWiki documentation (see abstract, JSON, and RDF, though I didn’t edit the last one), but think the Wikidata documentation should be left to the community.
@Yair rand, Jarekt: the standard we use depends on the serialization format (JSON or RDF), and some of the documentation on MediaWiki.org was also very outdated, sorry about that. But in the abstract conceptual model of Wikibase, as well as in the JSON model, the year 0 is undefined, and -0001 means 1 BCE. On the other hand, in the RDF world, the year 1 BCE is supposed to be represented by the timestamp 0000, and we adjust timestamps accordingly when exporting to RDF.
@Jc3s5h: there is always the stated as (P1932) qualifier.
--Lucas Werkmeister (WMDE) (talk) 12:35, 13 December 2018 (UTC)
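
As a rough illustration of the year-numbering difference described above (year 0 undefined and -0001 meaning 1 BCE in JSON, 0000 meaning 1 BCE in RDF), here is a minimal sketch of the shift. The function names are invented for this example, and the sketch only renumbers years; it does not attempt the Julian-to-Gregorian calendar conversion or anything else the real RDF export does.

```python
def json_year_to_rdf_year(json_year: int) -> int:
    """Map a JSON-model year (no year 0, -1 = 1 BCE) to the RDF-style
    numbering in which year 0 stands for 1 BCE."""
    if json_year == 0:
        raise ValueError("year 0 is undefined in the JSON model")
    return json_year if json_year > 0 else json_year + 1


def rdf_year_to_json_year(rdf_year: int) -> int:
    """Inverse mapping: RDF-style year 0 becomes JSON year -1 (1 BCE)."""
    return rdf_year if rdf_year > 0 else rdf_year - 1


if __name__ == "__main__":
    print(json_year_to_rdf_year(-1))   # 1 BCE in RDF-style numbering
    print(json_year_to_rdf_year(-458)) # 458 BCE in RDF-style numbering
    # With a year 0, plain subtraction gives the elapsed years between
    # 1 BCE and 1 CE.
    print(json_year_to_rdf_year(1) - json_year_to_rdf_year(-1))
```

One consequence of the RDF-style numbering is that ordinary subtraction across the era boundary works, which is not the case for the JSON numbering.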

One year or another, but not between both[edit]

How is the date expressed when there is an event that occurred in one year or another, but not between both? For example, if a person died in 1144 or in 1147, and only in those years. That is, it is known that he did not die in 1145 or 1146. Saludos. --Romulanus (talk) 09:39, 18 April 2019 (UTC)