Wikidata:Stable Interface Policy

Stable public interfaces for data access are a crucial component of any public knowledge repository. This Stable Interface Policy defines which guarantees are and are not given by the Wikidata development team regarding the stability of data formats and APIs provided by Wikibase as deployed on www.wikidata.org.

Definitions

This section defines some crucial terms used in this document.

  • Consumer: software that reads and interprets data received from Wikidata.
  • Client: software that calls public Wikidata APIs. Clients are typically also consumers of data.
  • Compliant client/consumer: a client or consumer that complies with the specification of the underlying formats and protocols it uses. For instance, a compliant consumer that reads JSON data complies with the JSON specification, and will accept any encoding allowed by the JSON specification (RFC 7159). A compliant client using a web API will comply with the HTTP specification, and so on.
  • Well-behaved client/consumer: a (compliant) client or consumer which is implemented in a robust and forward-compatible way, specifically taking into account the guarantees and limitations stated in this document. For instance, a well-behaved client will not break when encountering a new data type.
  • Breaking change: a change to an API or data format that violates guarantees given or widely assumed before. Breaking changes include removal of API functions, parameters, or data fields and changes to the interpretation or format of parameters or data fields.
  • Significant change: a change to an API or data format that would be beneficial for clients or consumers to adapt to, but which will not break a well-behaved client or consumer. Significant changes particularly include additions, such as the introduction of new data types or entity types, or the inclusion of additional information in the data output. See Extensibility below.
  • Insignificant change: a change to an API or data format that is not expected to have any impact in a well-behaved client. Insignificant changes include changes to whitespace outside literals as well as the order of fields in a JSON object.
  • Stable Interface: an API or data format for which breaking and significant changes will be announced as per the Notification Policy below. Which interfaces are considered stable is defined in the Stable Data Formats and Stable Public APIs sections later in this document.
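
As a rough illustration of what "insignificant" means for a compliant JSON consumer, the sketch below compares two serializations of the same record that differ only in whitespace outside literals and in field order. The record itself is made up for the example and the helper name same_data is hypothetical.

```python
import json

# Two serializations of the same (made-up) record. They differ only in
# whitespace outside literals and in the order of object fields.
compact = '{"id":"Q42","type":"item","labels":{"en":{"language":"en","value":"Douglas Adams"}}}'
reordered = """
{
    "labels": {"en": {"value": "Douglas Adams", "language": "en"}},
    "type": "item",
    "id": "Q42"
}
"""


def same_data(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed content, not by their text."""
    return json.loads(a) == json.loads(b)


print(same_data(compact, reordered))
```

A consumer that compares or hashes the raw text instead of the parsed structure would see these two documents as different, which is the kind of fragility a well-behaved consumer avoids.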

Notification Policy

This section defines where and when the operators of clients and consumers will be notified of changes to a stable interface. No guarantees are made regarding unstable interfaces.

  • Breaking changes to stable interfaces will be duly announced in advance on the relevant mailing lists (wikidata-tech, wikidata and pywikibot) and on the Project Chat. The announcement will generally be made four weeks before, but no less than two weeks before the change is deployed to https://www.wikidata.org/. The change will be available for testing at least two weeks before deployment on https://test.wikidata.org/. Such announcements will have the word BREAKING in the subject line.
  • Significant changes to stable interfaces will be announced on the relevant mailing lists (wikidata-tech, wikidata and pywikibot) and on the Project Chat. The announcement will generally be made at least two weeks in advance, and no later than one week after the change was deployed to https://www.wikidata.org/. The change will typically be available for testing at least two weeks before deployment on https://test.wikidata.org/.
  • Insignificant changes to stable interfaces will generally not be announced.
  • Changes to non-stable interfaces may not be announced, even if they are breaking changes.
  • Significant changes to this policy will be announced on the relevant mailing lists (wikidata-tech and wikidata) and on the Project Chat within a week of the change being made.
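
The timing rule for breaking changes can be worked through as simple date arithmetic. The sketch below is only an illustration: the function names and the example deployment date are hypothetical, and the subject-line check assumes a plain substring match is sufficient.

```python
from datetime import date, timedelta


def breaking_change_deadlines(deploy: date) -> dict:
    """Derive the dates implied by the breaking-change rule for a planned deployment."""
    return {
        # Announcements are generally made four weeks ahead ...
        "usual_announcement": deploy - timedelta(weeks=4),
        # ... and never less than two weeks ahead.
        "latest_announcement": deploy - timedelta(weeks=2),
        # The change has to be testable on test.wikidata.org by this date.
        "latest_test_deployment": deploy - timedelta(weeks=2),
    }


def is_breaking_announcement(subject: str) -> bool:
    """Breaking-change announcements carry the word BREAKING in the subject line."""
    return "BREAKING" in subject


# Hypothetical deployment date, chosen only for the example.
deadlines = breaking_change_deadlines(date(2030, 6, 3))
print(deadlines["usual_announcement"], deadlines["latest_announcement"])
```

An operator could use a filter like is_breaking_announcement on mailing list subjects to make sure breaking announcements are not missed.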

Extensibility

This section explains how our data model and data formats are extensible. Consumers should take this information into account so that they can accommodate unknown structures they may encounter in the data.

The Wikibase Data Model is designed to be extensible. In particular, it is possible to introduce new data types and new entity types. Well-behaved clients and consumers should thus be prepared to encounter unknown data types and entity types, and handle them gracefully, in a way appropriate for the use at hand. In many cases, it is appropriate to simply ignore such structures of unknown type.
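
One way a consumer might handle unknown types gracefully is to dispatch on the value type and skip anything it does not recognize. The sketch below is an assumption-laden illustration, not a reference implementation: the HANDLERS table and render_snak are hypothetical names, and the snak layout (snaktype, datavalue.type, datavalue.value, and an id key inside entity ID values) is assumed to match the JSON binding.

```python
from typing import Any, Callable, Dict, Optional

# Handlers for the value types this hypothetical consumer understands.
# Keys are assumed to be the "type" of a snak's datavalue in the JSON binding.
HANDLERS: Dict[str, Callable[[Any], str]] = {
    "string": lambda v: v,
    "wikibase-entityid": lambda v: v["id"],
    "monolingualtext": lambda v: f'{v["text"]} ({v["language"]})',
}


def render_snak(snak: dict) -> Optional[str]:
    """Return a display string, or None when the snak cannot be interpreted."""
    if snak.get("snaktype") != "value":
        return None  # no concrete value to render
    datavalue = snak.get("datavalue", {})
    handler = HANDLERS.get(datavalue.get("type"))
    if handler is None:
        return None  # unknown type: skip it instead of raising
    return handler(datavalue["value"])
```

Returning None for unknown types lets the caller decide whether to omit the value, show a placeholder, or log it; which of these is appropriate depends on the use at hand.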

Similarly, bindings such as the JSON representation of the Wikibase data model are designed to be extensible. Data structures may be added in any syntactically appropriate place as long as they do not modify the meaning of pre-existing fields or data structures, and as long as their addition does not break any guarantees regarding the containing data structures. This follows the idea of the Liskov substitution principle: what was guaranteed about a data structure before the addition should still be guaranteed after the addition.

If no explicit guarantees are given regarding the structure and contents of a data structure, the following principles should give guidance regarding whether a change should be considered a breaking change:

In structures based on lists (a.k.a. arrays) and maps (a.k.a. hashes or objects), such as JSON, adding a key to a map is not considered a breaking change, as long as the new field does not change the interpretation of any other field in the structure (or in any surrounding structure). Adding a structure to a list or set, however, is considered a breaking change if it would break assumptions about the type of structure to expect in the list, or about the conditions under which a structure would be included in the list.

By convention, lists are considered homogeneous, and should only contain one kind of element, unless otherwise specified. So adding a data structure to a list is a breaking change if that data structure is not compatible with the type of structure that the list was previously defined or implied to contain.
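
The map and list rules above can be approximated mechanically. The following sketch is a heuristic under stated assumptions, not part of the policy: is_breaking is a hypothetical helper that compares only the shape of two JSON-like values, treats lists as homogeneous, and cannot detect changes in the interpretation of a field.

```python
def is_breaking(old, new) -> bool:
    """Heuristic check: does `new` break assumptions a consumer made from `old`?"""
    if isinstance(old, dict):
        if not isinstance(new, dict):
            return True  # the type of the value changed
        # Every pre-existing key must survive and stay compatible;
        # keys that exist only in `new` are additions and are ignored.
        return any(k not in new or is_breaking(v, new[k]) for k, v in old.items())
    if isinstance(old, list):
        if not isinstance(new, list):
            return True
        if not old:
            return False  # nothing is known about the element type
        # Lists are treated as homogeneous: each new element must be
        # compatible with the shape of the first old element.
        return any(is_breaking(old[0], item) for item in new)
    # Primitive: only the type is compared, not the value.
    return type(old) is not type(new)


old = {"id": "Q1", "aliases": [{"language": "en", "value": "x"}]}
print(is_breaking(old, {**old, "modified": "2020-01-01"}))   # added key
print(is_breaking(old, {"aliases": old["aliases"]}))         # removed key
```

The first call models an added map key and the second a removed field; appending a bare string to the aliases list would also be flagged, because it is not compatible with the element type the list was implied to contain.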

In a tabular data representation, such as a relational database schema, the addition of fields is not considered a breaking change. Any change to the interpretation of a field, as well as the removal of a field, is considered breaking. Changes to existing unique indexes or primary keys are breaking changes; changes to other indexes as well as the addition of new indexes are not breaking changes.

In DOM-like structures based on nested typed elements with attributes, such as XML, adding an attribute is not considered a breaking change, as long as the new attribute does not change the interpretation of any other field in the structure (or in any surrounding structure). Adding a new type of element to a parent element is also not considered breaking, if that parent element is heterogeneous and essentially acts like a map. However, if the parent element is defined or implied to be a homogeneous list of a specific kind of child element, adding another kind of element is considered a breaking change.

For data formats that allow namespacing, such as XML, names (attribute names, element names) that belong to a namespace not explicitly mentioned by the specification of the data format can be ignored by consumers. Additions and changes to data structures from other namespaces are not considered breaking changes.

In contrast, the following modifications are examples of breaking changes, and thus cannot be used to extend a format: removal of fields, changes to the type or format of a primitive value, changes to the interpretation or role of a data field, as well as changes to the element type of a collection as described above.

Stable Data Formats

This section lists the data formats we consider stable. These data formats are subject to the above notification policy.

The RDF mapping of the Wikibase Data Model, as used in RDF dumps as well as in the Linked Data Interface and the Query Service, is considered a stable data format. The Wikibase vocabulary is formally defined by http://wikiba.se/ontology. Any changes to the structure or interpretation of the mapping are subject to the above notification policy. As per the general principles of RDF, additional information introduced at any time, in any location, about any subject, is not considered a breaking change.

The JSON binding of the Wikibase Data Model as used in JSON dumps, with the web API, and with the Linked Data Interface, is considered a stable data format. Any changes to the structure or interpretation of the mapping are subject to the above notification policy. Following the flexible nature of JSON, the addition of fields to JSON objects is not considered a breaking change. Well-behaved consumers should be prepared to ignore such additional fields.
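
A consumer that is prepared to ignore additional fields typically reads the keys it knows by name and leaves the rest untouched. The sketch below illustrates this for labels; Label and read_labels are hypothetical names, and the assumed layout is a labels map keyed by language code whose entries carry language and value.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Label:
    language: str
    value: str


def read_labels(entity: dict) -> Dict[str, Label]:
    """Pick out the fields this consumer knows and ignore everything else."""
    labels = {}
    for code, term in entity.get("labels", {}).items():
        # Read the two known keys by name; unpacking with Label(**term)
        # would raise TypeError as soon as a new key is added.
        labels[code] = Label(language=term["language"], value=term["value"])
    return labels


# Made-up entity with fields the consumer has never seen before.
entity = {
    "id": "Q42",
    "labels": {"en": {"language": "en", "value": "Douglas Adams", "new-field": 1}},
    "some-future-section": {},
}
print(read_labels(entity))
```

The same approach applies to schema validation: a validator that rejects unknown properties would turn every additive change into a failure on the consumer side.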

Stable Public APIs

This section lists the interfaces we consider stable. These interfaces are subject to the above notification policy.

The Wikibase Web API accessible via https://www.wikidata.org/w/api.php is considered a stable interface. Changes to the parameters, operation, or returned data structure are subject to the above notification policy.
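
As an illustration of a client of this API, the sketch below builds a request URL for the wbgetentities module. The helper name entity_request_url and the chosen props are assumptions made for the example; sending the request is left out so that the sketch has no network dependency.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def entity_request_url(ids, props=("labels", "claims")) -> str:
    """Build a wbgetentities URL; multiple values are joined with '|'."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "|".join(props),
        "format": "json",
    }
    # urlencode percent-encodes the '|' separators.
    return f"{API_URL}?{urlencode(params)}"


url = entity_request_url(["Q42", "Q64"])
print(url)
```

Requesting only the props that are needed keeps responses small, but it does not replace the forward-compatible parsing described under Extensibility, since the returned structures may still gain fields.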

The Linked Data Interface accessible via https://www.wikidata.org/wiki/Special:EntityData and https://www.wikidata.org/entity/... is considered a stable interface. Changes to the parameters, operation, or returned data structure are subject to the above notification policy.
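
A minimal sketch of addressing an entity through Special:EntityData follows. It assumes that a format can be selected with a file-style suffix such as .json and that the JSON document wraps entities in a top-level entities map keyed by ID; entity_data_url and extract_entity are hypothetical helpers.

```python
from typing import Optional

ENTITY_DATA_BASE = "https://www.wikidata.org/wiki/Special:EntityData"


def entity_data_url(entity_id: str, fmt: str = "json") -> str:
    """Build the URL of one entity in the requested serialization."""
    return f"{ENTITY_DATA_BASE}/{entity_id}.{fmt}"


def extract_entity(document: dict, entity_id: str) -> Optional[dict]:
    """Return the entity from a parsed response, or None if it is absent."""
    return document.get("entities", {}).get(entity_id)


print(entity_data_url("Q42"))
```

Returning None instead of raising when the expected key is absent keeps the caller in control of how to react, for example when the response describes a different ID than the one requested.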

The Wikidata Query Service accessible via https://query.wikidata.org/ is considered a stable interface. It provides a full SPARQL endpoint. Changes to the parameters, operation, or returned data structure are subject to the above notification policy.
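
The sketch below shows how a client might prepare a query for the SPARQL endpoint and read the standard SPARQL JSON results format. The /sparql path, the use of an Accept header to request JSON, and the helper names are assumptions not stated in this document; the request object is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path below the service's base URL.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def sparql_request(query: str) -> Request:
    """Prepare (but do not send) a GET request asking for JSON results."""
    url = f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings(result: dict) -> list:
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in result["results"]["bindings"]
    ]


request = sparql_request("SELECT ?item WHERE { ?item ?p ?o } LIMIT 1")
```

Iterating over whatever variables each row contains, rather than indexing a fixed set, means unbound variables in a row simply do not appear in the flattened result.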

The Wikibase Lua library for client wikis is considered a stable interface. Changes to the available functions, parameters, or returned data structures are subject to the above notification policy.

To allow better gadget integration, the JavaScript hooks documented in the hooks-js.txt file delivered together with the Wikibase source code are considered stable.

We acknowledge that third-party tools on Wikimedia Labs and Tool Labs may rely on the Wikibase database schema. Because of this, changes to the available tables and fields are subject to the above notification policy. However, note that the database schema is not designed to be a public API, and less consideration is given to backwards compatibility.

Unstable Interfaces

This section lists some interfaces that we do not currently consider stable, and thus may change in incompatible ways without notice.

MediaWiki XML Dumps are not considered a stable interface. MediaWiki XML dumps contain the raw data of page revisions in their internal representation. The internal representation of Wikibase entities is not a stable interface. It has changed significantly in the past, and it may change again in the future. Several different representations of Wikibase content may be present in the same XML dump.

Raw revision content as returned by the MediaWiki core API is not considered a stable interface, as it uses the internal representation of the content, just like the XML dumps. Raw revision content is returned from API queries such as api.php?action=query&prop=revisions&titles=Q42&rvprop=timestamp|user|comment|content.

Wikibase PHP code is not considered a stable interface. Since there are currently no official releases of the Wikibase extension, just a rolling deployment to wikidata.org, there is no point in time at which any given PHP class or interface can be assumed to remain stable.

Wikibase JavaScript code is not considered a stable interface. Since there are currently no official releases of the Wikibase extension, just a rolling deployment to wikidata.org, there is no point in time at which JavaScript code can be assumed to remain stable. This means that Gadgets cannot rely on the JavaScript code to remain stable.

The HTML DOM structure generated by Wikibase is not considered a stable interface. This means that Gadgets cannot rely on the DOM structure to remain stable.

Outlook

This section provides information about improvements that are planned or considered for the future.

The JSON binding should be versioned (using the semantic versioning convention), so consumers know what structures to expect, and how to interpret them. See phab:T92961.

Wikibase JavaScript code should indicate stable interfaces that can be confidently used by Gadgets.

Wikibase should provide some basic guarantees about the HTML DOM structure it generates, so Gadgets can confidently interact with the DOM.

For third-party installations, Wikibase should have regular releases (using the semantic versioning convention) like MediaWiki does. Wikidata will continue to use rolling deployments of the latest development version.

History

This section lists past and scheduled breaking changes. The list of changes made before this policy was implemented may be incomplete. Each change should be listed with the date of announcement and the date of deployment, ideally accompanied by a link to the announcement and any relevant tickets.