Wikidata:Development plan



This is the Wikidata development plan. It is ordered by when items will likely be started.

Done

Badges

bugzilla:40810

Some clients want to add extra metadata (badges) to their articles. This includes things like "good article", "featured article" and the importance and quality of an article.

If your wiki uses this feature, please ensure it's listed.

  • The badges will be defined on Wikidata items' sitelinks as Wikidata items, like "featured article (Q123456)".
  • The items allowed as badges will be defined in the Wikidata configuration settings. Initially "good article" and "featured article" badges will be available.
  • The user can set and change a badge by selecting one from this predefined list of badges.
  • On the clients, sitelinks with badges will have an icon in the list of language links in the sidebar. Default icons will be shipped by a Wikimedia-specific Wikibase extension, but wikis can choose to use their own icons.
  • If further customization on a per-wiki level is needed, it can be done with the CSS classes that are set on sitelinks with badges. There will be CSS classes that map to badge IDs (for example, Q120 could map to class GA). There will also be canonical classes that do not depend on settings (like wb-badge-Q123) on every sitelink with badges.
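
The class scheme from the last bullet could be sketched as follows. This is a minimal illustration in Python, not the Wikibase implementation: the function name badge_css_classes and the sample mapping are hypothetical, and only the wb-badge-<id> pattern and the "Q120 could map to class GA" example come from the text above.

```python
from typing import Dict, List


def badge_css_classes(badge_ids: List[str], configured: Dict[str, str]) -> List[str]:
    """Return the CSS classes for a sitelink that carries the given badges."""
    classes: List[str] = []
    for badge_id in badge_ids:
        # Canonical class: always present, independent of wiki settings.
        classes.append(f"wb-badge-{badge_id}")
        # Configured class: present only if the wiki maps this badge id.
        if badge_id in configured:
            classes.append(configured[badge_id])
    return classes


# Hypothetical per-wiki setting, mirroring the Q120 -> GA example.
SAMPLE_MAPPING = {"Q120": "GA"}
print(badge_css_classes(["Q120", "Q123"], SAMPLE_MAPPING))
```

A wiki that wants custom styling would then target either the configured class or the canonical one in its own stylesheet.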

Technical details

These badges correspond to Wikidata items.

Remaining problems

  • phab:T72209: Create a special page to query for badges
  • phab:T73887: Other projects sidebar should show badges if applicable

Merges and redirects

bugzilla:38664 and bugzilla:57744

When two different items about the same topic are created, they can be merged. Labels, descriptions, aliases, sitelinks and statements are merged if they do not conflict. The item that is left empty can then be turned into a redirect to the other. This way, Wikidata IDs can be regarded as stable identifiers by third parties.
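
The merge-then-redirect idea can be illustrated with a small Python sketch. It assumes one possible reading of "merged if they do not conflict", namely that any conflicting value aborts the merge; the actual Wikibase behaviour may differ. The names merge_terms and resolve are hypothetical.

```python
from typing import Dict, Optional


def merge_terms(source: Dict[str, str], target: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Merge per-language (or per-site) values from source into target.

    Returns the merged mapping, or None if the same key holds different
    values in both items (assumption: a conflict aborts the merge).
    """
    merged = dict(target)
    for key, value in source.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


def resolve(entity_id: str, redirects: Dict[str, str]) -> str:
    """Follow redirects so that an old ID still leads to the surviving item."""
    seen = set()
    while entity_id in redirects and entity_id not in seen:
        seen.add(entity_id)  # guard against accidental redirect loops
        entity_id = redirects[entity_id]
    return entity_id


print(merge_terms({"de": "Berlin"}, {"en": "Berlin"}))
print(resolve("Q2", {"Q2": "Q1"}))
```

Because the emptied item becomes a redirect rather than being deleted, an external database that stored the old ID can still reach the merged item.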

Remaining problems

  • phab:T59745: Automatically create redirects when merging items

Mono-lingual text datatype

bugzilla:63721

Users can add strings and specify a language for each of them. They can, for example, enter the motto of a country in the country’s language. The string is shown in this language to all users regardless of their language setting.

JSON dumps

bugzilla:52799

For 3rd party re-use and analysis of the data in Wikidata, we provide JSON dumps in addition to the regular XML dumps. The JSON dump contains the canonical JSON representation of all entities (as opposed to the brittle internal JSON representation found in the XML dumps). The JSON representation of individual entities is also available via Wikidata’s linked data interface (Special:EntityData).
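
A consumer might read such a dump one entity at a time. The sketch below assumes the dump is a single JSON array with one entity per line, and it uses a tiny made-up sample instead of a real dump file; the function name iter_entities is hypothetical.

```python
import io
import json
from typing import Iterator, TextIO


def iter_entities(dump: TextIO) -> Iterator[dict]:
    """Yield entities from a dump laid out as a JSON array, one entity per line."""
    for line in dump:
        line = line.strip().rstrip(",")  # drop the separator between array elements
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        yield json.loads(line)


# Made-up sample in the assumed layout; real entities carry many more fields.
SAMPLE_DUMP = io.StringIO(
    "[\n"
    '{"id": "Q1", "type": "item", "labels": {"en": {"language": "en", "value": "first"}}},\n'
    '{"id": "Q2", "type": "item", "labels": {}}\n'
    "]\n"
)
for entity in iter_entities(SAMPLE_DUMP):
    print(entity["id"], entity["labels"].get("en", {}).get("value"))
```

Reading line by line keeps memory use flat, which matters because a full dump is far too large to parse as one JSON document.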

Quantities without units

bugzilla:54318

Users are able to enter quantitative data in Wikidata and re-use it on the clients as well as outside of Wikimedia. It is possible to express statements like “Berlin has an estimated population of 3,397,469 (+/-100) as of 31 July 2013” or “Berlin has an area of 891.85 km²”. At first only unitless quantities are supported. A later deployment adds a small number of units, and this set is expanded in future deployments. When viewing an item with such quantitative data, the user sees the values according to local conventions (decimal separator, unit conversion). On the client this data is accessed via the parser function and Lua and is shown according to the content language. The API provides a way to access the data in the format preferred by the request sender.
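
A quantity with an uncertainty interval and locale-dependent display could be modelled roughly as below. This is an illustrative Python sketch, not the Wikibase data model: the Quantity class, its field names and the SEPARATORS table are assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical table: (thousands separator, decimal separator) per language code.
SEPARATORS = {"en": (",", "."), "de": (".", ",")}


@dataclass(frozen=True)
class Quantity:
    """An amount together with the bounds of its uncertainty interval."""
    amount: Decimal
    lower_bound: Decimal
    upper_bound: Decimal


def format_amount(value: Decimal, language: str) -> str:
    """Render a number with the separators of the given language."""
    thousands, decimal_sep = SEPARATORS[language]
    text = f"{value:,}"  # English-style grouping first, e.g. "3,397,469"
    # Swap separators via a placeholder so they do not overwrite each other.
    return text.replace(",", "\0").replace(".", decimal_sep).replace("\0", thousands)


# The population example from the text: 3,397,469 (+/-100).
population = Quantity(Decimal("3397469"), Decimal("3397369"), Decimal("3397569"))
print(format_amount(population.amount, "en"))
print(format_amount(population.amount, "de"))
print(format_amount(Decimal("891.85"), "de"))
```

Using Decimal rather than float keeps values such as 891.85 exact, which matters when the stored number is later reformatted for different audiences.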

Remaining problems

  • phab:T68580: Better support for exact values in Quantity DataType
  • phab:T59589: make it possible to show the + sign in quantities on item pages

In other projects sidebar

mw:Beta Features/Other projects sidebar

On an article a user can see links to the same topic on other projects in the sidebar. This is similar to how different languages of the same project are linked.

Entity suggester

When entering a new statement, the user is shown a number of properties that they are likely to use. These properties are calculated based on which properties are used in similar items. In future versions, suggestions should also be made for values.


Statements on properties

bugzilla:49554

To improve maintainability of the data in Wikidata, it is possible to add statements to property pages. This is used to store constraints for properties. An example of such a constraint is that the winner of a certain award must be human. Another example would be that the number of inhabitants needs to be a positive integer.

Language fallback

bugzilla:36430

When viewing an item that links to other items, the labels for these items are shown in the user’s language. If labels in this language are not available, labels are shown in other languages that the user is likely to speak.
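
The fallback behaviour amounts to walking an ordered list of languages until a label is found. A minimal Python sketch, assuming a hypothetical fallback chain per user (how the real chain is derived is not specified here):

```python
from typing import Dict, List, Optional


def pick_label(labels: Dict[str, str], fallback_chain: List[str]) -> Optional[str]:
    """Return the first available label along the user's fallback chain."""
    for language in fallback_chain:
        if language in labels:
            return labels[language]
    return None  # no label in any language the user is likely to speak


# Hypothetical chain for a user whose interface language is Austrian German.
chain = ["de-at", "de", "en"]
print(pick_label({"en": "Berlin", "de": "Berlin (Stadt)"}, chain))
print(pick_label({"fr": "Berlin"}, chain))
```

The order of the chain expresses preference: the user's own language comes first, followed by languages they are increasingly less likely to speak.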

In progress

Access for remaining sister projects

The remaining sister projects have access to sitelinks and data via Wikidata. The roll-out is staged to allow the communities to adapt. The planned order is:

  • Wikisource
    • Sitelinks: ✓ Done
    • Data: 25.02.2014 ✓ Done
    • Oldwikisource not done yet (see bugzilla:62717)
    • Edition interwiki links not done yet
  • Wikiquote
    • Sitelinks: 08.04.2014 ✓ Done
    • Data: 10.06.2014 ✓ Done
  • Wikinews
    • Sitelinks: 19.08.2014 ✓ Done
    • Data: ?
  • Wikidata itself
    • Sitelinks: 19.08.2014 ✓ Done
    • Data: 19.08.2014 ✓ Done
  • Commons (not including file metadata!)
    • Sitelinks: 23.09.2013 ✓ Done
    • Data: 2.12.2014 ✓ Done
  • Wikibooks
    • Sitelinks: ?
    • Data: ?
  • Wikiversity
    • Sitelinks: ?
    • Data: ?
  • Meta, MediaWiki, Wikispecies, Incubator
    • Sitelinks: ?
    • Data: ?

Not to be done (yet)

  • Wiktionary

Simple queries

bugzilla:52385

Users are able to pose simple queries to Wikidata via a special page as well as the API. Wikidata can answer queries like “What has the ISBN 2-01-202705-9?” or “What has the capital Paris?”. These queries are restricted to one property/value pair and return a list of items. The returned result only includes items where the statement is marked as preferred. These queries are most useful with one of the many identifiers in Wikidata that connect the knowledge base to other databases.

Not to be done

  • Querying for sources or qualifiers

Things to keep in mind

Some data types are easier to query than others. Time, Geo and Quantity values require range queries. For the Item and String data types, simple equality is sufficient.
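
The difference between the two kinds of lookup can be sketched in Python. This is only an illustration over in-memory data, not how Wikidata stores or queries statements; the property ids "P-isbn" and the sample values other than the ISBN and area quoted on this page are made up.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple


def build_equality_index(
    statements: List[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Index (item, property, value) triples by their (property, value) pair."""
    index: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for item_id, property_id, value in statements:
        index[(property_id, value)].append(item_id)
    return index


def range_query(sorted_pairs: List[Tuple[float, str]], low: float, high: float) -> List[str]:
    """Return items whose value lies in [low, high]; input must be sorted by value."""
    values = [value for value, _ in sorted_pairs]
    start = bisect_left(values, low)
    end = bisect_right(values, high)
    return [item_id for _, item_id in sorted_pairs[start:end]]


# String data type: one exact lookup answers the query.
index = build_equality_index(
    [("Q1", "P-isbn", "2-01-202705-9"), ("Q2", "P-isbn", "0-00-000000-0")]
)
print(index.get(("P-isbn", "2-01-202705-9"), []))

# Quantity data type: values carry uncertainty, so a range is searched instead.
areas = sorted([(891.85, "Q3"), (105.4, "Q4"), (1572.0, "Q5")])
print(range_query(areas, 891.0, 892.0))
```

An equality index needs only a hash lookup, while range queries require the values to be kept in sorted order, which is why the latter data types are harder to support.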

UI redesign

bugzilla:52136 and Wikidata:UI redesign input

Reading and editing Wikidata is joyful and intuitive on desktops, tablets and mobile phones. The interface is visually pleasing, integrates nicely with other Wikimedia projects and contains no jargon. The interface quickly provides users with the information they are looking for and does not overwhelm them (for example, deprecated data is hidden initially and information is ordered in an intuitive way). It invites the user to add additional information (including qualifiers and sources) and nudges them towards making correct and useful contributions by offering suggestions. Erroneous contributions and vandalism are discouraged. Navigating and editing the website is fast. Both the data and the interface are localized according to the user’s locale and language preferences. Where no data is available in a particular language, a fallback is used.

Not to be done

  • enforcing user-defined constraints on data input

Data usage tracking

bugzilla:47288

To ease maintenance of the data in Wikidata, it is possible to get a list of all articles in which certain data is used. Users are thereby able to see which articles are affected by the changes they are making. This also gives a better overview of where and how Wikidata’s data is used in Wikimedia’s projects.
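
Conceptually, usage tracking is a reverse index from an item to the pages that use its data. The Python sketch below illustrates that idea only; the UsageTracker class and the page names are hypothetical and do not describe the actual storage used by Wikibase.

```python
from collections import defaultdict
from typing import Dict, Set


class UsageTracker:
    """Reverse index: which client pages use data from which item."""

    def __init__(self) -> None:
        self._pages_by_item: Dict[str, Set[str]] = defaultdict(set)

    def record(self, page: str, item_id: str) -> None:
        """Note that rendering `page` accessed data from `item_id`."""
        self._pages_by_item[item_id].add(page)

    def pages_using(self, item_id: str) -> Set[str]:
        """List the pages affected by a change to `item_id`."""
        return set(self._pages_by_item.get(item_id, set()))


tracker = UsageTracker()
tracker.record("enwiki:Berlin", "Q64")
tracker.record("dewiki:Berlin", "Q64")
print(sorted(tracker.pages_using("Q64")))
```

With such an index, an editor changing an item can see in advance which articles will pick up the change.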

Not to be done

  • Usage tracking outside Wikimedia’s projects

Wikimedia Commons

bugzilla:64288 and commons:Commons:Structured data

Wikimedia Commons holds a huge number of multimedia files available for the other Wikimedia projects and the world to use. Structured data support for Wikimedia Commons is important to make the files easier to maintain and to make reuse, especially third-party reuse, easier. The structured data support comes in two ways. The first is by providing access to the data stored in Wikidata. This includes things like the date of birth of an artist. The second is by enabling Wikimedia Commons itself to store structured data related to the files stored there. This includes things like the license and subject of a photo.

When a new file is contributed, the uploader is asked to provide some information like tags, creator name and license in the upload wizard. Users are then able to access and edit this structured data via a form as well as an API (similar to how it is done on Wikidata). It is easy to specify and retrieve the licensing and provenance information of a multimedia file. Additionally, it is easy to tag and categorize images based on concepts from Wikidata. Tags and other file information are shown in the user’s language to accommodate the multilingual audience of Wikimedia Commons. All this information can be used to easily search for files that fit certain criteria like “picture of a cat and a child from 2010, licensed under CC-BY-SA”.

Technical details

The data is stored on a “data” page attached to the file’s page that is similar to Wikidata’s item pages. Commons is thereby a repository and at the same time its own client as well as a client of wikidata.org.


Hover cards

bugzilla:67434

When a user hovers over a link to an item, a small card is shown that holds the most important information about that item.

Improvements to constraints violation reports

It is easy for a user to find and understand constraint violation reports. Fixing an item that violates a constraint is easy. When viewing an item with a statement that violates a constraint the user can easily spot the wrong statement.

Access to data from arbitrary items

bugzilla:47930

Users on the client are able to include data from any Wikidata item they choose by specifying its ID. This expands on their ability to access data of the item currently associated with the page via a sitelink. This access is possible via both the parser function and Lua.

Quantities with units

bugzilla:63722

Users are able to enter quantitative data in Wikidata and re-use it on the clients as well as outside of Wikimedia. It is possible to express statements like “Berlin has an estimated population of 3,397,469 (+/-100) as of 31 July 2013” or “Berlin has an area of 891.85 km²”. At first only unitless quantities are supported. A later deployment adds a small number of units, and this set is expanded in future deployments. When viewing an item with such quantitative data, the user sees the values according to local conventions (decimal separator, unit conversion). On the client this data is accessed via the parser function and Lua and is shown according to the content language. The API provides a way to access the data in the format preferred by the request sender.

Not to be done (yet)

  • Letting users add arbitrary units
  • Currency conversion (Conversion rates and the value of a single currency change over time. That is very complex to model.)

Todo

Complex queries

bugzilla:65626

Users are able to write queries that are more complex than the simple queries. This includes queries like “all poets who lived in 1982” or “all cities with more than 1 million inhabitants”. They are entered (using the semantics embodied by Semantic MediaWiki’s Ask extension) on a page in the Query namespace and internally saved as JSON. They are then executed when resources are available, usually not immediately. The result is cached. A query can be set to rerun at regular intervals or on demand by an administrator. The result of the query is shown on the same page. It can also be accessed via the API. The clients can include the result of a query in their pages, for example to create list articles. The result will be a list of items. It can then be manipulated as needed by Lua. More result formatters and visualisations, based on Semantic MediaWiki’s result formatters, will be made available in future deployments. These queries make Wikidata even more useful to the Wikimedia projects and the world and are needed by the community to maintain the large database.
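
The deferred execution and caching described above could look roughly like this. It is a conceptual Python sketch under stated assumptions: the CachedQuery class, its method names and the explicit `now` timestamps are all hypothetical, and the query itself is represented by an arbitrary callable that returns item IDs.

```python
from typing import Callable, List, Optional


class CachedQuery:
    """A stored query whose result is cached and refreshed at intervals."""

    def __init__(self, run: Callable[[], List[str]], interval: float) -> None:
        self._run = run              # executes the query, returns item ids
        self._interval = interval    # seconds between scheduled reruns
        self._result: Optional[List[str]] = None
        self._last_run: Optional[float] = None

    def is_stale(self, now: float) -> bool:
        """True if the query never ran or its rerun interval has elapsed."""
        return self._last_run is None or now - self._last_run >= self._interval

    def refresh(self, now: float) -> None:
        """Execute the query and cache its result (scheduled or on demand)."""
        self._result = self._run()
        self._last_run = now

    def result(self) -> Optional[List[str]]:
        """Return the cached result without running the query."""
        return self._result


query = CachedQuery(run=lambda: ["Q1", "Q2"], interval=3600.0)
print(query.result())          # nothing cached before the first run
if query.is_stale(now=0.0):
    query.refresh(now=0.0)     # a scheduler would do this when resources allow
print(query.result())
```

Separating `result` from `refresh` mirrors the text: readers always get the cached list, while execution happens later, either on schedule or when an administrator triggers it.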

Optional

  • transitive queries
  • disjunction

Article history integration

bugzilla:40358

Editors on a client can look at the change history of an article and see all Wikidata changes relating to this article. This way they can see all changes affecting their article without having to go to another project.

RDF dump

bugzilla:44581

For 3rd party re-use and analysis of the data in Wikidata, we provide RDF dumps in addition to the regular XML dumps. The RDF dump contains the RDF representation of all entities, for use in semantic web applications. This RDF will not assert facts, but rather represent claims. The RDF representation of individual entities is also available via Wikidata’s linked data interface (Special:EntityData).

Geo-shape datatype

bugzilla:55549

Users can enter geo-shapes in Wikidata. They can, for example, use them to store the outline of a country.

Technical details

This will likely be realised using leaflet.js.

Multi-lingual text datatype

Users can add strings and specify a language for each of them. This is similar to the mono-lingual text datatype. However, translations in more than one language can be provided. The one in the user’s language is shown and the others can be shown on demand.

Beyond the planning horizon of this plan