Wikidata:Development plan

Below is the roadmap of the Wikidata development team (Wikimedia Germany) for Wikidata and Wikibase for 2020. If you have any questions or comments, feel free to write on Wikidata:Contact the development team.

Please note that the roadmap presents the main projects that the development team will work on during 2020. Things like maintenance of the software and fixing pressing bugs are not mentioned in the roadmap, but will be included in the workflow over the year. The roadmap is based on estimates and will evolve during the year. The roadmap doesn't contain events we're attending or organizing.

The most up-to-date version of the roadmap can be found on our online project management tool. You can click on the different items to see further information, like a description or planned start date. You will also find screenshots in the sections below.

Wikidata as a platform

Wikidata roadmap 2020 - Wikidata as a platform - version of January 2020

Increase data quality and trust

Feedback loops with data re-users

We want to work with large data re-users to get their feedback and improvements for our data in a way that works for our community.

Automated finding of references based on semantic markup

Too many statements on Wikidata don't have a reference. For some of these statements we can automatically find references. We can do this by comparing our data with data in linked websites (marked up with schema.org or similar markup). These websites could be linked from identifiers, references or even connected Wikipedia articles.
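
As a rough illustration of the comparison described above (a sketch under stated assumptions, not the team's implementation), the following reads schema.org data embedded as JSON-LD in a linked page and proposes the page as a reference when it states the same value. The class and function names, and the choice of JSON-LD over other markup formats, are assumptions made for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed <script type="application/ld+json"> blocks from a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed markup is skipped rather than guessed at.


def find_reference_candidates(html, url, statement_value, schema_property):
    """Return [url] if the page's markup states the same value, else []."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for doc in parser.documents:
        if isinstance(doc, dict) and str(doc.get(schema_property)) == statement_value:
            return [url]
    return []
```

A match would only be a candidate: an editor or a further check would still need to confirm that the page is a suitable source.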

Checks against 3rd party databases

Via external identifiers we have connections to a lot of other databases. We can compare our data against their data and highlight differences so editors can look into them and fix them when needed. We want to build a system that is extensible so that anyone can do the mapping for another 3rd party database in the future.
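
One way to picture such an extensible check is a per-database mapping from Wikidata Properties to fields in the external record. The sketch below is only an illustration: the function name, the record layout and the external field names are hypothetical, while P569 (date of birth) and P570 (date of death) are existing Wikidata Properties.

```python
def find_mismatches(wikidata_values, external_record, mapping):
    """Compare values property by property and report the differences.

    mapping maps a Wikidata Property ID to the corresponding field name
    in the external database (hypothetical field names in this example).
    """
    mismatches = []
    for prop, field in mapping.items():
        # Only compare values that both sides actually have.
        if prop not in wikidata_values or field not in external_record:
            continue
        ours = wikidata_values[prop]
        theirs = external_record[field]
        if ours != theirs:
            mismatches.append((prop, ours, theirs))
    return mismatches


# Supporting another database would mean writing another mapping like this one.
EXAMPLE_MAPPING = {"P569": "birth_date", "P570": "death_date"}
```

The output is a list of differences for editors to review, not a list of corrections: either side may be the one that is wrong.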

Improve quality scoring for Items

Scoring of Items in Wikidata is possible now using ORES. However, the scoring is not yet accurate enough and needs improvement.

Finding problems in the ontology

We want to make it easier for people to find issues in Wikidata's ontology because modeling inconsistencies and related problems are a big obstacle for data reuse.

Tainted references - persistence

We have developed the first version of tainted references. Depending on feedback from editors, we want to make the indicator persistent so that more people can see and clean up mismatching value/reference pairs.

Expose data about current events (Prototype)

Current events are a common target of vandalism while also being important for re-users. We should find ways to expose them on Wikidata so editors can better keep an eye on them.

Finding gaps and biases

We want to find more ways for people to find biases and gaps in the data in Wikidata so we can work on making our data less biased and more complete.

Research for easier monitoring of Items

Institutions are interested in more easily monitoring how their data is changed. We need to research how to make this possible.

Evaluation of data quality of a subset of Items

It is important that data re-users have trust in Wikidata's data quality and can easily access useful information about the data quality of the specific parts of Wikidata that are important to their work.

Completeness indicators

Within Wikidata, it is sometimes difficult to determine the completeness of an Item or a particular area of knowledge. Re-users of our data are interested in knowing this information when evaluating whether Wikidata is the right data source for their needs.
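
A very simple indicator, shown here only as an assumption about what such a measure could look like, is the share of expected Properties that an Item actually has. Which Properties count as "expected" for a given kind of Item is left open; the function name and the example IDs are placeholders.

```python
def completeness(item_properties, expected_properties):
    """Return the fraction of expected Properties present on an Item (0.0 to 1.0)."""
    expected = set(expected_properties)
    if not expected:
        return 1.0  # Nothing is expected, so nothing is missing.
    return len(expected & set(item_properties)) / len(expected)
```

Averaging this score over all Items in an area of knowledge would give a rough figure for that area, though it says nothing about whether the existing values are correct.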

Curious facts (prototype)

Curious facts are potential indicators for wrong data. After analyzing the data in Wikidata for curious facts, we can then make them visible to users within the interface. This will make it possible for editors to check on the data to determine which, if any, corrections are needed.
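
To make the idea concrete, here is a minimal sketch of one possible check. The rules (a date of death before the date of birth, or an implausibly long lifespan), the threshold and the function name are assumptions chosen for illustration, not rules taken from the roadmap.

```python
from datetime import date


def curious_facts(item_id, birth, death, max_age_years=125):
    """Return human-readable notes about suspicious birth/death dates.

    birth and death are ISO 8601 date strings (YYYY-MM-DD).
    max_age_years is an arbitrary threshold picked for this example.
    """
    facts = []
    born = date.fromisoformat(birth)
    died = date.fromisoformat(death)
    if died < born:
        facts.append(f"{item_id}: date of death precedes date of birth")
    elif (died - born).days > max_age_years * 365.25:
        facts.append(f"{item_id}: lifespan exceeds {max_age_years} years")
    return facts
```

A flagged fact is not necessarily wrong; it is a hint that an editor should take a look.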

Build out the ecosystem

Expand GLAM strategy

We need to expand the strategy with a focus on GLAM market size and opportunities.

Merge UI

Wikidata has a gadget to improve the workflow of merging two Items. We should incorporate that into the main Wikibase codebase so every Wikibase installation benefits.

Investigate and include core gadgets as part of Wikibase (excl. merge gadget)

Incorporating core gadgets can reduce the time and complexity of Wikibase setup for non-Wikidata.org use cases, as well as increase the usability of Wikibase for third parties. We will first conduct a review of the available gadgets and then incorporate the most widely useful ones into the Wikibase system.

Encourage more data use

Page rank for Items (Prototype)

Wikidata has a lot of Items. When querying for a list of Items, you often want to order them by some kind of importance measure. Each Item in Wikidata should get a score that tries to represent how important it is compared to other Items.
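
The heading suggests a PageRank-style score. The following is a textbook power-iteration sketch over a small link graph, given only to illustrate the technique; how links between Items would be defined, and the damping factor and iteration count, are assumptions of this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    links maps each Item ID to the list of Item IDs it links to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

The scores sum to 1, so they can be used directly to order a list of Items from most to least linked-to.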

Easier access to data for programmers

We want to improve our APIs to make it easier for programmers to access our data.

Query Service UI improvements

We will enhance the usability of the Query Service by making improvements and adjustments to the UI.

Incorporate feedback into partnership model V1

The first version of the partnership model has been published and we now need to incorporate feedback we received for it.

Data partnership model expansion - version 2

After reviewing and incorporating feedback received on the first version of the partnership model, we will expand the model and publish a second version to the community.

Enable more diverse data and users

Investigate PanLex integration

By transforming thousands of translation dictionaries into a single common structure, the PanLex database makes it possible to derive billions of lexical translations that are not found in any single dictionary. We are investigating how Wikidata/Wikibase and PanLex can work together for the benefit of all.

Accessibility evaluation

We don’t want to exclude any users through technical barriers. As a first step, we need to understand the current state.

Usability problems/UX debt

Over the years a number of UX issues have accumulated in Wikibase. We need to tackle the worst ones.

Support the creation of underrepresented knowledge

To support the mission of giving more people more access to more knowledge, it is important to increase the diversity of knowledge in Wikidata. We will explore how to best connect this data to Wikidata and try to identify if there are any unintentional ways that Wikidata's method of storing knowledge might not support participation by certain communities.

Interface improvements for lexicographical data

To increase the usability of lexicographical data in Wikidata, we will ensure that there is a stable interface that allows people to edit and reuse this data more easily.

Query builder for lists

Queries and lists are an integral part of accessing the data in Wikidata and making sense of it. Right now creating lists requires knowledge of SPARQL. We want to make it easier for people to create lists without having to know SPARQL.
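
To show what a query builder has to produce behind the scenes, here is a minimal sketch that turns simple "Property = Item" conditions into a SPARQL query. The function name and its interface are hypothetical; the `wd:`, `wdt:`, `wikibase:` and `bd:` prefixes are the ones predefined by the Wikidata Query Service.

```python
import re

_PROPERTY_ID = re.compile(r"P[1-9][0-9]*")
_ITEM_ID = re.compile(r"Q[1-9][0-9]*")


def build_list_query(conditions, limit=100):
    """Build a SPARQL query listing Items that satisfy all conditions.

    conditions is a list of (property_id, item_id) pairs, e.g. ("P31", "Q146").
    """
    lines = ["SELECT ?item ?itemLabel WHERE {"]
    for prop, item in conditions:
        # Reject anything that is not a plain ID so user input cannot alter the query.
        if not _PROPERTY_ID.fullmatch(prop) or not _ITEM_ID.fullmatch(item):
            raise ValueError(f"invalid condition: {prop!r} {item!r}")
        lines.append(f"  ?item wdt:{prop} wd:{item} .")
    lines.append('  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }')
    lines.append("}")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)
```

For example, `build_list_query([("P31", "Q146")], limit=5)` asks for five Items that are an instance of (P31) house cat (Q146), without the user having to write any SPARQL.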

Other

Design system

We want all parts of Wikidata and Wikibase to be consistent to provide a nicer user experience. The design system will help us with that by defining a number of standard components and patterns.

Normalization of wb_terms table

The wb_terms table has grown in size to the point where the infrastructure can't cope with its growth anymore. We need to rearchitect the table to scale Wikidata better.
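
As a general illustration of what normalization means here (not a description of the actual new schema), the sketch below stores each distinct term text once and lets rows refer to it by a numeric ID, so that a text shared by many languages or Items is no longer repeated in every row. The row layout and function name are assumptions of this example.

```python
def normalize_terms(rows):
    """Split wide term rows into a text lookup table and compact term rows.

    rows is an iterable of (entity_id, language, term_type, text) tuples.
    Returns (text_ids, term_rows), where text_ids maps each distinct text to
    an integer ID and term_rows uses that ID instead of the text.
    """
    text_ids = {}
    term_rows = []
    for entity_id, language, term_type, text in rows:
        # Reuse the existing ID if this text was seen before.
        text_id = text_ids.setdefault(text, len(text_ids) + 1)
        term_rows.append((entity_id, language, term_type, text_id))
    return text_ids, term_rows
```

With labels such as "Berlin" being identical in many languages, the lookup table stays small while the per-row data shrinks to a few integers and short codes.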

Infrastructure analysis

We will have a company provide an outside view of our infrastructure, provide an evaluation and give recommendations. We then need to review the feasibility and fit of these recommendations.

Prototype of list/simple query storage and lookup infrastructure (Prototype)

We want to prototype the infrastructure that will allow us to make a query builder possible.

Vue component library

In order to make development of new features for Wikibase easier we need to build out a library of UI components that can be reused across all of Wikibase.

Improve Wikibase lower and midlevel documentation

We need to make it easier for new developers to understand Wikibase's architecture and codebase.

Wikibase extension registration and decoupling

Currently, the Wikibase extension is conceptually divided into Repo and Client components. These components are interdependent, not clearly separated, and often share code, both intentionally and unintentionally. This entanglement hurts productivity: a change to the Client part might unintentionally affect Repo, and the other way round.


Wikibase ecosystem

Wikidata roadmap 2020 - Wikidata as a platform - version of January 2020

Build out the ecosystem

User research with catalogers at GND

We are working with the German National Library on a Wikibase installation. One important thing to understand is how catalogers use their existing system today.

Wikibase community calls

The Wikibase community needs a venue for regular meetings and communication. We are coordinating calls on a regular basis to support this.

Wikibase community meetups

Throughout 2020, we will continue to foster and build the Wikibase community by supporting meetups.

Access Wikidata Properties in custom Wikibase instance (aka Federation)

To make setting up a new Wikibase instance faster and easier, and to enable federation of Wikibase instances, it is important that we allow other Wikibases to reuse Wikidata's ontology. In the MVP, we will first allow for the use of Wikidata's Properties.

Expand GLAM strategy

We need to expand the strategy with a focus on GLAM market size and opportunities.

Documentation for things to do after installing Wikibase

There are a number of things that people need to do after installing Wikibase in order to really be able to use it productively. We need to document them better.

Strategy and infrastructure for releasing Wikibase packages

As Wikibase is increasingly used outside Wikimedia, we need to set up proper release infrastructure and processes.

Update instructions for Wikibase installations

We need to document how to upgrade an existing Wikibase installation.

Merge UI

Wikidata has a gadget to improve the workflow of merging two Items. We should incorporate that into the main Wikibase codebase so every Wikibase installation benefits.

Research for data input forms

A regular request from people running Wikibase instances is that they would like an easy way to build forms for their editors so they can easily enter complete and accurate data. We need to research how to make this happen.

Support structures in the Wikibase Ecosystem

The Wikibase ecosystem needs providers for support, customization etc. We need to discuss possible setups and ground rules.

Installation pingback (prototype)

We would like to better understand how Wikibase is used out there. MediaWiki already has a mechanism for this. We want to adopt it for Wikibase.

Wikibase website

We need to continue to improve the Wikibase website. The focus will be to provide different target groups information that is more tailored to them.

MVP for Wikibase as a service platform

OpenCura has proven that it lets people spin up a working Wikibase instance in minutes. We need to continue working on it, clarify its direction and build it out.

Configuration discovery

As more and more Wikibase instances are being set up, it becomes harder to build tools on top of them due to the use of different configurations. We need to make it possible to programmatically discover the configuration of a Wikibase instance. This needs to include things like the property ID of same-as, the property ID of instance of, constraint definitions, and the wikis this instance is federated with.
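
One conceivable shape for this is a small machine-readable document that each instance publishes and tools read before doing anything else. The sketch below assumes a JSON format; every key name, the example values and the function name are hypothetical and only mirror the kinds of information listed above.

```python
import json

# Hypothetical key names, one per kind of information named in the text.
REQUIRED_KEYS = (
    "same_as_property",
    "instance_of_property",
    "constraint_definitions",
    "federated_with",
)


def parse_manifest(text):
    """Parse a configuration document and check that the expected keys exist."""
    manifest = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in manifest]
    if missing:
        raise ValueError("manifest is missing: " + ", ".join(missing))
    return manifest


# Example document for an imaginary Wikibase instance (placeholder IDs and URL).
EXAMPLE_MANIFEST = json.dumps({
    "same_as_property": "P2",
    "instance_of_property": "P1",
    "constraint_definitions": {"single_value": "Q10"},
    "federated_with": ["https://www.wikidata.org"],
})
```

A tool could then call `parse_manifest(...)["instance_of_property"]` instead of hard-coding Wikidata's P31, which is what makes it reusable across instances.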

Explore opportunities for more organisations to use Wikibase in their projects

We need to look beyond GLAMs and especially libraries as a target group for Wikibase.

Investigate and include core gadgets as part of Wikibase (excl. merge gadget)

Incorporating core gadgets can reduce the time and complexity of Wikibase setup for non-Wikidata.org use cases, as well as increase the usability of Wikibase for third parties. We will first conduct a review of the available gadgets and then incorporate the most widely useful ones into the Wikibase system.

Federation version 2

We will continue taking steps toward allowing non-Wikimedia installations of Wikibase to use Wikidata's ontology, an important component of Federation. In the second prototype we will likely allow for more complex Federation with local ontology.

Wikibase service defaults for non-Wikidata.org installations

Wikibase continues to be used more and more outside of Wikimedia projects, and the needs of third-party Wikibase users are diverse. A lot of the configuration options of Wikibase have defaults that make sense for Wikidata but not necessarily other Wikibase instances. We will review and adjust them.

Link to media on wikis other than Commons in statements

Wikibase users, particularly those in the GLAM sector, want to be able to link to/display media in Wikibase that is not and cannot be on Commons.

Make Query Service less specific to Wikidata

To better suit the needs of Wikibase use cases outside of Wikidata.org, we will make changes to the Query Service to move defaults away from Wikidata and toward Wikibase.

Automatic inclusion of local client site for sitelinks

When installing a new Wikibase, adding the local client site for sitelinks is an extra setup step. Including it automatically would avoid this step and make setup easier.

Encourage more data use

Easier access to data for programmers

We want to improve our APIs to make it easier for programmers to access our data.

Incorporate feedback into partnership model V1

The first version of the partnership model has been published and we now need to incorporate feedback we received for it.

Data partnership model expansion - version 2

After reviewing and incorporating feedback received on the first version of the partnership model, we will expand the model and publish a second version to the community.

Enable more diverse data and users

Investigate PanLex integration

By transforming thousands of translation dictionaries into a single common structure, the PanLex database makes it possible to derive billions of lexical translations that are not found in any single dictionary. We are investigating how Wikidata/Wikibase and PanLex can work together for the benefit of all.

Accessibility evaluation

We don’t want to exclude any users through technical barriers. As a first step, we need to understand the current state.

Usability problems/UX debt

Over the years a number of UX issues have accumulated in Wikibase. We need to tackle the worst ones.

Support the creation of underrepresented knowledge

To support the mission of giving more people more access to more knowledge, it is important to increase the diversity of knowledge in Wikidata. We will explore how to best connect this data to Wikidata and try to identify if there are any unintentional ways that Wikidata's method of storing knowledge might not support participation by certain communities.

Other

Design system

We want all parts of Wikidata and Wikibase to be consistent to provide a nicer user experience. The design system will help us with that by defining a number of standard components and patterns.

Normalization of wb_terms table

The wb_terms table has grown in size to the point where the infrastructure can't cope with its growth anymore. We need to rearchitect the table to scale Wikidata better.

Infrastructure analysis

We will have a company provide an outside view of our infrastructure, provide an evaluation and give recommendations. We then need to review the feasibility and fit of these recommendations.

Vue component library

In order to make development of new features for Wikibase easier we need to build out a library of UI components that can be reused across all of Wikibase.

Improve Wikibase lower and midlevel documentation

We need to make it easier for new developers to understand Wikibase's architecture and codebase.

Wikibase extension registration and decoupling

Currently, the Wikibase extension is conceptually divided into Repo and Client components. These components are interdependent, not clearly separated, and often share code, both intentionally and unintentionally. This entanglement hurts productivity: a change to the Client part might unintentionally affect Repo, and the other way round.

OPEN!NEXT

OPEN!NEXT is made up of 19 partners from seven European countries and seeks to facilitate wider adoption of open source hardware development practices by businesses. Our role in this initiative includes setting up Wikibase instances, data modelling, maintenance, and API development.

Wikidata for the Wikimedia projects

Wikidata roadmap 2020 - Wikidata as a platform - version of January 2020

Encourage more data use

Automated list generation

Wikimedia projects have a large number of lists like the list of monuments in a certain county or the heads of government of a country. These lists could be generated and updated automatically based on a query to Wikidata. With this, we want to take maintenance work off smaller projects in particular. (Elaborate lists on larger projects are too complex for now.)
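
As a small illustration of the last step of such a pipeline, the sketch below renders rows that a query might return as a wikitext table. The function name and the shape of the rows (one dictionary per result) are assumptions of this example; running the query and saving the page are left out.

```python
def render_list(rows, columns):
    """Render query result rows as a wikitext table.

    rows is a list of dictionaries; columns lists the keys to show, in order.
    """
    lines = ['{| class="wikitable"', "! " + " !! ".join(columns)]
    for row in rows:
        lines.append("|-")
        # Missing values become empty cells instead of raising an error.
        lines.append("| " + " || ".join(str(row.get(column, "")) for column in columns))
    lines.append("|}")
    return "\n".join(lines)
```

Regenerating the table whenever the underlying data changes is what would remove the manual upkeep for editors.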

Wikidata Bridge

We want to make it easier for editors on the other Wikimedia projects to edit data in Wikidata directly without having to leave their wiki.

Access to lexicographical data from Wiktionary

The lexicographical data in Wikidata needs to be accessible via Lua functions.

Enable more diverse data and users

Accessibility evaluation

We don’t want to exclude any users through technical barriers. As a first step, we need to understand the current state.

Usability problems/UX debt

Over the years a number of UX issues have accumulated in Wikibase. We need to tackle the worst ones.

Support the creation of underrepresented knowledge

To support the mission of giving more people more access to more knowledge, it is important to increase the diversity of knowledge in Wikidata. We will explore how to best connect this data to Wikidata and try to identify if there are any unintentional ways that Wikidata's method of storing knowledge might not support participation by certain communities.

Query builder for lists

Queries and lists are an integral part of accessing the data in Wikidata and making sense of it. Right now creating lists requires knowledge of SPARQL. We want to make it easier for people to create lists without having to know SPARQL.

Other

Design system

We want all parts of Wikidata and Wikibase to be consistent to provide a nicer user experience. The design system will help us with that by defining a number of standard components and patterns.

Vue component library

In order to make development of new features for Wikibase easier we need to build out a library of UI components that can be reused across all of Wikibase.

Wikibase extension registration and decoupling

Currently, the Wikibase extension is conceptually divided into Repo and Client components. These components are interdependent, not clearly separated, and often share code, both intentionally and unintentionally. This entanglement hurts productivity: a change to the Client part might unintentionally affect Repo, and the other way round.

See also