Wikidata:Partnerships and data imports

This page provides a space to discuss importing data from external sources and forming partnerships with external organisations.


You may also find these related resources helpful:

  • Data Import Hub
  • Why import data into Wikidata
  • Learn how to import data
  • Bot requests
  • Ask a data import question
  • Data Import Archive
Please take a look at the Wikidata frequently asked questions to see if your question has already been answered.
Also see status updates to keep up-to-date on important things around Wikidata.
IRC channel: #wikidata

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose oldest comment is older than 30 days.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2017/02.

European Red List of Habitats

Hi, I recently came across this list [1] and thought it would be feasible to import it into Wikidata. The raw database is in Microsoft Access format [2]. I don't have experience with importing data into Wikidata, or with Access for that matter. GoEThe (talk) 10:32, 24 January 2017 (UTC)

Hi GoEThe, this sounds like a great addition. Please head to the Wikidata:Data Import Hub and make a request to import the dataset there. Thanks, --John Cummings (talk) 11:57, 23 February 2017 (UTC)

Curating a database of institutions inside Wikidata?

User:Pigsonthewing has made me aware that ORCID, Inc. (Q19861084) has started thinking about launching a new database of organizations. I wonder if it would be possible for them to curate such a database directly inside Wikidata: curating the database would just amount to editing Wikidata like other editors. I see multiple advantages over starting a new database from scratch:

  • The existing data: Wikidata already contains a lot of organizations, which are often aligned to other databases;
  • The existing back-links: some databases, such as VIAF or GRID, already link back to Wikidata;
  • The editing community, both from Wikidata itself and from other Wikimedia projects: editors are database curators that you do not need to hire!
  • The editing, querying, resolution, disambiguation and reconciliation tools which are already in place;
  • The possibility to run away with the data and the software if they are not satisfied with the ecosystem.

The downsides are:

  • Community consensus would be needed for large-scale heavy changes (although this can be mitigated by extracting the data into an external database)
  • Notability of the items: although I do not think that this is a problem for this particular project, the database curators might want to create records that would not satisfy the inclusion criteria of Wikidata
  • Volatility of the data: it is possible to vandalize Wikidata, so the data might be temporarily inconsistent or unavailable. But errors and downtime also happen with traditionally curated databases.
  • No control over the appearance of native identifiers (Wikidata item ids). This could be mitigated by creating an identifier that resolves to the relevant Wikidata item, and which could be added to the item itself as a property if needed. This would make it possible to query using the new identifier both inside and outside Wikidata. As the resolver would be controlled by ORCID, it makes migrating away from Wikidata even easier: you only need to point the resolver at the external database.
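
The resolver idea in the last point could look roughly like the sketch below. It is only an illustration under stated assumptions: the identifier format ("ORG-0001"), the class name OrgIdResolver and the external registry URL are all hypothetical, and nothing here reflects what ORCID actually plans. The only real pieces are the Wikidata entity URL pattern and the item id Q19861084 mentioned above.

```python
# Wikidata's entity URL pattern; {qid} is filled with an item id such as Q19861084.
WIKIDATA_ENTITY_URL = "https://www.wikidata.org/entity/{qid}"


class OrgIdResolver:
    """Resolve hypothetical organisation identifiers to a URL."""

    def __init__(self, id_to_qid):
        # Mapping from the new identifier to a Wikidata item id (assumed data).
        self.id_to_qid = dict(id_to_qid)
        # None means "resolve to Wikidata"; setting a template migrates away.
        self.external_template = None

    def switch_to_external(self, url_template):
        """Point the resolver at an external database instead of Wikidata."""
        self.external_template = url_template

    def resolve(self, org_id):
        if org_id not in self.id_to_qid:
            raise KeyError(f"unknown organisation identifier: {org_id}")
        if self.external_template is not None:
            return self.external_template.format(org_id=org_id)
        return WIKIDATA_ENTITY_URL.format(qid=self.id_to_qid[org_id])


resolver = OrgIdResolver({"ORG-0001": "Q19861084"})
print(resolver.resolve("ORG-0001"))

# Migration: the identifiers stay the same, only the target changes.
resolver.switch_to_external("https://registry.example.org/{org_id}")
print(resolver.resolve("ORG-0001"))
```

The point of the design is that consumers only ever see the new identifier, so switching the backend does not break any existing links.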

Otherwise, it seems to me that Wikidata would tick all the boxes of this technical whitepaper.

My questions are:

  • is there any institution that "runs" a database inside Wikidata currently?
  • does Wikidata have any guidelines / policies on the subject?
  • do you think it is worth approaching them about that?

Pintoch (talk) 12:03, 17 February 2017 (UTC)

While it is not for me to speak for ORCID in this context (Disclosure: I am Wikipedian in Residence with ORCID), Wikidata's notability criteria would indeed preclude such use, especially while there are deletions with no prior discussion and recovering IDs for deleted items is apparently not possible. WMF policies on paid editing may also present difficulties to organisations wishing to update the record about themselves or their subsidiaries. I would expect that the proposed new "registry for organisations" would mirror the ORCID registry in having an API with features which Wikidata does not, and thus being available for integration in others' back-end systems. Our friends at ORCID are, of course, aware of Wikidata. I am sure that once a new organisational identifier emerges, we would create a property for it here on Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:55, 17 February 2017 (UTC)