Wikidata:Data Import Hub

This page is a hub to organise importing data from external sources.

To request a data import, please see the section below. The basic process for importing a dataset is:

  1. Dataset import is requested
  2. The data import is planned and formatted by the community
  3. The data is imported through a bot request

A list of imported data sets is available here.


Request a data import

  1. Create an account by clicking Create an account in the top right-hand corner of the page.
  2. Enable email (this will allow Wikidata users to email you to notify you about discussions about the dataset).
  3. Click the Request a data import button at the top of this page.
  4. Add the name of the dataset in the Subject field.
  5. Fill in the preloaded fields.

Instructions for data importers

Please include notes on all steps of the process; instructions for doing so can be found here.

Once a data set has been imported into Wikidata, please remove it from the list below and add it to the imported data sets page.

Census of Population data of Philippines Cities, Municipalities, Provinces and Regions (1903-2007)

Workflow

Description of dataset
  • Name: Census of Population data of Philippines Cities, Municipalities, Provinces and Regions (1903-2007)
  • Source: Philippine Statistics Authority
  • Link: Web.Archive.org upload (as Philippines public domain FOI request)
  • Description: Census of Population data of Philippines Cities, Municipalities, Provinces and Regions (1903-2007)
Create and import data into spreadsheet
  • Link: Web.Archive.org upload (as Philippines public domain FOI request)
  • Done:
  • To do: -
  • Notes: -
Structure of data within Wikidata
  • Structure: Population (P1082) (see the sketch after this table)
  • Example item: Dasol (Q41917), Urdaneta (Q43168), Pangasinan (Q13871), Ilocos Region (Q12933)
  • Done:
  • To do: -
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do: -
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:
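Below is a minimal sketch of how the matched census figures could be turned into QuickStatements (v1) commands for population (P1082) with a point in time (P585) qualifier, once each place has been matched to its QID. The file name and the column names (qid, year, population) are assumptions for illustration; the real spreadsheet layout may differ.

  import csv

  # Sketch only: assumed input columns are qid, year, population.
  # Each row becomes a population (P1082) statement with a
  # point in time (P585) qualifier at year precision (/9).
  with open("census_matched.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          qid = row["qid"]              # e.g. Q41917 (Dasol)
          year = int(row["year"])       # census year, e.g. 2007
          population = int(row["population"])
          print(f"{qid}\tP1082\t{population}\tP585\t+{year}-00-00T00:00:00Z/9")

The printed lines can then be pasted into the QuickStatements tool.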

Discussion

World Heritage Sites

Workflow

Description of dataset
  • Name: World Heritage sites
  • Source: UNESCO World Heritage Centre
  • Link: http://whc.unesco.org/en/list
  • Description: A database of the World Heritage sites
Create and import data into spreadsheet
  • Link: here
  • Done: All
  • To do: -
  • Notes: -
Structure of data within Wikidata
  • Structure: World Heritage Site ID (P757), World Heritage criteria (2005) (P2614), heritage status (P1435) = World Heritage Site (with start time as qualifier)
  • Example item: Q4176
  • Done: All
  • To do: -
Format the data to be imported
  • Done: All
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do: Inception (P571) for the remaining items (dates can be found in the site descriptions on the World Heritage website); see the query sketch after this table
  • Notes:
Importing data into Wikidata
  • Done: All except inception (construction date)
  • To do: -
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:
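To support the remaining inception (P571) work, a query against the Wikidata Query Service can list World Heritage Site items that still lack an inception date. This is a sketch run against the live service; it selects items whose heritage status (P1435) is World Heritage Site (Q9259).

  import requests

  # Sketch: World Heritage Site items (P1435 = Q9259) without inception (P571).
  QUERY = """
  SELECT ?site ?siteLabel WHERE {
    ?site wdt:P1435 wd:Q9259 .
    FILTER NOT EXISTS { ?site wdt:P571 ?inception . }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }
  LIMIT 100
  """

  response = requests.get(
      "https://query.wikidata.org/sparql",
      params={"query": QUERY, "format": "json"},
      headers={"User-Agent": "data-import-hub-sketch/0.1"},
  )
  for binding in response.json()["results"]["bindings"]:
      print(binding["site"]["value"], binding["siteLabel"]["value"])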

Discussion

UNESCO list of journalists who were killed in the exercise of their profession

Workflow

Description of dataset
  • Name: journalists who were killed in the exercise of their profession
  • Source: UNESCO
  • Link: http://www.unesco.org/new/en/communication-and-information/freedom-of-expression/press-freedom/unesco-condemns-killing-of-journalists/
  • Description: Yearly lists of journalists who were killed in the exercise of their profession, collated by UNESCO
Create and import data into spreadsheet
  • Link: here
  • Done: Import data
  • To do: Manual work on the job and employer columns
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

I don't understand how to include the official condemnation of the killing by UNESCO and the responses by the governments --John Cummings (talk) 15:44, 6 December 2016 (UTC)

UNESCO Atlas of the World's Languages in Danger

Workflow

Description of dataset
  • Name: UNESCO Atlas of the World's Languages in Danger
  • Source: UNESCO
  • Link: http://www.unesco.org/languages-atlas/
  • Description: A database of the world's endangered languages
Create and import data into spreadsheet
  • Link: here
  • Done: All
  • To do: -
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done: All
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do: Matching in Mix n' Match
  • Notes:
Importing data into Wikidata
  • Done: Imported into Mix n' Match
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

UNESCO Art Collection

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link: here
  • Done: Imported data on all the artworks
  • To do: Add links to the individual pages of the artworks
  • Notes: Not available as a structured database; the database was created by hand
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

UNESCO Memory of the World Programme

Workflow

Description of dataset
  • Name: UNESCO Memory of the World Programme
  • Source: UNESCO
  • Link: http://www.unesco.org/new/en/communication-and-information/flagship-project-activities/memory-of-the-world/homepage/
  • Description: An international initiative launched to safeguard the documentary heritage of humanity
Create and import data into spreadsheet
  • Link: here
  • Done: All
  • To do: -
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done: All
  • To do:
  • Notes:
Match the data to existing data
  • Done: Mix n' Match
  • To do:
  • Notes:
Importing data into Wikidata
  • Done: Mix n' Match
  • To do: Next steps
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

UNESCO Lists of Intangible Cultural Heritage and the Register of Best Safeguarding Practices

  • Name of dataset: UNESCO Lists of Intangible Cultural Heritage and the Register of Best Safeguarding Practices
  • Source: UNESCO
  • Link: http://www.unesco.org/culture/ich/en/lists
  • Description: The UNESCO international register of Intangible Cultural Heritage
  • Request by: John Cummings (talk) 17:20, 6 December 2016 (UTC)

Workflow

Description of dataset
  • Name: UNESCO Lists of Intangible Cultural Heritage and the Register of Best Safeguarding Practices
  • Source: UNESCO
  • Link: http://www.unesco.org/culture/ich/en/lists
  • Description: The UNESCO international register of Intangible Cultural Heritage
Create and import data into spreadsheet
  • Link: here
  • Done: All
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done: All
  • To do:
  • Notes:
Match the data to existing data
  • Done: Imported into Mix n' Match
  • To do: Match on Mix n' Match
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

European Red List of Habitats

The European Red List of Habitats provides an entirely new and all-embracing tool to review commitments for environmental protection and restoration within the EU 2020 Biodiversity Strategy. In addition to the assessment of threat, a unique set of information underlies the Red List for every habitat: from a full description to distribution maps, images, links to other classification systems, details of occurrence and trends in each country, and lists of threats with information on restoration potential. All of this is publicly available in PDF and database format (see links below), so the Red List can be used for a wide range of analyses. The Red List complements the data collected on Annex I habitat types through Article 17 reporting, as it covers a much wider set of habitats than those legally protected under the Habitats Directive.

  • Request by: GoEThe (talk) 12:04, 23 February 2017 (UTC)

Workflow

Description of dataset
  • Name: European Red List of Habitats
  • Source: European Commission
  • Link: [1]
  • Description: Current status of all natural and semi-natural terrestrial, freshwater and marine habitats in Europe.
Create and import data into spreadsheet
  • Link: [2]
  • Done: All data imported to spreadsheet
  • To do: Check coding in sheet "European Red List of Habitats", formatting of names with diacritics.
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

DB Netz Betriebsstellenverzeichnis

  • Name of dataset: DB Netz Betriebsstellenverzeichnis (Open-Data-Portal)
  • Source: DB Netz AG (infrastructure department of Germany’s national railway company)
  • Link: https://data.deutschebahn.com/dataset/data-betriebsstellen (the latest one, currently from 2017-01)
  • Description:
    1. Abk: The abbreviation used for operational purposes („Ril 100“, formerly „DS 100“). Import to station code (P296).
    2. Name: The full name. Import to official name (P1448).
    3. Kurzname: Name variant abbreviated to fit within 16 characters. Import to short name (P1813).
    4. Typ: Type of location. Import to instance of (P31). I’m suggesting to restrict the import to Bf (no label (Q27996466)), Hp (no label (Q27996460)), Abzw (no label (Q27996464)), Üst (no label (Q27996463)), Anst (no label (Q27996461)), Awanst (no label (Q27996462)) and Bk (no label (Q27996465)) (including combinations of those like „Hp Anst“, but not variants like „NE-Hp“) for now.
    5. Betr-Zust: Whether the location is only planned or no longer exists. I’m suggesting not to automatically import anything with a value here.
    6. Primary Location Code: The code from TSI-TAP/TSI-TAF. Import to station code (P296).
    7. UIC: Which country the location is in. I’m suggesting to restrict the import to Germany (80) for now.
    8. RB: Which regional section of DB Netz is responsible for this location. I’m suggesting not to automatically import those which don’t have a value here after the other suggested filtering steps. In other words: do not import those without a value here, but otherwise ignore the value.
    9. gültig von: Literally translates to „valid from“, but honestly I don’t know which date exactly this refers to. Anyway: not relevant, or maybe don’t import those newer than 2017-01-01.
    10. gültig bis: Literally translates to „valid until“, same as before, just the end of validity. Not relevant.
    11. Netz-Key: Add zeroes on the left until it’s six digits long, prepend the UIC country code and import to UIC station code (P722). (A conversion sketch follows this list.)
    12. Fpl-rel: Whether this can be ordered as part of a train path. Not relevant.
    13. Fpl-Gr: Whether the infrastructure manager (for the Germans around: that’s the EIU) responsible for creating the train’s timetable may change here. Not relevant.
  • Note about my usage of „P296“ in the description section above: It’s not really clear to me how P296 is supposed to be used. Maybe a new property would be better, so read this as „P296 or new property“. Note that there are already items with those codes in P296, which would need to be changed to whatever representation is chosen.
  • Request by: --Nenntmichruhigip (talk) 19:52, 21 March 2017 (UTC)
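A minimal sketch (under assumptions) of the Typ filtering and the Netz-Key → UIC station code (P722) transformation described above. The file name, the semicolon delimiter and the exact column headers are assumptions taken from the column descriptions; the downloaded file may differ.

  import csv

  # Sketch of the filtering and transformation proposed above; not a definitive implementation.
  ALLOWED_TYPES = {"Bf", "Hp", "Abzw", "Üst", "Anst", "Awanst", "Bk"}

  def allowed(typ: str) -> bool:
      # Accept combinations like "Hp Anst", but reject variants like "NE-Hp".
      parts = typ.split()
      return bool(parts) and all(p in ALLOWED_TYPES for p in parts)

  with open("betriebsstellen.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f, delimiter=";"):
          if (row["UIC"] != "80"                 # restrict to Germany for now
                  or row["Betr-Zust"].strip()    # skip planned / closed locations
                  or not row["RB"].strip()       # skip rows without a regional section
                  or not allowed(row["Typ"])):
              continue
          # Netz-Key: left-pad to six digits, prepend the UIC country code (80),
          # giving the value proposed for UIC station code (P722).
          uic_station_code = "80" + row["Netz-Key"].zfill(6)
          print(row["Abk"], row["Name"], uic_station_code)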

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Protected Planet dataset for Germany

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

List of Museums of São Paulo/Brazil

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Debates of the Constituent Assembly of Jura

Workflow

Description of dataset
  • Name: Debates of the Constituent Assembly of Jura
  • Source: Jura cantonal archives
  • Link: http://www.jura.ch/DFCS/OCC/ArCJ/Projets/Archives-cantonales-jurassiennes-Projets.html
  • Description: Sound collection of the plenary sessions of the Constituent Assembly of the canton of Jura in Switzerland
Create and import data into spreadsheet
  • Link: https://docs.google.com/spreadsheets/d/1dqt8hwk9Wp8o5n9i4umoLX-uorW3q7YSpmOpd1FeRD4/edit?usp=sharing
  • Done:
  • To do:
  • Notes: The Wikimedia Commons page with the sound tracks already exists
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

For the Workshop Wiki SUPSI - Chapter 2 we are looking into how to add this database to Wikidata. The database was provided by Ilario Valdelli of Wikimedia Switzerland to act as a case study for the viability of adding the metadata of Wikimedia content (in this specific case, an audio recordings collection).

We will work on documenting the process in order to provide a real example for archives and institutions in Switzerland, to encourage them to use Wikidata as a database too.

As this is the first time we are uploading to Wikidata, we would like to have the chance to discuss and find the best way to import this data and define the properties for the audio content.

Ethnologue's EGIDS language status

Workflow

Description of dataset
  • Name: EGIDS language status
  • Source: Ethnologue
  • Link: https://www.ethnologue.com/browse/codes
  • Description: Import the "Language Status" given on each language's page
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

What is the difference between these three-letter codes and ISO 639-3, which is also maintained by Ethnologue? (As far as I am aware they are the same.) Thanks, GerardM (talk) 17:29, 27 June 2017 (UTC)

@GerardM: it's the same --Beeyan (talk) 08:23, 9 August 2017 (UTC)

DB SuS and RNI Stationsdaten

  • Name of dataset: Stationsdaten DB Station&Service und DB RegioNetz Infrastruktur (Open-Data-Portal)
  • Source: DB Station&Service AG (passenger train station department of Germany’s national railway company) and DB RegioNetz Infrastruktur GmbH (infrastructure department of a regionally oriented subsection of Germany’s national railway company)
  • Link: https://data.deutschebahn.com/dataset/data-stationsdaten and https://data.deutschebahn.com/dataset/data-stationsdaten-regio (the latest one respectively, currently from 2016-07 and from 2016-01)
  • Description:
    1. Bundesland: Which federal state the station is in. Import to located in the administrative territorial entity (P131), if there isn’t a more specific value already (see also „Ort“ below).
      • BM: (DB SuS) Which station management (subregions of the regional areas; yes, Berlin central station has its own station management) is responsible for the station. Not sure how it should be imported. Probably the same as „RB“ in the import from DB Netz above.
      • Regionalbereich: (DB RNI) Which regional section operates the station. Not sure how it should be imported.
    2. Bf. Nr.: Station number in DB’s own system. Import to station code (P296).
    3. Station: The full name. Import to official name (P1448).
    4. Bf DS 100 Abk.: The abbreviation used for operational purposes („Ril 100“, formerly „DS 100“). Be careful about importing, as one passenger station may map to multiple operational stations (famous example: passenger station 1071 is all of BL, BLS, BHBF and BHBT). See the sketch after this list.
    5. Kat. Vst / Kategorie Vst: Category of the passenger station. Import to instance of (P31) with the appropriate subitem of German railway station categories (Q550637).
    6. Straße: Postal address. Ignore.
    7. PLZ: Postal code. Ignore.
    8. Ort: Which city the station is in. I’m not sure how accurate this is, but it seems good enough to import to located in the administrative territorial entity (P131) where there isn’t such a statement already.
    9. Aufgabenträger: Which authority (no label (Q29471795)) is mainly responsible for ordering the regional passenger transport services. Not sure if it should be imported.
    10. (the following three rows in the RNI table can be ignored)
  • Note about my usage of „P296“ in the description section above: See #DB Netz Betriebsstellenverzeichnis.
  • Request by: --Nenntmichruhigip (talk) 19:52, 21 March 2017 (UTC)
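A small sketch of the warning above about Ril 100 codes: it groups the DS 100 abbreviations by station number so that passenger stations mapping to several operational stations can be reviewed by hand before any import. The file name, the semicolon delimiter and the column headers are assumptions based on the column descriptions; the real download may differ.

  import csv
  from collections import defaultdict

  # Sketch: which passenger stations (Bf. Nr.) carry more than one DS 100 / Ril 100 code?
  codes_by_station = defaultdict(set)
  category_by_station = {}

  with open("stationsdaten.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f, delimiter=";"):
          nr = row["Bf. Nr."].strip()
          codes_by_station[nr].add(row["Bf DS 100 Abk."].strip())
          category_by_station[nr] = row["Kat. Vst"].strip()

  for nr, codes in sorted(codes_by_station.items()):
      if len(codes) > 1:
          # e.g. station 1071 -> BL, BLS, BHBF, BHBT; these need manual care
          print(nr, "category", category_by_station[nr], "->", ", ".join(sorted(codes)))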

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Berliner Malweiber

Workflow

Description of dataset
  • Name: Berliner Malweiber
  • Source: Stiftung Stadtmuseum Berlin
  • Link: https://www.stadtmuseum.de/ausstellungen/berlin-stadt-der-frauen
  • Description: Metadata relating to the museum's digitisation project Berliner Malweiber, involving works by female artists displayed by the museum in its exhibition Berlin – Stadt der Frauen (March–August 2016).
Create and import data into spreadsheet
  • Link: here
  • Done: Initial import of data into spreadsheet; metadata complemented with GND IDs where available.
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure: link
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

The data will be imported by User:Hgkuper in preparation for the digiS workshop A gentle introduction to WIKIDATA.

UNESCO Atlas of World Languages in Danger

  • Name of dataset: UNESCO Atlas of World Languages in Danger
  • Source: UNESCO
  • Link: http://www.unesco.org/languages-atlas/index.php
  • Description: UNESCO’s Atlas of the World’s Languages in Danger is intended to raise awareness about language endangerment; it provides information on numbers of speakers, relevant policies and projects, sources, ISO codes and geographic coordinates.
  • Request by: John Cummings (talk) 06:59, 10 June 2017 (UTC)

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done: Added to Mix n' Match
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes: There are several languages currently existing in Wikidata that the AWLD has more detail on, with a specific entry for each dialect. Once the data has been imported, a query should be run to find the items with multiple AWLD entries; these should be split into separate items for each dialect, linking back to the non-dialect item. (A query sketch follows this table.)
Date import complete and notes
  • Date complete:
  • Notes:
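A sketch of the query mentioned in the note above: once the AWLD identifiers are on Wikidata, items holding more than one of them can be listed via the Query Service. The property ID below is a placeholder (PXXXX), since the identifier property for this catalogue is an assumption here and would need to be replaced with the real one.

  import requests

  AWLD_PROPERTY = "PXXXX"  # hypothetical placeholder for the AWLD identifier property

  # Sketch: items with more than one AWLD identifier, i.e. candidates to be
  # split into one item per dialect as described in the note above.
  QUERY = f"""
  SELECT ?item (COUNT(?id) AS ?ids) WHERE {{
    ?item wdt:{AWLD_PROPERTY} ?id .
  }}
  GROUP BY ?item
  HAVING (COUNT(?id) > 1)
  """

  response = requests.get(
      "https://query.wikidata.org/sparql",
      params={"query": QUERY, "format": "json"},
      headers={"User-Agent": "data-import-hub-sketch/0.1"},
  )
  for binding in response.json()["results"]["bindings"]:
      print(binding["item"]["value"], binding["ids"]["value"])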

Discussion

JPL Small-Body Database

  • Name of dataset: JPL Small-Body Database (SBDB)
  • Source: JPL Small-Body Database
  • Link: https://www.jpl.nasa.gov/
  • Description: A database about astronomical objects. It is maintained by the Jet Propulsion Laboratory (JPL) and NASA and provides data for all known asteroids and several comets, including orbital parameters and diagrams, physical parameters, and lists of publications related to the small body. The database is updated on a daily basis.
  • Request by: Noobius2 (talk) 20:47, 16 June 2017 (UTC)

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link:
  • Done:
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Gatehouse Gazetteer (Wales)

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link: https://docs.google.com/spreadsheets/d/1o-fZ7HbMieFEJ6vHT61Kp9e97Ix7hTl4utI8Jl0AXos/edit#gid=132680949
  • Done: Import into Mix n' Match, matched in Mix n' Match
  • To do: Quickstatements
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done: Added to Mix n' Match
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

@Richard Nevell (WMUK): I see that you have created multiple items for castles that have no statements and no sources. I have looked at two items and they seem to be duplicates of already existing items Berry's Wood earthwork (Q17647484) and Hen Domen (Q5712905). Duplicate items produce additional workload and the general expectation in Wikidata is that contributors make a reasonable effort to avoid duplicate items when mass creating new items.

I created a property proposal for an ID to reference this website at https://www.wikidata.org/wiki/Wikidata:Property_proposal/Gatehouse_Gazetteer_place_ID . Importing via mix-and-match is likely a better idea.

There's also the question of whether data besides the name/description can be imported. ChristianKl (talk) 15:44, 27 June 2017 (UTC)

Hello @ChristianKl:. The two Berry's Wood enclosures are in different countries (England and Wales) while Hen Domen, Llansantffraid Deuddwr is a different site to the Hen Domen near Montgomery. How many items have you merged so far? The items have been created ahead of matching with Mix'n'match (catalogue here); information on county, country, coordinates, and instance would be imported. Richard Nevell (WMUK) (talk) 16:14, 27 June 2017 (UTC)
Only the two. Hen Domen, Llansantffraid Deuddwr is located in the historic county of Montgomeryshire (according to http://www.gatehouse-gazetteer.info/Welshsites/664.html). What makes you think that isn't near to Montgomery? ChristianKl (talk) 16:24, 27 June 2017 (UTC)

@ChristianKl: The distance between those two sites is 20km as the crow flies. I realise that's not clear from what I added to Wikidata as there were no coordinates on the new item. Both sites are in the historic county of Montgomeryshire, but it does cover something like 2,000km2.

I've been using Mix'n'match's 'game mode' to match Wikidata items to the catalogue. The only options for entries without a match are 'new item' (which is what I've been using) and 'N/A'. Have I been using the wrong option? I understand that having Wikidata items without statements isn't particularly helpful, but it is meant to be only temporary. Richard Nevell (WMUK) (talk) 11:35, 29 June 2017 (UTC)

Hi all, creating new items without statements is just how Mix n' Match works. The statements will be added to the items once the matching has been completed; there are around 300 matches still to go. Thanks, --John Cummings (talk) 11:59, 29 June 2017 (UTC)

@ChristianKl: Is it ok to resume using Mix'n'match to create items? I reckon I could get the rest done by the end of the week so we'll be ready to import statements. Richard Nevell (WMUK) (talk) 13:15, 5 July 2017 (UTC)

Given that @ValterVB: is the person who is at the moment deleting the items, it might be better to have his opinion. Given that I created the property proposal, I can't create the property for the identifier. If another admin or property creator creates it, it would help a lot with making clear that the items are notable. ChristianKl (talk) 13:25, 5 July 2017 (UTC)
Which item? Without an example it isn't easy; probably it was an item without sources and with only a label and description. --ValterVB (talk) 20:05, 5 July 2017 (UTC)
ValterVB: Mix n' Match will create items with only a description, but more statements are then added. Is it OK to resume the matching process? I could try to go quicker so items don't stay empty for long. Richard Nevell (WMUK) (talk) 14:18, 6 July 2017 (UTC)
Do you have an example to see which statements you add? And after how long do you add the other statements? --ValterVB (talk) 17:43, 6 July 2017 (UTC)
@ValterVB: no label (Q30758975) is an item you deleted (twice) that is in the property proposal ChristianKl mentions up above. ArthurPSmith (talk) 18:06, 6 July 2017 (UTC)
@ValterVB: I think it's bad that you delete items (and especially redelete them when undeleted) without knowing what you delete. Engaging with Richard Nevell (WMUK) makes much more sense than blindly deleting his items. ChristianKl (talk) 18:13, 6 July 2017 (UTC)
I know what I delete: an item without statements, without sitelinks, without backlinks, not notable; it's in our guideline. If I find this kind of item I delete it without doubt. --ValterVB (talk) 19:44, 6 July 2017 (UTC)
@ValterVB: Items that are linked from a property proposal discussion fulfill a structural need and are thus notable. Aside from that, those castles are clearly identifiable entities that can be described with public sources and are thus also notable under criterion 3. ChristianKl (talk) 23:34, 6 July 2017 (UTC)
@ValterVB: I understand that you deleted them because they had no statements, and that's what the policy says. But if I am allowed to match the rest of the set through Mix'n'match I intend to add statements to each item (including instance and location). Would you be happy letting me try that before deleting them? Richard Nevell (WMUK) (talk) 17:44, 7 July 2017 (UTC)
@Richard Nevell (WMUK): Yesterday I asked "And after how long do you add the other statements?". @ChristianKl: If you add "public sources" that clearly identify the item we don't delete the item; nobody can force someone else to look for sources. If a user creates an item they can make a little effort and add the sources. The items are judged by the state they are in, not by the potential that they might have. --ValterVB (talk) 19:30, 7 July 2017 (UTC)
@ValterVB: The criterion is whether there are public sources that can be used to describe the item, not whether the item is already described by public sources. ChristianKl (talk) 19:35, 7 July 2017 (UTC)
OK, add a link to a public source in the item so we can check it and then not delete it. --ValterVB (talk) 20:07, 7 July 2017 (UTC)
ValterVB Does six days sound reasonable? Richard Nevell (WMUK) (talk) 12:32, 8 July 2017 (UTC)
6 days? Why so much time? There is no technical reason to wait one week. For me 48 hours is the maximum acceptable. --ValterVB (talk) 13:15, 8 July 2017 (UTC)
Addendum: If you have a list with sources I can do it with my bot: creation and addition of sources in one edit. --ValterVB (talk) 13:18, 8 July 2017 (UTC)
I imagine it could be done reasonably quickly by someone who is well versed in the process; however, I am still learning the ropes. Six days should be enough for me to complete the matching, get help with QuickStatements, and get the information imported while also accommodating other calls on my time. Richard Nevell (WMUK) (talk) 15:55, 8 July 2017 (UTC)
In 6 days you can lose the item, because something may have changed, you might win the lottery and forget Wikidata, and the process is very risky. If you use QuickStatements it is safer and more correct to add the reference right after creating the item, using the "LAST" command. --ValterVB (talk) 20:29, 10 July 2017 (UTC)
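For illustration, a sketch of the approach described above: generating QuickStatements (v1) commands that create each item and immediately attach a sourced statement via LAST, so no empty items are left behind. The input file name and column names (name, description, url) are assumptions about the catalogue export; castle (Q23413) and reference URL (P854) are used as example values.

  import csv

  # Sketch: QuickStatements v1 commands that create an item and immediately add
  # a sourced statement using LAST. Assumed input columns: name, description, url.
  with open("gatehouse_unmatched.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          print("CREATE")
          print(f'LAST\tLen\t"{row["name"]}"')
          print(f'LAST\tDen\t"{row["description"]}"')
          # instance of (P31) castle (Q23413), referenced with reference URL (P854)
          print(f'LAST\tP31\tQ23413\tS854\t"{row["url"]}"')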

ValterVB (talk · contribs · logs), Richard Nevell (WMUK), I think there is a misunderstanding: creating empty items is how Mix n' Match works. If you delete empty items created by Mix n' Match you are breaking the data import process for the catalogues. There are currently hundreds of catalogues being imported using this tool, some of which take several months to work through in order to match correctly to existing data. If the policy is incompatible with one of the main data import methods for Wikidata then I suggest we have a larger problem... --John Cummings (talk) 15:12, 30 August 2017 (UTC)

Then we have a big problem. --ValterVB (talk) 19:18, 30 August 2017 (UTC)
If the intention is to add further statements to an item, what is the harm? Richard Nevell (WMUK) (talk) 10:30, 31 August 2017 (UTC)
Creating items and leaving them blank for a while is a problem, mainly because in the meantime someone might try to match the same concept to Wikidata and not detect your item (because it is blank and therefore hard to find). So they might create their own, and we end up with duplicates. − Pintoch (talk) 12:21, 31 August 2017 (UTC)
Granted, it's possible but with a suitably short time period the likelihood of this is small. Richard Nevell (WMUK) (talk) 14:42, 31 August 2017 (UTC)
The problem is: given how Mix and Match currently works, this time period can be rather long. Also, it is totally possible for someone to dump a dataset in Mix'n'Match, start matching some of it, and get bored at some point: in this case, the created items will remain empty forever… − Pintoch (talk) 14:00, 7 September 2017 (UTC)

Hi ValterVB (talk · contribs · logs) and Pintoch (talk · contribs · logs), can I suggest we start a discussion on the main project chat to discuss this possible incompatibility between the main import tool and Wikidata policy? There are tens of Mix n' Match catalogues being processed at the moment; it does not seem realistic or practical to stop using the tool whilst this is discussed. Thanks, --John Cummings (talk) 15:13, 4 September 2017 (UTC)

Totally! By the way, I am also working on an alternative to Mix'n'Match: OpenRefine. − Pintoch (talk) 14:00, 7 September 2017 (UTC)
@ValterVB: please can you undelete all the items created by @Richard Nevell (WMUK): and myself as soon as possible? We are trying to import data into all the items but it's breaking because you deleted some of them. We can populate the items quickly after you undelete them. Thanks, --John Cummings (talk) 14:06, 7 September 2017 (UTC)
ValterVB (talk · contribs · logs), don't worry about undeleting these items. I'm going to recreate them now, along with some basic statements. Best, NavinoEvans (talk) 10:17, 8 September 2017 (UTC)

Protected Planet Sites in Niger

Workflow

Description of dataset
  • Name:
  • Source:
  • Link:
  • Description:
Create and import data into spreadsheet
  • Link: https://docs.google.com/spreadsheets/d/1tlqH0TggjqYL-nv2VWKsSYLRK9IQr4gC3jJtvnxwfBw/edit#gid=998871309
  • Done: 25%
  • To do: 75%
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Mix n' Match catalogue: https://tools.wmflabs.org/mix-n-match/#/catalog/483
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Localisation and Information about all Fountains in the City of Zurich

  • Name of dataset: Brunnen der Stadt Zürich (Fountains in the City of Zurich)
  • Source: Open-Data-Catalog of the City of Zürich
  • Link: https://data.stadt-zuerich.ch/dataset/brunnen
  • Description: This geodataset shows the locations of the ~1280 fountains which are maintained by the Water Supply Department of the City of Zurich (Wasserversorgung Stadt Zürich). The geodataset contains interesting attributes like the historical year of construction, the description of the fountain, the kind of water it contains and what kind of fountain it is. The dataset is under a CC0 license and can be used freely.
  • Request by: Marco Sieber, Open-Data-Zürich-Team, Stadt Zürich

Workflow

Description of dataset
  • Name: Brunnen
  • Source: Open-Data-Catalog City of Zurich
  • Link: https://data.stadt-zuerich.ch/dataset/brunnen
  • Description: This geodataset - available as GeoJSON, Geopackage, KML, Shapefile, Web Map Service and Web Feature Service - shows the locations of the ~1280 fountains which are maintained by the Water Supply Department of the City of Zurich (Wasserversorgung Stadt Zürich). The geodataset contains interesting attributes like the historical year of construction, the description of the fountain, the kind of water it contains and what kind of fountain it is.
Create and import data into spreadsheet
  • Link: https://github.com/opendata-zurich/wikidata/blob/master/fountains/20170905_brunnen_zuerich.xlsx
  • Done: Manually converted from GeoJSON to Excel by the author.
  • To do:
  • Notes: The conversion from GeoJSON to spreadsheet is not automated yet (a conversion sketch follows this table).
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:
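Since the GeoJSON-to-spreadsheet conversion noted above is not yet automated, here is a minimal sketch of one way to do it with the Python standard library. The input file name and the exact property keys in the GeoJSON are assumptions; the script simply flattens every feature's properties plus its point coordinates into CSV columns.

  import csv
  import json

  # Sketch: flatten a GeoJSON FeatureCollection of fountains into a CSV file.
  # "brunnen.geojson" is an assumed local copy of the dataset linked above.
  with open("brunnen.geojson", encoding="utf-8") as f:
      features = json.load(f)["features"]

  # Collect every property key that occurs, so the CSV header covers them all.
  fieldnames = sorted({key for feat in features for key in feat["properties"]})
  fieldnames += ["longitude", "latitude"]

  with open("brunnen.csv", "w", newline="", encoding="utf-8") as out:
      writer = csv.DictWriter(out, fieldnames=fieldnames)
      writer.writeheader()
      for feat in features:
          row = dict(feat["properties"])
          # GeoJSON point coordinates are ordered [longitude, latitude]
          row["longitude"], row["latitude"] = feat["geometry"]["coordinates"][:2]
          writer.writerow(row)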

Discussion

Population by Stadtquartier since 1970 in the City of Zurich

  • Name of dataset: Population by Stadtquartier in the City of Zurich since 1970
  • Source: Open-Data-Catalog of the City of Zurich
  • Link: https://data.stadt-zuerich.ch/dataset/bev-bestand-jahr-quartier-seit1970
  • Description: This dataset contains the population of the City of Zurich since 1970, per statistical Stadtquartier (~district). Data owner: Statistik Stadt Zürich
  • Request by: Marco Sieber, Open-Data-Zürich-Team, Stadt Zürich

Workflow

Description of dataset
  • Name: Bevölkerung nach Stadtquartier, seit 1970 (Resident population per district since 1970)
  • Source: Data owner: Statistik Stadt Zürich. Source of this file: Open-Data-Catalog of the City of Zurich
  • Link: https://data.stadt-zuerich.ch/dataset/bev-bestand-jahr-quartier-seit1970
  • Description: Attributes:
    • Ereignisjahr (technical name: StichtagDatJahr): time stamp for which the population figure is representative, usually 31.12. of the year.
    • Stadtquartier (Sort) (technical name: QuarSort): official ID of the district called «Statistisches Stadtquartier» (integer).
    • Stadtquartier (lang) (technical name: QuarLang): official name of the district called «Statistisches Stadtquartier» (string).
    • Wirtschaftliche Bevölkerung (technical name: AnzBestWir): size of the resident population (wirtschaftlich anwesende Personen, integer). This follows the definition of «resident population»[3], which is different from the «permanent resident population»[4]; the Federal Statistical Office publishes data for the latter.
Create and import data into spreadsheet
  • Link: https://github.com/opendata-zurich/wikidata/blob/master/population_quartiere_since1970/bev324od3240.xlsx
  • Done: CSV converted to Excel by the author.
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:

Discussion

Public Art on public ground in the City of Zurich

  • Name of dataset: Kunst im Stadtraum (KiS) / Public Art on public ground in the City of Zurich
  • Source: Open-Data-Catalog of the City of Zürich
  • Link: https://data.stadt-zuerich.ch/dataset/kunst-im-stadtraum
  • Description: This dataset is a collection of public art objects which are in the possession of the City of Zurich and stand on public ground. The information in this dataset comes from the responsible departments «Kunst im öffentlichen Raum» and «Kunst und Bau». It contains basic information about these objects and the artists who created them. All objects are georeferenced as well.
  • Request by: Marco Sieber, Open-Data-Zürich-Team, Stadt Zürich

Workflow

Description of dataset
  • Name: Kunst im Stadtraum
  • Source: Open-Data-Catalog City of Zurich
  • Link: https://data.stadt-zuerich.ch/dataset/kunst-im-stadtraum
  • Description: Attributes (see the sketch after this table for how the coordinates could be used):
    • Titel: title or description of the piece of art. If the official title is not known, there is a description within brackets [].
    • Künstler_IN: artist.
    • Datierung: date of creation of the piece of art.
    • Gattung: type of art (e.g. fountain, installation, architectural sculpture, etc.).
    • Material_Technik: material or technique used.
    • Standort: description of where the object can be found.
    • ID: ID used for the objects; it is supposed to be stable.
    • Easting_WGS: longitude value in WGS84.
    • Northing_WGS: latitude value in WGS84.
Create and import data into spreadsheet
  • Link: https://github.com/opendata-zurich/wikidata/blob/master/public_art/kunstimstadtraum.xlsx
  • Done: Marco Sieber
  • To do:
  • Notes:
Structure of data within Wikidata
  • Structure:
  • Example item:
  • Done:
  • To do:
Format the data to be imported
  • Done:
  • To do:
  • Notes:
Match the data to existing data
  • Done:
  • To do:
  • Notes:
Importing data into Wikidata
  • Done:
  • To do:
  • Notes:
Date import complete and notes
  • Date complete:
  • Notes:
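A sketch of how the georeferenced KiS rows could be expressed as QuickStatements once each object has been matched or created: the two coordinate columns become a coordinate location (P625) value in the @LAT/LON form QuickStatements expects. The CSV export file name and the qid column are assumptions for illustration; which properties to use for the other attributes still needs to be decided in the structure step above.

  import csv

  # Sketch: emit coordinate location (P625) statements for matched art objects.
  # Assumed input: a CSV export of kunstimstadtraum.xlsx plus a qid column
  # added during matching (hypothetical layout).
  with open("kunstimstadtraum_matched.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          lat = float(row["Northing_WGS"])   # latitude in WGS84
          lon = float(row["Easting_WGS"])    # longitude in WGS84
          print(f"{row['qid']}\tP625\t@{lat}/{lon}")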

Discussion