Wikidata:Dataset Imports/The Database of British and Irish Hills
Guidelines for using this page
Documenting the import
- Guidelines on how to import a dataset into Wikidata are available at Wikidata:Data Import Guide.
- Please include notes on all steps of the process.
- Once a dataset has been imported into Wikidata, please edit the page to change the progress status from in progress to complete.
- It is strongly recommended to use Visual Editor when making changes to this page, particularly for editing any of the tables.
Creating a Wikidata item for the dataset
- Please create a Wikidata item for the dataset. This will allow us to improve the coverage of datasets on Wikidata and to understand which datasets are available on that topic and which of them have been added to Wikidata.
- If you are working with a very large dataset, you can break it into smaller Mix n' Match catalogues, but only create one Wikidata item.
- Link the dataset Wikidata item to this page using Wikidata Dataset Imports URL (P5195).
Getting help
- If your dataset import runs into issues, please edit the page to change the progress status from in progress to help needed.
- You can ask for help on Wikidata:Project chat.
Overview
Dataset name
The Database of British and Irish Hills
Source
The editorial team of the Database of British and Irish Hills. Additionally, acknowledgements are listed here: http://www.hills-database.co.uk/database_notes.html#acknowledge
Link
http://www.hills-database.co.uk/downloads.html
Dataset description
List of British and Irish hills and up-to-date information on their location, height and various classifications
Additional information
Available as a CSV file, a spreadsheet or an Access database.
Progress of import
The table below is used to track the progress of importing this dataset. The suggested column headings are most applicable to data being imported from a spreadsheet - you can change some column headings or add new columns as required to best describe the progress of this import.
Wikidata item for the dataset | Import data into spreadsheet | Match the dataset to Wikidata | Importing data into Wikidata | Visualisations | Maintenance queries and expected results |
---|---|---|---|---|---|
The Database of British and Irish Hills (Q61667995) | Provided | Mix'n'match catalog: https://tools.wmflabs.org/mix-n-match/#/catalog/2218 | Workflow established and import in progress as more data is matched | Not done yet | Not done yet |
Edit history
Use the table below to list batches of edits that have been completed for this dataset. Ideally each entry should have all applicable columns filled out, but at a minimum please make sure to add a date and description to give an idea of what was added to Wikidata and when.
Date | Description | Method | Properties | Notes | Statements added | Statements removed |
---|---|---|---|---|---|---|
2019-03-10 | Imported Munro classifications | quickstatements | instance of (P31) | Munro (Q1320721) | 282 | 0 |
2019-03-10 | Imported country for matched hills strictly in Scotland | quickstatements | country (P17) | | 289 | 0 |
2019-03-11 | Import county for matched hills, not handling cases where multiple counties given | quickstatements | located in the administrative territorial entity (P131) | | 254 | 0 |
2019-03-11 | Import coordinates for matched hills | quickstatements | coordinate location (P625) | | 286 | 0 |
2019-03-28 | Import missing hill names for matched hills | quickstatements | Item Label | | 36 | 0 |
2019-03-30 | Generate missing descriptions for matched hills | quickstatements | Item Description | | 36 | 0 |
2019-03-30 | Import missing aliases for matched hills | quickstatements | Item Alternate Labels | | 24 | 0 |
2019-03-31 | Import missing counties for matched hills, now handling cases where hills cross borders | quickstatements | located in the administrative territorial entity (P131) | Script is now much smarter and only outputs missing statements (now using a SPARQL query to get matched data as opposed to the mix n' match catalogue). For hills with multiple values in the county column it makes sense to add all values as these are hills which cross borders. | 109 | 0 |
2019-03-31 | Import missing classifications for matched hills | quickstatements | instance of (P31) | | 1497 | 0 |
2019-03-31 | Import 10 digit grid references for matched hills | quickstatements | OS grid reference (P613) | Realised that Irish hills are listed with Irish Grid Reference (P4091) so one erroneous statement was added to Knockaunapeebra (Q26717518) (thankfully the only currently matched Irish hill). Fixed by hand and have updated my script to support this. | 343 | 0 |
Discussion of import
Original spreadsheet data corresponding to Wikidata
Column Title | Wikidata property | Notes |
---|---|---|
Number | DoBIH Number (P6515) | Identifier |
Name | Item Label | Some hills have aliases within square brackets. Summits appear in the form <hill name> - <summit name>. Many hills share the same name, so matching to Wikidata can be tricky. |
County | located in the administrative territorial entity (P131) | For Scottish hills these are instances of Scottish council area (Q15060255); the other countries still need to be checked. |
Meters | elevation above sea level (P2044) | In meters. The "Feet" column is just a conversion of these values, so there is no point in importing it as well. |
Grid ref 10 | OS grid reference (P613) or Irish Grid Reference (P4091) | Spaces should be removed. |
Drop | topographic prominence (P2660) | in meters |
Latitude | coordinate location (P625) (partial) | |
Longitude | coordinate location (P625) (partial) | |
<classification code> | instance of (P31) | Boolean represented by 1/0. For all available classification codes see http://www.hills-database.co.uk/database_notes.html#classification |
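The mapping above can be sketched in Python as follows. This is only an illustration, not the script used for the import. It assumes the QuickStatements V1 tab-separated command syntax (the import itself pasted CSV files), that heights and drops are emitted with the unit metre (Q11573), and that Irish Grid references can be recognised by their single leading letter, while OS grid references have two. The function names and the sample row are hypothetical.

```python
import re

METRE = "Q11573"  # Wikidata item for the unit "metre"


def grid_ref_property(grid_ref):
    """Choose P4091 (Irish Grid Reference) or P613 (OS grid reference).

    Assumption: Irish Grid references start with one letter,
    OS grid references start with two.
    """
    letters = re.match(r"[A-Za-z]+", grid_ref.strip())
    if letters is None:
        raise ValueError(f"not a grid reference: {grid_ref!r}")
    return "P4091" if len(letters.group()) == 1 else "P613"


def row_to_commands(qid, row):
    """Turn one DoBIH row (a dict keyed by column title) into V1 commands."""
    # The mapping table notes that spaces should be removed from grid refs.
    grid_ref = row["Grid ref 10"].replace(" ", "")
    return [
        f'{qid}\t{grid_ref_property(grid_ref)}\t"{grid_ref}"',
        f'{qid}\tP2044\t{row["Meters"]}U{METRE}',
        f'{qid}\tP2660\t{row["Drop"]}U{METRE}',
        # Latitude and Longitude are combined into one P625 value.
        f'{qid}\tP625\t@{row["Latitude"]}/{row["Longitude"]}',
    ]


if __name__ == "__main__":
    # Illustrative values only; Q4115189 is the Wikidata sandbox item.
    sample = {
        "Grid ref 10": "NN 12345 67890",
        "Meters": "1000.0",
        "Drop": "150.0",
        "Latitude": "56.5",
        "Longitude": "-5.0",
    }
    print("\n".join(row_to_commands("Q4115189", sample)))
```

Choosing the grid reference property from the reference itself avoids needing a separate country lookup, but it relies on the single-letter assumption stated above holding for every row.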
Match the dataset to Wikidata
Currently working my way through the Munro (Q1320721) items (282/282) --SilentSpike (talk) 21:32, 20 February 2019 (UTC)
Importing data into Wikidata
I'm using a Python script to convert the DoBIH CSV file into quickstatements. My process is listed below --SilentSpike (talk) 16:53, 10 March 2019 (UTC)
- Download the DoBIH CSV file and place it into a new directory.
- Run the following query and download the results as a CSV file, query.csv, into the same directory:
    SELECT ?id ?item ?itemLabel ?itemDescription ?itemAltLabel
      (group_concat(distinct SUBSTR(STR(?class), 32)) as ?class)
      (group_concat(distinct SUBSTR(STR(?county), 32)) as ?county)
      (group_concat(distinct ?grid_ref) as ?grid_ref)
      (group_concat(distinct ?coords) as ?coords)
      (group_concat(distinct ?esl) as ?esl)
      (group_concat(distinct ?drop) as ?drop)
      (group_concat(distinct ?ie_grid_ref) as ?ie_grid_ref)
    WHERE {
      ?item wdt:P6515 ?id .
      OPTIONAL { ?item wdt:P31 ?class } .
      OPTIONAL { ?item wdt:P131 ?county } .
      OPTIONAL { ?item wdt:P613 ?grid_ref } .
      OPTIONAL { ?item wdt:P625 ?coords } .
      OPTIONAL { ?item wdt:P2044 ?esl } .
      OPTIONAL { ?item wdt:P2660 ?drop } .
      OPTIONAL { ?item wdt:P4091 ?ie_grid_ref } .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en,en" }
    }
    GROUP BY ?id ?item ?itemLabel ?itemDescription ?itemAltLabel
- Run the Python script (hosted on gist here) in said directory. If the DoBIH releases a new version, the CSV filename will need to be changed in the script.
- Paste the resulting output CSV files into quickstatements to update the respective properties/labelling.
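As a rough illustration of the "only output missing statements" step, the sketch below compares the County column of the DoBIH file with the counties already present in query.csv. It is not the gist script. It assumes that query.csv holds full entity URIs in its item column and space-separated QIDs in its county column (the default group_concat separator), that multiple counties in the DoBIH file are separated by "/", and that a name-to-QID lookup (the hypothetical county_qids dict) is available. Output uses the QuickStatements V1 tab-separated syntax.

```python
import csv
import io


def load_existing(query_csv_text):
    """Map DoBIH number -> (QID, set of county QIDs already on the item)."""
    existing = {}
    for row in csv.DictReader(io.StringIO(query_csv_text)):
        # The item column is assumed to hold a full entity URI.
        qid = row["item"].rsplit("/", 1)[-1]
        existing[row["id"]] = (qid, set(row["county"].split()))
    return existing


def missing_county_commands(dobih_rows, existing, county_qids):
    """Return P131 commands for counties not yet recorded on matched items."""
    commands = []
    for row in dobih_rows:
        match = existing.get(row["Number"])
        if match is None:
            continue  # hill not matched to a Wikidata item yet
        qid, already_present = match
        # Assumed separator: a hill crossing borders lists counties as "A/B".
        for name in row["County"].split("/"):
            county = county_qids.get(name.strip())
            if county is not None and county not in already_present:
                commands.append(f"{qid}\tP131\t{county}")
    return commands


if __name__ == "__main__":
    # All QIDs and county names below are placeholders.
    query_csv = (
        "id,item,county\n"
        "278,http://www.wikidata.org/entity/Q4115189,Q900001\n"
    )
    rows = [{"Number": "278", "County": "County A/County B"}]
    lookup = {"County A": "Q900001", "County B": "Q900002"}
    for command in missing_county_commands(rows, load_existing(query_csv), lookup):
        print(command)
```

Working from the query results rather than the mix n' match catalogue means the comparison always reflects what is currently on the items, so re-running the process after a partial import should not produce duplicate statements.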
Import completion notes
Visualisations
Maintenance
Queries and expected results
Query | Description | Expected results |
---|---|---|
Link | Count Munro instances in Wikidata | There should be 282 (this may possibly change in future, but it is unlikely) |
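A check like the one in the table could be automated along these lines. This is a sketch under assumptions: the SPARQL text is a guess at what the linked query does (counting items that are an instance of Munro (Q1320721)), and the helper names are hypothetical. Only the URL construction and response parsing are shown, so nothing here contacts the query service.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed form of the maintenance query; the linked query may differ.
MUNRO_COUNT_QUERY = (
    "SELECT (COUNT(DISTINCT ?item) AS ?count) "
    "WHERE { ?item wdt:P31 wd:Q1320721 }"
)

EXPECTED_MUNROS = 282  # expected result stated in the table above


def query_url(query):
    """Build a GET URL asking the query service for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def count_from_response(body):
    """Extract the ?count value from a SPARQL JSON results document."""
    bindings = json.loads(body)["results"]["bindings"]
    return int(bindings[0]["count"]["value"])


def check_munro_count(body):
    """Return True when the response reports the expected number of Munros."""
    return count_from_response(body) == EXPECTED_MUNROS
```

The response body would come from fetching query_url(MUNRO_COUNT_QUERY) with any HTTP client; keeping that step separate makes the parsing easy to test against a saved response.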