Wikidata:Dataset Imports/People in the Ísmús database

Guidelines for using this page

  • It is strongly recommended to use Visual Editor when making changes to this page, particularly for editing any of the tables.
  • Guidelines on how to import a dataset into Wikidata are available at Wikidata:Data Import Guide.
  • Once a dataset has been imported into Wikidata please edit the page to change the progress status from in progress to complete.
  • If your dataset import runs into issues please edit the page to change the progress status from in progress to help needed.

Overview

Dataset name

People in the Ísmús database

Source

Ísmús

Link

https://www.ismus.is/l/person

Dataset description

Just over 10,000 people are listed on the Ísmús website, an Icelandic music and cultural heritage database. The following data is available for each person:

  • Ísmús unique URL to description page
  • Icelandic name
  • Icelandic aliases
  • Date of birth
  • Date of death
  • Gender
  • Occupations

Dataset Wikidata item

people listed in the Ísmús database (Q63565750)

Additional information

There were 10,003 people in the database at the time of the first import.

Progress of import

The initial focus was adding the missing women from the dataset, so I've split the progress section to track each gender separately.

| Gender segment | Import data into spreadsheet | Format the spreadsheet to import the data | Structure of data within Wikidata | Match the dataset to Wikidata | Importing data into Wikidata | Visualisations | Maintenance queries and expected results |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Female | ✓ Done (Link) | ✓ Done | ✓ Done | ✓ Done | Doing… | Not done | Not done |
| Male | Not done | Not done | ✓ Done | Not done | Not done | Not done | Not done |

Edit history

The edit history so far, as well as planned upcoming edits, is shown below:

| Date | Segment | Description | Method | Properties | Qualifiers | References | Items created | Statements added | Statements removed | Link to import sheet |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 Mar 2019 | Female | Created missing items and added URL for each person | QuickStatements | described at URL (P973) | | | 1924 | 1924 | 0 | Link |
| 5 Mar 2019 | Female | Add Icelandic labels | QuickStatements | label | | | | 1924 | 0 | Link |
| 5 Mar 2019 | Female | Add instance of human statements | QuickStatements | instance of (P31) = human (Q5) | | reference URL (P854) | | 1976 | 0 | Link |
| 5 Mar 2019 | Female | Add gender statements | QuickStatements | sex or gender (P21) = female (Q6581072) | | reference URL (P854) | | 1976 | 0 | Link |
| 5 Mar 2019 | Female | Add date of birth statements | QuickStatements | date of birth (P569) | | reference URL (P854) | | 1860 | 0 | Link |
| 6 Mar 2019 | Female | Add country of citizenship statements | QuickStatements | country of citizenship (P27) = Iceland (Q189) | | reference URL (P854) | | 1976 | 0 | Link |
| 4 Nov 2019 | Female | Add occupations | QuickStatements | occupation (P106) | | reference URL (P854) | | 2265 | 0 | Link |
| Not done | Female | Add date of death statements | QuickStatements | | | | | | | |
| Not done | Female | Add Icelandic Aliases | QuickStatements | | | | | | | |
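
As a rough illustration of how rows like those in the table above could be turned into QuickStatements input, the sketch below builds version 1 (tab-separated) commands for one new person. It is not the import sheet that was actually used: the function name, the record layout and the example URL are hypothetical, and only the property and item IDs are taken from the table. The gender value is hard-coded to female to mirror the segment imported so far.

```python
# Sketch only: builds QuickStatements V1 (tab-separated) commands for one person.
# The function name and argument layout are assumptions, not the real import sheet.

def quickstatements_for_new_person(url, label_is, birth_date=None):
    """Return V1 command lines that create an item and add the basic statements."""
    ref = f'S854\t"{url}"'  # reference URL (P854), given as a source
    lines = [
        "CREATE",
        f'LAST\tLis\t"{label_is}"',      # Icelandic label
        f'LAST\tP973\t"{url}"',          # described at URL
        f"LAST\tP31\tQ5\t{ref}",         # instance of = human
        f"LAST\tP21\tQ6581072\t{ref}",   # sex or gender = female
        f"LAST\tP27\tQ189\t{ref}",       # country of citizenship = Iceland
    ]
    if birth_date:
        # birth_date is an ISO date such as "1901-02-03"; /11 marks day precision
        lines.append(f"LAST\tP569\t+{birth_date}T00:00:00Z/11\t{ref}")
    return lines


# Hypothetical record; the URL is a placeholder, not a real Ísmús page.
example = quickstatements_for_new_person(
    "https://www.ismus.is/i/person/uid-EXAMPLE", "Jóna Jónsdóttir", "1901-02-03"
)
print("\n".join(example))
```

For people who already have an item, the same statements would be written with the item's QID in place of `LAST` and without the `CREATE` line.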

Discussion of import

These headings are generally useful; please change this section to suit your needs.

Format the spreadsheet to import the data

The simplest method for extracting data from the website is to filter the view to the required data, then copy and paste it into a Google Sheet.

When you do this, you need to extract the link for each item from the 'embedded HTML' links that end up in your spreadsheet. This can be achieved with this online solution (scroll down to the solution at the end of the thread). The basic steps are:

  1. Publish the Google Sheet to the web
  2. Import the data back using IMPORTXML

Note that most solutions you find online only work when the hyperlink() formula has been used, but this solution works with embedded HTML links as well.

NavinoEvans (talk) 18:10, 7 May 2019 (UTC)
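
The same link extraction can also be done outside the spreadsheet. The sketch below is an alternative to the IMPORTXML approach described above, not a record of it: it assumes the published sheet (or the copied table) has been saved as HTML, and it uses Python's standard `html.parser` module to collect each link's text and target. The class name, function name and sample markup are illustrative.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (text, href) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) pairs found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Placeholder markup standing in for one cell of the published sheet.
sample = (
    '<table><tr><td>'
    '<a href="https://www.ismus.is/i/person/uid-EXAMPLE">Jóna Jónsdóttir</a>'
    '</td></tr></table>'
)
print(extract_links(sample))
```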

Match the dataset to Wikidata

Auto-matching was done in the spreadsheet using an existing list of all Icelandic people on Wikidata.
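
The matching rule used in the spreadsheet is not recorded here. As a hedged sketch of one way such auto-matching could work, the Python below pairs dataset rows with existing Wikidata items when the name (compared case-insensitively) and the birth year both agree, and leaves rows with no candidate or with several candidates unmatched for manual review. The field names and the sample identifiers are placeholders.

```python
def match_people(dataset_rows, wikidata_rows):
    """Map each dataset URL to a single matching item ID, or None if unsure."""
    # Index existing Wikidata items by (normalised name, birth year).
    index = {}
    for item in wikidata_rows:
        key = (item["name"].casefold().strip(), item["birth_year"])
        index.setdefault(key, []).append(item["qid"])

    matches = {}
    for row in dataset_rows:
        key = (row["name"].casefold().strip(), row["birth_year"])
        candidates = index.get(key, [])
        # Accept only unambiguous matches; everything else needs a human check.
        matches[row["url"]] = candidates[0] if len(candidates) == 1 else None
    return matches


# Placeholder data; the item IDs and URLs are not real.
wikidata_people = [
    {"qid": "Q_EXAMPLE_1", "name": "Jóna Jónsdóttir", "birth_year": 1901},
    {"qid": "Q_EXAMPLE_2", "name": "Anna Sigurðardóttir", "birth_year": 1920},
]
ismus_people = [
    {"url": "uid-EXAMPLE-A", "name": "jóna jónsdóttir", "birth_year": 1901},
    {"url": "uid-EXAMPLE-B", "name": "Guðrún Helgadóttir", "birth_year": 1935},
]
print(match_people(ismus_people, wikidata_people))
```

Rows that come back as `None` are the candidates for new items, after a manual check for people already on Wikidata under a different spelling.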

Visualisations

Not done yet

Maintenance

Queries and expected results

| Query link | Description | Expected results |
| --- | --- | --- |
| All People with Described at URL starting with "https://www.ismus.is/i/person/" | Shows all people imported so far, along with the link back to their page on the data source. This property is the main link used for matching | All URLs should be unique. Should be around 10,000 or fewer results |
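
The original query link is not reproduced on this page, so the following is a reconstruction under assumptions rather than the query itself. The sketch keeps a SPARQL query of the kind described (items whose described at URL (P973) value starts with the Ísmús person prefix) as a Python string, together with a small helper that checks the first expected result, URL uniqueness, on rows that have already been fetched. The names `ISMUS_QUERY` and `find_duplicate_urls` are illustrative, and no network call is made.

```python
from collections import defaultdict

# Assumed form of the maintenance query, for use on the Wikidata Query Service.
ISMUS_QUERY = """
SELECT ?person ?url WHERE {
  ?person wdt:P973 ?url .
  FILTER(STRSTARTS(STR(?url), "https://www.ismus.is/i/person/"))
}
"""


def find_duplicate_urls(rows):
    """Given (item_id, url) pairs, return {url: [item_ids]} for URLs used more than once."""
    by_url = defaultdict(list)
    for item_id, url in rows:
        by_url[url].append(item_id)
    return {url: ids for url, ids in by_url.items() if len(ids) > 1}


# Placeholder rows standing in for query results; the IDs and URLs are not real.
rows = [
    ("Q_EXAMPLE_1", "https://www.ismus.is/i/person/uid-EXAMPLE-A"),
    ("Q_EXAMPLE_2", "https://www.ismus.is/i/person/uid-EXAMPLE-B"),
    ("Q_EXAMPLE_3", "https://www.ismus.is/i/person/uid-EXAMPLE-A"),
]
print(find_duplicate_urls(rows))
print(len(rows))  # compare against the expected total of around 10,000 or fewer
```

An empty result from `find_duplicate_urls` corresponds to the "all URLs should be unique" expectation; any URL it reports points to items that may need merging.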

Schedule of new data released

There is no set schedule, but new data is expected to be added periodically.

Exclusions

One item in the dataset is a test page and should not be imported: https://www.ismus.is/i/person/uid-ada4f2b8-b7fe-41bf-9454-b38aa4cc6cce