Help:Add main subject with Mix-n-Match

From Wikidata

Test

This page describes how to add main subject (P921) statements with Mix'n'match (Q28054658) (MxM).

Wikidata includes many items describing works, in particular biographical articles. Such an item would normally include a main subject (P921) statement pointing to the item about the person. Sample: Baker, William (Q19061914) has a statement at Q19061914#P921 with William Baker (Q15433209) as value.

The steps below outline how to add these statements with Mix'n'match.

Samples below that use catalogue 3461 or property P2536 might no longer work, as most steps have been completed. Substitute whatever property/catalogue you are using.

Create

Select a series of items without main subject

Sample query:

SELECT ?item ?itemLabel ?itemDescription
WHERE
{ 
	?item wdt:P136 wd:Q309481 ; wdt:P31 wd:Q13442814 .
	?item rdfs:label ?l . 
	FILTER ( lang ( ?l) = "en" && REGEX( ?l, "[12]\\d{3}.+[12]\\d{3}")  )
	MINUS { ?item wdt:P921 [] }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }        
}
LIMIT 100

Create a list of items

Required columns:

  • QID
  • label
  • description, including year of birth and year of death (YOB/YOD)

Sample list:

Q28188074	Arthur C. Guyton	(1919-2003)
Q48041578	Arthur C. Upton	(1923-2015)
Q43731458	Arthur Cherkin	(1913-1987)
Q52406484	Arthur Keller	(1868-1934)
Q46360334	Arthur S. Keats	(1923-2007)

Make sure your labels and descriptions are suitable for creating new items.
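
As a rough sketch of this step, the Python snippet below turns downloaded query results into the three tab-separated columns. It assumes the results are in the standard SPARQL JSON results format and that each label ends with the years, as in the hypothetical label "Arthur C. Guyton (1919-2003)". The names `bindings_to_rows` and `rows_to_tsv` and the label pattern are illustrative assumptions, not part of MxM.

```python
import re

# Assumed label layout: "<name> (<YOB>-<YOD>)". Adjust the pattern to your data.
LABEL_PATTERN = re.compile(
    r"^(?P<name>.+?)\s*\(?(?P<yob>[12]\d{3})\s*-\s*(?P<yod>[12]\d{3})\)?\s*$"
)


def bindings_to_rows(bindings):
    """Convert SPARQL JSON bindings into (QID, label, description) rows."""
    rows = []
    for binding in bindings:
        # The item URI ends with the QID, e.g. .../entity/Q28188074
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        match = LABEL_PATTERN.match(binding["itemLabel"]["value"])
        if match is None:
            continue  # skip labels without recognisable years
        description = f"({match.group('yob')}-{match.group('yod')})"
        rows.append((qid, match.group("name"), description))
    return rows


def rows_to_tsv(rows):
    """Join the rows into tab-separated lines, one item per line."""
    return "\n".join("\t".join(row) for row in rows)
```

Rows whose label does not match the assumed pattern are skipped rather than guessed at, so they can be reviewed by hand before upload.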

Upload the list to MxM

Use the import page at https://tools.wmflabs.org/mix-n-match/import.php

Add auxiliary data

Ask Magnus to run the YOB/YOD auxiliary data creator: User talk:Magnus Manske

Match

Wait until it matches

Look for "Automatic name/date matcher"

Gradually add main subject (P921)

Run "Manual sync catalogue"

Add P921

  • Sample query:
SELECT ?item ?itemLabel ?itemDescription ?obit ?obitLabel ?obitDescription
WHERE
{
	?item wdt:P2536 ?value .
	BIND( URI ( CONCAT("http://www.wikidata.org/entity/", ?value)) as ?obit) 
	MINUS { ?obit wdt:P921 ?item } 
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en"  }    
}
LIMIT 100
  • Upload them, e.g. with QuickStatements
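
One possible way to prepare the upload is sketched below in Python. It assumes the query results were saved in the standard SPARQL JSON results format and builds tab-separated lines in the QuickStatements V1 style (item, property, value). The function name `p921_commands` is a hypothetical helper, and the example pair is the one from the introduction.

```python
import re

QID = re.compile(r"Q\d+")


def p921_commands(bindings):
    """Build one 'work <TAB> P921 <TAB> person' line per result row."""
    lines = []
    for binding in bindings:
        person = binding["item"]["value"].rsplit("/", 1)[-1]
        work = binding["obit"]["value"].rsplit("/", 1)[-1]
        # ?obit is built from a string value, so check that it looks like a QID
        if QID.fullmatch(person) and QID.fullmatch(work):
            lines.append(f"{work}\tP921\t{person}")
    return "\n".join(lines)


example = [{
    "item": {"value": "http://www.wikidata.org/entity/Q15433209"},
    "obit": {"value": "http://www.wikidata.org/entity/Q19061914"},
}]
print(p921_commands(example))
```

Rows whose value is not a well-formed QID are dropped, so a malformed identifier cannot end up in the batch.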

Match remaining items manually

Several options:

Note that "Automatched" (or "preliminarily matched") includes three types of entries:

  • 1. entries where the Wikidata item has no dates.
TODO: double-check, then confirm or dematch/create
  • 2. entries where the Wikidata item has similar dates (maybe off by one year, or YOD missing).
TODO: double-check, then confirm or dematch/create
  • 3. entries where the Wikidata item has different dates (years off by > 10, different YOB and YOD).
TODO: dematch/create
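
The three types above could be told apart mechanically before reviewing. The Python sketch below is one hedged reading of them: the thresholds (`near=1`, `far=10`), the function name `classify_automatch`, and the decision to treat any single year that is off by more than 10 as type 3 are assumptions. Cases that fit none of the types return `None` and need a manual look.

```python
def classify_automatch(entry_yob, entry_yod, item_yob, item_yod, near=1, far=10):
    """Return 1, 2 or 3 for the types listed above, or None if unclear.

    entry_* are the years from the catalogue entry, item_* the years on the
    matched Wikidata item; a missing year is passed as None.
    """
    if item_yob is None and item_yod is None:
        return 1  # the Wikidata item has no dates at all

    # Compare only the years that are known on both sides
    diffs = [
        abs(entry_year - item_year)
        for entry_year, item_year in ((entry_yob, item_yob), (entry_yod, item_yod))
        if entry_year is not None and item_year is not None
    ]
    if not diffs:
        return None  # nothing comparable
    if all(diff <= near for diff in diffs):
        return 2  # similar dates, possibly with one year missing
    if any(diff > far for diff in diffs):
        return 3  # clearly different dates
    return None  # in between: check manually
```

For example, an entry (1919-2003) matched to an item with only a birth year of 1919 falls under type 2, while the same entry matched to an item with the years 1850 and 1910 falls under type 3.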

It was suggested that entries of type 3 should go directly to "unmatched" (Topic:Vjoli2bxtekfx7kf).

"purge automatches" on the "Jobs" screen proved useful; re-checking common_names afterwards can also help. Once done, re-run "automatch by search".

Create new items for everything else

Several options:

Close

Add missing elements

  • Some of the entries listed under "Multiple external IDs for a single Wikidata item in this catalog" on the "Manual sync catalogue" page may be missing from Wikidata (sample page: https://tools.wmflabs.org/mix-n-match/#/sync/3461, where 17 of 1000 were missing). These could be added by downloading the entire catalogue and comparing it with what's on Wikidata.
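
The comparison itself can be a plain set difference. The Python sketch below assumes both sides have already been reduced to (external ID, QID) pairs, one list from the downloaded catalogue and one from a Wikidata query. The function name `missing_pairs` and the example identifiers are hypothetical.

```python
def missing_pairs(catalogue_pairs, wikidata_pairs):
    """Return catalogue (external_id, qid) pairs that are not yet on Wikidata."""
    return sorted(set(catalogue_pairs) - set(wikidata_pairs))


# Hypothetical example data: two matches in the catalogue, one already on Wikidata
catalogue = [("Q100", "Q1"), ("Q101", "Q1")]
on_wikidata = [("Q100", "Q1")]
print(missing_pairs(catalogue, on_wikidata))
```

Because whole pairs are compared, an item that already has one identifier from the catalogue is still reported when a second identifier is missing.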

Samples:

SELECT  ?item ?itemLabel ?itemDescription   ?obit ?obitLabel ?obitDescription ?value
WHERE
{
	?item wdt:P2536 ?value .	
    BIND( URI ( CONCAT("http://www.wikidata.org/entity/", ?value)) as ?obit) 
    OPTIONAL { ?obit wdt:P921 ?item }	
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}    
}

Deactivate the catalog in MxM

Once everything is matched, deactivate the catalogue in the "catalog editor", e.g. at https://tools.wmflabs.org/mix-n-match/#/catalog_editor/3461

Convert temporary statements

  • Delete temporary statements (with PetScan or QuickStatements)
SELECT ?item ?itemLabel ?itemDescription ?obit ?obitLabel ?obitDescription
WHERE
{
	?item wdt:P2536 ?value .
	BIND( URI ( CONCAT("http://www.wikidata.org/entity/", ?value)) as ?obit) 
	?obit wdt:P921 ?item  
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en"  }    
}
LIMIT 100
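
If QuickStatements is used for the deletion, the result rows of the query above could be turned into removal lines as sketched below in Python. This assumes that the temporary statements are the P2536 values on ?item, that the results are in the standard SPARQL JSON results format, and that a leading minus sign marks a statement for removal in the QuickStatements V1 format. The function name `removal_commands` is a hypothetical helper, so check a small batch before running it at scale.

```python
import re

QID = re.compile(r"Q\d+")


def removal_commands(bindings, prop="P2536"):
    """Build one '-item <TAB> property <TAB> "value"' line per result row."""
    lines = []
    for binding in bindings:
        person = binding["item"]["value"].rsplit("/", 1)[-1]
        work = binding["obit"]["value"].rsplit("/", 1)[-1]
        if QID.fullmatch(person) and QID.fullmatch(work):
            # The temporary value is a string, so it is written in double quotes
            lines.append(f'-{person}\t{prop}\t"{work}"')
    return "\n".join(lines)
```

Since the query only returns rows where the matching P921 statement already exists, nothing is removed before its replacement is in place.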

All done!

Write a short summary here and revise the steps above if needed.

Summary for catalogue 3461:

  • 45% could be matched directly by "Automatic name/date matcher"
  • 5% were matched manually to existing items
  • 50% are new items. 7.5% include identifiers from other catalogues (generally one, at most nine: Q89057265, Q89187861).

Of the 55% that were not matched automatically, maybe half were in "automatched" (mostly type 3 mentioned above) and the rest in "unmatched". Ideally, only about 5%-10% of this catalogue would have needed manual checking.

Possible improvements