Help:Add main subject with Mix-n-Match



This page describes how to add main subject (P921) statements with Mix'n'match (Q28054658) (MxM).

Wikidata includes many items describing works, especially biographies. Such an item would normally include a main subject (P921) statement pointing to the item about the person. Sample: Baker, William (Q19061914) has a statement at Q19061914#P921 with William Baker (Q15433209) as value.

The steps below outline how to add these statements with Mix'n'match.

Samples below that use catalogue 3461 or property P2536 might no longer work, as most steps have been completed. Substitute whatever property or catalogue you are using.


Select a series of items without main subject

Sample query:

# Works with genre Q309481 and instance of Q13442814 whose English label
# contains two four-digit years and which have no main subject (P921) yet
SELECT ?item ?itemLabel ?itemDescription
WHERE {
	?item wdt:P136 wd:Q309481 ; wdt:P31 wd:Q13442814 .
	?item rdfs:label ?l .
	FILTER ( lang(?l) = "en" && REGEX( ?l, "[12]\\d{3}.+[12]\\d{3}" ) )
	MINUS { ?item wdt:P921 [] }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Create a list of items

Required columns:

  • QID
  • label
  • description. Include YOB/YOD

Sample list:

Q28188074	Arthur C. Guyton	(1919-2003)
Q48041578	Arthur C. Upton	(1923-2015)
Q43731458	Arthur Cherkin	(1913-1987)
Q52406484	Arthur Keller	(1868-1934)
Q46360334	Arthur S. Keats	(1923-2007)

Make sure your labels and descriptions are suitable for creating new items.
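
The list could be prepared with a short script. The following is a minimal sketch in Python, assuming that each work's label has the shape "name (YOB-YOD)"; the function name `to_mxm_row` and that label shape are assumptions for illustration, not something Mix'n'match prescribes. Labels that do not fit the assumed shape are skipped rather than guessed.

```python
import re

# Assumed label shape: "<name> (<YOB>-<YOD>)", with a hyphen or an en dash.
LABEL_RE = re.compile(
    r"^(?P<name>.+?)\s*\((?P<yob>[12]\d{3})\s*[-–]\s*(?P<yod>[12]\d{3})\)\s*$"
)

def to_mxm_row(qid, label):
    """Return a tab-separated 'QID, label, description' row, or None."""
    match = LABEL_RE.match(label)
    if match is None:
        return None  # label does not carry both years in the assumed shape
    description = f"({match.group('yob')}-{match.group('yod')})"
    return "\t".join([qid, match.group("name"), description])

if __name__ == "__main__":
    print(to_mxm_row("Q28188074", "Arthur C. Guyton (1919-2003)"))
```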

Upload the list to MxM


Add auxiliary data

Ask Magnus to run the YOB/YOD auxiliary data creator: User talk:Magnus Manske


Wait until it matches

Look for "Automatic name/date matcher"

Gradually add main subject (P921)

Run "Manual sync catalogue"

Add P921

  • Sample query:
# Matched persons (?item) whose work (?obit) does not yet have them as main subject
SELECT ?item ?itemLabel ?itemDescription ?obit ?obitLabel ?obitDescription
WHERE {
	?item wdt:P2536 ?value .
	BIND( URI( CONCAT("", ?value) ) AS ?obit )
	MINUS { ?obit wdt:P921 ?item }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
  • Upload them, e.g. with QuickStatements
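
The query results can be turned into QuickStatements input with a few lines of Python. This is a sketch that assumes the tab-separated QuickStatements V1 format (item, property, value); the function name `p921_commands` and the input shape (pairs of work QID and person QID) are hypothetical.

```python
import re

QID_RE = re.compile(r"^Q[1-9]\d*$")

def p921_commands(pairs):
    """Build one 'work TAB P921 TAB person' line per (work_qid, person_qid) pair."""
    lines = []
    for work_qid, person_qid in pairs:
        # Skip anything that is not a plain QID instead of emitting a bad command.
        if not (QID_RE.match(work_qid) and QID_RE.match(person_qid)):
            continue
        lines.append(f"{work_qid}\tP921\t{person_qid}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Sample pair taken from the introduction of this page.
    print(p921_commands([("Q19061914", "Q15433209")]))
```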

Match remaining items manually

Several options:

Note that "Automatched" (or "preliminarily matched") includes three types of entries:

  • 1. entries where the Wikidata item has no dates.
TODO: double-check, confirm or dematch/create
  • 2. entries where the Wikidata item has similar dates (maybe off by one year or YOD missing).
TODO: double-check, confirm or dematch/create
  • 3. entries where the Wikidata item has different dates (years off by > 10, different YOB and YOD).
TODO: dematch/create

It was suggested that (3) should appear in "unmatched" directly (Topic:Vjoli2bxtekfx7kf).
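
The three types can be told apart mechanically when the years are available. The following Python sketch is one possible reading of the rules above, not how Mix'n'match classifies entries: "similar" is taken to mean every comparable year differs by at most one, "different" to mean at least one year differs by more than ten, and anything in between is reported as "unclear". The function name and return labels are hypothetical.

```python
from typing import Optional

def classify_automatch(entry_yob: Optional[int], entry_yod: Optional[int],
                       item_yob: Optional[int], item_yod: Optional[int]) -> str:
    """Classify an automatched entry against the dates on the Wikidata item."""
    if item_yob is None and item_yod is None:
        return "no_dates"  # type 1: double-check, confirm or dematch/create
    # Compare only the years present on both sides (a missing YOD is tolerated).
    diffs = [abs(a - b)
             for a, b in ((entry_yob, item_yob), (entry_yod, item_yod))
             if a is not None and b is not None]
    if not diffs:
        return "unclear"
    if all(d <= 1 for d in diffs):
        return "similar"  # type 2: double-check, confirm or dematch/create
    if any(d > 10 for d in diffs):
        return "different"  # type 3: dematch/create
    return "unclear"  # off by 2 to 10 years: not covered by the rules above
```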

"Purge automatches" on the "Jobs" screen proved useful; re-checking common_names afterwards can also help. Once done, re-run "automatch by search".

Create new items for everything else

Several options:


Add missing elements

  • Some of the entries listed under "Multiple external IDs for a single Wikidata item in this catalog" on the "Manual sync catalogue" page were missing (on a sample page, 17 of 1000). These could be added by downloading the entire catalogue and comparing it with what's on Wikidata.


# All temporary statements, with the work (?obit) where P921 is already set
SELECT ?item ?itemLabel ?itemDescription ?obit ?obitLabel ?obitDescription ?value
WHERE {
	?item wdt:P2536 ?value .
	BIND( URI( CONCAT("", ?value) ) AS ?obit )
	OPTIONAL { ?obit wdt:P921 ?item }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
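
The comparison itself is a set difference. A minimal Python sketch, assuming the downloaded catalogue and the query results have each been reduced to a plain list of external IDs (the function name and both inputs are hypothetical):

```python
def missing_from_wikidata(catalogue_ids, wikidata_values):
    """Return catalogue IDs that do not appear as a value on Wikidata."""
    # Duplicates on either side are irrelevant here, so sets are sufficient.
    return sorted(set(catalogue_ids) - set(wikidata_values))

if __name__ == "__main__":
    print(missing_from_wikidata(["Q1", "Q2", "Q3"], ["Q2"]))
```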

Deactivate the catalog in MxM

If everything is matched, then see "catalog editor", e.g. at

Convert temporary statements

  • Delete temporary statements (with PetScan or QuickStatements)
# Temporary statements whose work (?obit) already has the person as main subject
SELECT ?item ?itemLabel ?itemDescription ?obit ?obitLabel ?obitDescription
WHERE {
	?item wdt:P2536 ?value .
	BIND( URI( CONCAT("", ?value) ) AS ?obit )
	?obit wdt:P921 ?item .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
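
For the QuickStatements route, the removal commands could be generated along these lines. This is a sketch that assumes the QuickStatements V1 text format, in which a leading minus sign removes a statement and string-like values are wrapped in double quotes; the function name and input shape (pairs of person QID and the stored value) are hypothetical. Check a few lines by hand before running a large batch.

```python
def removal_commands(rows):
    """Build one '-person TAB P2536 TAB "value"' line per (person_qid, value) pair."""
    return "\n".join(
        f'-{person_qid}\tP2536\t"{value}"' for person_qid, value in rows
    )

if __name__ == "__main__":
    # Sample pair taken from the introduction of this page.
    print(removal_commands([("Q15433209", "Q19061914")]))
```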

All done!

Write a short summary here and revise the steps above if needed.

Summary for catalogue 3461:

  • 45% could be matched directly by "Automatic name/date matcher"
  • 5% were matched manually to existing items
  • 50% are new items. 7.5% including identifiers from other catalogues (generally 1, max. 9: Q89057265, Q89187861).

Of the 55% not matched directly, maybe half were in "automatched" (mostly type 3 mentioned above) and the others in "unmatched". Ideally, for this catalogue, maybe 5%-10% would have had to be checked manually.

Possible improvements