User:Geertivp/training/Wikidata Query
Wikidata Query is a tool to query Wikidata. It uses SPARQL to process your query on RDF data.
The result of a query can be fed back into Wikidata to automatically add missing data or to correct errors, inconsistencies, and constraint violations.
This way you can start a cycle of continuous improvement. You will typically use QuickStatements to process your transactions.
Here I will show a simple example of how to create missing labels in a target language. Other queries are available from Wikidata Queries.
Why do we need Wikidata Query to verify data quality and completeness?
Constraints
- Wikidata is extremely open
- Anyone can edit
- Constraints are not checked proactively, so violations only become visible after the data is saved
Culture
The importance of languages:
- Multilingual countries
- English (EN) is the most used language
- Get your language/culture known in EN, so that others will translate it or build on it in their own language
- Add a WM link
Search for missing labels in a target language
Search for items that have Labels in other languages but not in the target language. The results can be used as input for QuickStatements 2 (Q29032512) (see QuickStatements). This way you can semi-automatically create Labels (and the Description for the item) in any target language.
Example query
- Copy the following code to https://query.wikidata.org and click on Execute query:
The following query uses these:
- Items: human (Q5) , Belgium (Q31)
- Properties: instance of (P31) , country of citizenship (P27)
    # Duplicate Labels to other languages
    SELECT ?item ?itemLabel ?itemDescription WHERE {
      ?item wdt:P31 wd:Q5.   # instance of human
      ?item wdt:P27 wd:Q31.  # country of citizenship Belgium
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en,nl,fr,de,it,lu,es,no,pt". }
      FILTER(NOT EXISTS {
        ?item rdfs:label ?lang_label.
        FILTER(LANG(?lang_label) = "en")  # with missing English label
      })
    }
    ORDER BY ?itemLabel
Results
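To create similar queries for another target language or country, the example query can be generated from a small template. The following is a minimal sketch in Python, not part of the original training material. The function name `build_missing_label_query` and its parameters are hypothetical. The default language list is copied from the example query.

```python
import re

# Assumption: default fallback languages copied from the example query.
DEFAULT_LABEL_LANGS = "en,nl,fr,de,it,lu,es,no,pt"


def build_missing_label_query(target_lang, country_qid,
                              label_langs=DEFAULT_LABEL_LANGS):
    """Return SPARQL text listing humans with the given citizenship
    that have no label in target_lang (hypothetical helper)."""
    # Validate the inputs so that nothing unexpected ends up in the query text.
    if not re.fullmatch(r"Q[1-9][0-9]*", country_qid):
        raise ValueError(f"not a Q-number: {country_qid!r}")
    if not re.fullmatch(r"[a-z]{2,3}(-[a-z0-9]+)*", target_lang):
        raise ValueError(f"not a language code: {target_lang!r}")
    if not re.fullmatch(r"[a-z0-9,-]+", label_langs):
        raise ValueError(f"not a language list: {label_langs!r}")

    # Doubled braces are literal braces inside an f-string.
    return f"""SELECT ?item ?itemLabel ?itemDescription WHERE {{
  ?item wdt:P31 wd:Q5.  # instance of human
  ?item wdt:P27 wd:{country_qid}.  # country of citizenship
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{label_langs}". }}
  FILTER(NOT EXISTS {{
    ?item rdfs:label ?lang_label.
    FILTER(LANG(?lang_label) = "{target_lang}")  # missing target label
  }})
}}
ORDER BY ?itemLabel"""


if __name__ == "__main__":
    # Belgians (Q31) without a Dutch label; paste the output into the query editor.
    print(build_missing_label_query("nl", "Q31"))
```

The printed text can be pasted into https://query.wikidata.org in the same way as the example query.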
Column | Description |
---|---|
item | Q-number |
itemLabel | Label in source language |
itemDescription | Description in source language |
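If the result is downloaded as JSON instead of being copied from the screen, the three columns can be extracted with a few lines of Python. This is a sketch that assumes the standard SPARQL JSON results layout (`results` → `bindings`). The function name `rows_from_sparql_json` and the sample data are illustrative only, not real query output.

```python
def rows_from_sparql_json(result):
    """Turn a SPARQL JSON result into (Q-number, label, description) tuples."""
    rows = []
    for binding in result["results"]["bindings"]:
        # The item is returned as a full entity URI; keep only the Q-number.
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        # Variables without a value are simply absent from the binding.
        label = binding.get("itemLabel", {}).get("value", "")
        description = binding.get("itemDescription", {}).get("value", "")
        rows.append((qid, label, description))
    return rows


# Illustrative sample in the SPARQL JSON results layout (made-up output).
SAMPLE_RESULT = {
    "head": {"vars": ["item", "itemLabel", "itemDescription"]},
    "results": {"bindings": [
        {
            "item": {"type": "uri",
                     "value": "http://www.wikidata.org/entity/Q16526046"},
            "itemLabel": {"type": "literal", "value": "Aaron Botterman"},
            "itemDescription": {"type": "literal", "value": "Belgian athlete"},
        },
        {
            # No description, and the label column only repeats the Q-number.
            "item": {"type": "uri",
                     "value": "http://www.wikidata.org/entity/Q4115189"},
            "itemLabel": {"type": "literal", "value": "Q4115189"},
        },
    ]},
}

if __name__ == "__main__":
    for row in rows_from_sparql_json(SAMPLE_RESULT):
        print(row)
```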
Process
- Export to Excel (the download can garble accented UTF-8 characters; use copy/paste instead)
- Remove rows with missing labels
- Remove rows with missing descriptions
- Translate descriptions (use Wikidata)
- Prepare a QuickStatements load file
- Execute the transactions (copy/paste to QuickStatements ⇒ try one row first)
- Verify the results
- Manually correct any errors
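The filtering and load-file steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `quickstatements_v1` is hypothetical, the descriptions are assumed to be translated already, and the output is assumed to be the tab-separated V1 format with the value in double quotes. Rows whose label is just a Q-number are treated as having no label, on the assumption that the label service fell back to the item ID. Values containing a double quote or a tab are skipped for manual handling, because escaping is not covered here.

```python
import re

QID_RE = re.compile(r"Q[1-9][0-9]*")


def quickstatements_v1(rows, lang):
    """Build separate label and description command lists (V1 format assumed).

    rows: iterable of (qid, label, description) tuples.
    lang: target language code, for example "en".
    """
    label_lines = []
    description_lines = []
    for qid, label, description in rows:
        # Remove rows with missing labels (empty, or only a Q-number fallback).
        if not label or QID_RE.fullmatch(label):
            continue
        # Remove rows with missing descriptions.
        if not description:
            continue
        # Conservative assumption: leave awkward values for manual editing.
        if any(ch in value for value in (label, description) for ch in '"\t'):
            continue
        label_lines.append(f'{qid}\tL{lang}\t"{label}"')
        description_lines.append(f'{qid}\tD{lang}\t"{description}"')
    return label_lines, description_lines


if __name__ == "__main__":
    example_rows = [
        ("Q16526046", "Aaron Botterman", "Belgian athlete"),
        ("Q4115189", "Q4115189", ""),  # illustrative row without a usable label
    ]
    labels, descriptions = quickstatements_v1(example_rows, "en")
    print("\n".join(labels))
    print("\n".join(descriptions))
```

Keeping the labels and the descriptions in two separate lists makes it easy to paste and verify one small batch at a time, starting with a single row.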
Load file example
QuickStatements accepts two transaction formats: V1 and V2.
Execute transactions via https://tools.wmflabs.org/quickstatements/ (short user guide included). First create the Labels:
Q16526046 Len "Aaron Botterman" ...
and afterwards the Descriptions:
Q16526046 Den "Belgian athlete" ...
Authentication
- You need WiDaR to authorize your QuickStatements session
- Transactions are logged under your userID
Known problems
- Import accented characters using the proper UTF-8 character set
- Use the Lxx and Dxx commands separately (otherwise only the first operation is executed)
- V2 allows for multiple properties to be added, one after the other in columnar format
- V2 requires """string""" triple-string-quoting
- A network problem can stop the processing; once the network connection is re-established, process only the rest of the file
- Run large transaction files off-line when possible
- Wikidata Query runs on a replica of the live database, so it can lag a few minutes behind live Wikidata edits made through QuickStatements (before verifying your results with Wikidata Query you may need to wait up to 5 minutes). Check "View history" to be sure.
Tips
- You can set your GUI language (same as in Wikidata); this makes it easier to work with Properties
- Preferably use English as the target language; it has the most items/users ⇒ the chance that your item is amended in yet another language is higher
- You can easily take one query as an example and change a few properties/values to create similar queries
- You can use checkConstraints to see constraint problems
See also
- Wikidata
- SPARQL
- Wikidata:SPARQL
- Wikidata:SPARQL query service/queries
- Wikidata:SPARQL query service/queries/examples
External links
- Tools
- Documentation
- d:Wikidata:SPARQL tutorial
- b:SPARQL
- mw:Wikibase/Indexing/RDF Dump Format
- https://www.w3.org/TR/sparql11-overview/
- https://www.w3.org/TR/sparql11-query/
- https://docs.data.world/tutorials/sparql/
- Other
- https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Submissions/Using_Wikidata_Query_and_QuickStatements_to_automatically_amend_Wikidata_items
- https://commons.wikimedia.org/wiki/File:Wikidata-20171028-WMBE-Query-QuickStatements.pdf
- http://magnusmanske.de/wordpress/?p=72
- Obsolete