Wikidata:SPARQL federation input

One of the useful features of SPARQL is federation, which lets you query several SPARQL endpoints together and get a combined result. To better integrate the data in Wikidata with other linked data sources, we plan to enable SPARQL federation on the Wikidata Query Service for a selected number of other SPARQL endpoints. For security and performance reasons, we cannot allow arbitrary endpoints without filtering, so we need a whitelist of approved endpoints. This page is for nominating and discussing which endpoints should be supported. Currently supported endpoints are listed in the User Manual.

The suggested SPARQL endpoints must satisfy the following conditions:

  • Complies with the SPARQL 1.1 protocol, "query operation" part, at least to the extent necessary to make a federated SERVICE clause work (most SPARQL endpoints do).
  • Contains data that can be linked to Wikidata, i.e., it either contains Wikidata IDs or can be queried by values stored in one of the Wikidata properties.
  • Has data freely available under a license compatible with CC0 (preferred) or another free database license allowing unrestricted reuse. Attribution licenses like CC-BY are OK too. Currently, we do not accept endpoints with reuse restrictions such as NC/ND clauses.
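
To illustrate the first two conditions, here is a minimal sketch of how a federated query with a SERVICE clause could be assembled and sent as a SPARQL 1.1 protocol "query operation". The allowlist contents, the `https://example.org/sparql` endpoint, the helper names, and the assumption that the remote data links to Wikidata entity URIs via `owl:sameAs` are all hypothetical; a real endpoint may use a different linking predicate or an identifier property instead.

```python
import re
from urllib.parse import urlencode

# Hypothetical allowlist for illustration only; the real list of
# supported endpoints is maintained in the User Manual.
ALLOWED_ENDPOINTS = {"https://example.org/sparql"}

WDQS_URL = "https://query.wikidata.org/sparql"


def build_federated_query(remote_endpoint, class_qid, limit=10):
    """Build a query that joins Wikidata items with a remote endpoint."""
    if remote_endpoint not in ALLOWED_ENDPOINTS:
        raise ValueError(f"endpoint is not on the allowlist: {remote_endpoint}")
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a Wikidata item ID: {class_qid}")
    # Assumption: the remote data points at Wikidata entity URIs via owl:sameAs.
    return f"""PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?item ?remote WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  SERVICE <{remote_endpoint}> {{
    ?remote owl:sameAs ?item .
  }}
}}
LIMIT {int(limit)}"""


def build_request_url(query):
    """Encode the query as a SPARQL 1.1 protocol GET request (query operation)."""
    return WDQS_URL + "?" + urlencode({"query": query})
```

Calling `build_request_url(build_federated_query("https://example.org/sparql", "Q5"))` yields a URL that could be fetched with any HTTP client. The sketch only builds the request and does not contact any endpoint.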

Please post the URL of the endpoint, a short description of it, and, if available, the URL of its documentation. Check first that the endpoint is not already in use or rejected. Thank you for helping to improve Wikidata.

Suggestions

Licence suitable

Endpoints that are immediately suitable for inclusion.

Attribution licences (like CC-BY)

Attribution licenses are OK too; we will acknowledge them on the licensing page for the service. If for some reason such an acknowledgement is not enough, please do not add the endpoint here.

Licence unclear

The license status of these endpoints is unclear; please help us figure it out.

LOD Cloud Cache

Endpoint
http://lod.openlinksw.com/sparql
Documentation
Licence
Background
https://lists.w3.org/Archives/Public/public-lod/2013May/0154.html
https://sourceforge.net/p/virtuoso/mailman/message/32005015/

FactForge

SPARQL Endpoint
http://factforge.net/sparql
Federation endpoint
http://factforge.net/repositories/ff-news
Documentation
http://factforge.net/about
Background
FactForge is a large-scale public demonstrator of many of GraphDB's advanced features: reasoning, geo-spatial indexing, RDFRank, full-text search connectors and owl:sameAs optimization. It loads several LOD datasets into a single GraphDB repository. On top of that, cleanup and other corrections are applied to some of these datasets and ontologies.

3cixty

Endpoint
https://kb.3cixty.com/sparql
Documentation
http://www.eurecom.fr/~troncy/Publications/Rizzo_Troncy-iswc15swc.pdf
Licence
Background
https://www.3cixty.com

3cixty provides comprehensive knowledge bases covering entire territories and cities. It contains millions of triples describing the points of interest, local businesses and events happening in the city. The knowledge base is updated every night. The SPARQL endpoint has had 99% availability over the past two years.

--Rtroncy (talk) 20:02, 23 February 2017 (UTC)

@Rtroncy: any idea about the licensing terms? --Smalyshev (WMF) (talk) 19:58, 11 April 2017 (UTC)
@Smalyshev: Sorry for the late reply, strangely, I didn't get any notifications! I control the endpoint. What license would be suitable for you? Rtroncy (talk) 07:35, 6 June 2017 (UTC)
@Rtroncy: CC0 ideally, but we agreed that CC-BY would be fine too, if you're OK with an acknowledgement like the one here: https://query.wikidata.org/copyright.html --Smalyshev (WMF) (talk) 22:03, 6 June 2017 (UTC)

Not suitable

Endpoint suggestions rejected for license or other reasons. May be reconsidered if license or circumstances change.

data.admin.ch

SPARQL Endpoint
http://data.admin.ch/query/
Documentation
http://data.admin.ch/ and https://github.com/zazuko/fso-lod/blob/master/doc/eCH0071/sparql.md
Licence
https://opendata.swiss/en/dataset/historicized-municipalities-register
Background
Contains data from Swiss government agencies as Linked Data, in particular from the Swiss Federal Statistical Office (FSO). Where possible, URIs on this endpoint link to Wikidata URIs.
@TheKtk: Unfortunately, the License link above returns 404. Could you update it? (done)

The license requires prior permission for commercial use. Since Wikidata has no way of separating commercial and non-commercial users, this creates uncertainty for us which I'd rather avoid. Smalyshev (WMF) (talk) 21:15, 28 July 2018 (UTC)

geo.admin.ch Linked Data Service

SPARQL Endpoint
https://ld.geo.admin.ch/query
Documentation
https://ld.geo.admin.ch/
Licence
https://opendata.swiss/en/dataset/swissboundaries3d-gemeindegrenzen
Background
Linked Data representation of the swissBOUNDARIES 3D dataset by geo.admin.ch, the Swiss federal geoportal. URIs link to Wikidata URIs where appropriate, which is useful for visualizing data related to Swiss entities. One can get up-to-date shapes of these entities as Well-Known Text (WKT).
@TheKtk: Unfortunately, the License link above returns 404. Could you update it? (done)

External lists

Other discussion

Please discuss on the talk page.

Incoming nominations

Nominations are initially placed here and then sorted and moved into the specific sections above.

European Patent Office Linked open EP data Service

SPARQL Endpoint
https://data.epo.org/linked-data/query
Documentation
https://data.epo.org/linked-data/
Item about database/website/endpoint
European Patent Office (Q7132151)
Licence
Creative Commons Attribution 4.0 International (see https://www.epo.org/searching-for-patents/data/linked-open-data.html#tab-2)
Background
Just noticed that the European Patent Office published their data as linked open data under a free license. The only relevant property I could find is patent number (P1246), but maybe a dedicated property will be available in the future. Multichill (talk) 18:08, 8 September 2018 (UTC)

Data ISTEX

SPARQL Endpoint
https://data.istex.fr/sparql/
Documentation
https://blog.istex.fr/category/istex-lod/sparql/
Item about database/website/endpoint
CNRS (Q280413)
Licence
Open License (Q3238028)
Background

Bibliographic metadata, with various alignments on GeoNames ID (P1566), shares border with (P47), instance of (P31), located in the administrative territorial entity (P131), Library of Congress authority ID (P244), BnF ID (P268), VIAF ID (P214), coordinate location (P625), population (P1082), ...