Wikidata:Report a technical problem/WDQS and Search


id missing in result

# Mandals in Andhra Pradesh with district and tewiki page_title

SELECT ?mandal ?mandalLabel ?districtLabel ?page_titleTE  WHERE {
  ?mandal wdt:P31 wd:Q817477;
        p:P131 ?districtstate.
  ?districtstate ps:P131 wd:Q15394.

  MINUS {?districtstate pq:P582 ?endTime.}
  ?article schema:about ?mandal;
           schema:isPartOf <https://te.wikipedia.org/>;
           schema:name ?page_titleTE.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "te,en".}

}

This query should return 43 results but returns only 42; Q59902712 is missing. That item was restored after having been redirected a week ago, because it is used in OSM. --Arjunaraoc (talk) 11:22, 21 July 2021 (UTC)

@Arjunaraoc: Should be fixed now - diff. Q59902712 had an old value for tehsil - Q13005188, which had not been updated to the current value of Q817477. --Tagishsimon (talk) 12:07, 21 July 2021 (UTC)[reply]

Problem with the buttons "Search(results)" and "hide/show pagination"

Good morning from Vienna,

numerous times I tried to enter the following query:

"SELECT ?item WHERE {

 ?item wdt:P31 wd:Q3305213.
 FILTER NOT EXISTS  {?item wdt:P1257 ?icon}
 }"

Results appeared after 12 to 45 seconds (average: 23).

The list starts with "Q72650". I noticed that there are several paintings without an Iconclass statement that have an item number lower than 72650, e.g. Q23912. I therefore wanted to search for that item in the results panel, or to press the hide/show pagination button. In either case the browser displayed the warning "page(site) does not respond. you can wait or leave". Waiting did not solve the problem, so I had to give up. What could be the reason for this? Regards, Christoph

Cross posted to Request a Query, where it's being dealt with. Please raise issues in one place only. --Tagishsimon (talk) 10:19, 21 August 2021 (UTC)[reply]

The query does not return the pseudonym property (wdt:P742) or the ISNI

SELECT distinct ?author ?authorLabel ?dob ?dod ?birthplaceLabel ?goodreads ?countryLabel ?workLabel ?isniLabel
(GROUP_CONCAT(?occupationLabel; separator = ", ") as ?occupationLabels)
(GROUP_CONCAT(?educatedLabel; separator = ", ") as ?educatedLabels )
(GROUP_CONCAT(?nomineeLabel; separator = ", ") as ?nomineeLabels)
(GROUP_CONCAT(?awardLabel; separator = ", ") as ?awareLabels )
(GROUP_CONCAT(?genreLabel; separator = ", ") as ?genreLabels )
(GROUP_CONCAT(?pseudonymLabel; separator = ", ") as ?pseudonymLabels )
WHERE
{
  ?author rdfs:label "Jorge Luis Borges"@es .
  ?author rdfs:label ?authorLabel filter (lang(?authorLabel)='es').
  OPTIONAL {?author wdt:P569 ?dob } .
  OPTIONAL {?author wdt:P570 ?dod } .
  OPTIONAL {?author wdt:P19 ?birthplace .?birthplace rdfs:label ?birthplaceLabel filter (lang(?birthplaceLabel) = "es")}.
  OPTIONAL {?author wdt:P2963 ?goodreads } .
  OPTIONAL {?author wdt:P27 ?country .?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "es")}.
  OPTIONAL {?author wdt:P742 ?pseudonym .?pseudonym rdfs:label ?pseudonymLabel }.
  OPTIONAL {?author wdt:P106 ?occupation .?occupation rdfs:label ?occupationLabel filter (lang(?occupationLabel) = "es")} .
  OPTIONAL {?author wdt:P101 ?work .?work rdfs:label ?workLabel filter (lang(?workLabel) = "es")}.
  OPTIONAL {?author wdt:P1411 ?nominee .?nominee rdfs:label ?nomineeLabel filter (lang(?nomineeLabel) = "es")}.
  OPTIONAL {?author wdt:P69 ?educated .?educated rdfs:label ?educatedLabel filter (lang(?educatedLabel) = "es")}.
  OPTIONAL {?author wdt:166 ?award .?award rdfs:label ?awardLabel}.
  OPTIONAL {?author wdt:213 ?isni .?isni rdfs:label ?isniLabel}.
  OPTIONAL {?author wdt:136 ?genre .?genre rdfs:label ?genreLabel filter (lang(?genreLabel) = "es")}.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" }
} GROUP BY ?author ?authorLabel ?dob ?dod ?birthplaceLabel ?goodreads ?countryLabel ?workLabel ?isniLabel

Various things to change. ISNI and Pseudonym do not have labels. Various predicates were missing the letter P. Please consider using Wikidata:Request a query for this sort of question in future; thx.
SELECT distinct  ?author ?authorLabel ?dob ?dod ?birthplaceLabel ?goodreads ?countryLabel  ?workLabel ?isni
(GROUP_CONCAT(DISTINCT ?occupationLabel; separator = ", ") as ?occupationLabels) 
(GROUP_CONCAT(DISTINCT ?educatedLabel; separator = ", ") as ?educatedLabels ) 
(GROUP_CONCAT(DISTINCT ?nomineeLabel; separator = ", ") as ?nomineeLabels) 
(GROUP_CONCAT(DISTINCT ?awardLabel; separator = ", ") as ?awardLabels )
(GROUP_CONCAT(DISTINCT ?genreLabel; separator = ", ") as ?genreLabels )
(GROUP_CONCAT(DISTINCT ?pseudonym; separator = ", ") as ?pseudonyms )
WHERE 
{ 
  ?author rdfs:label "Jorge Luis Borges"@es .
  ?author rdfs:label ?authorLabel filter (lang(?authorLabel)='es').
  OPTIONAL {?author wdt:P569 ?dob } .
  OPTIONAL {?author wdt:P570 ?dod } .
  OPTIONAL {?author wdt:P19 ?birthplace .?birthplace rdfs:label ?birthplaceLabel filter (lang(?birthplaceLabel) = "es")}.
  OPTIONAL {?author wdt:P2963 ?goodreads } . 
  OPTIONAL {?author wdt:P27 ?country .?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "es")}. 
  OPTIONAL {?author wdt:P742 ?pseudonym }. 
  OPTIONAL {?author wdt:P106 ?occupation .?occupation rdfs:label ?occupationLabel filter (lang(?occupationLabel) = "es")} . 
  OPTIONAL {?author wdt:P101 ?work .?work rdfs:label ?workLabel filter (lang(?workLabel) = "es")}. 
  OPTIONAL {?author wdt:P1411 ?nominee .?nominee rdfs:label ?nomineeLabel filter (lang(?nomineeLabel) = "es")}. 
  OPTIONAL {?author wdt:P69 ?educated .?educated rdfs:label ?educatedLabel filter (lang(?educatedLabel) = "es")}. 
  OPTIONAL {?author wdt:P166 ?award . ?award rdfs:label ?awardLabel  filter (lang(?awardLabel) = "es")}. 
  OPTIONAL {?author wdt:P213 ?isni }. 
  OPTIONAL {?author wdt:P136 ?genre .?genre rdfs:label ?genreLabel filter (lang(?genreLabel) = "es")}. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" }
} GROUP BY ?author ?authorLabel ?dob ?dod ?birthplaceLabel ?goodreads ?countryLabel ?workLabel ?isni
--Tagishsimon (talk) 16:37, 29 October 2021 (UTC)[reply]

query server data outdated (8 Dec 2021)

I seem to be getting stale data: the results do not reflect edits that were already made yesterday. --- Jura 10:06, 8 December 2021 (UTC)

Is there anything specific you found that was not updated, so that we can have a deeper look at the problem? The known problems that we are currently working on, and that might be related to what you've seen, are:
  • Implement a solution to rapidly identify & restart WDQS servers hit by deadly queries (phab:T293862). Currently some servers may enter a zombie state for quite some time (several hours) possibly exposing stale results when they resurrect.
  • Implement a reconciliation strategy for the streaming updater (phab:T279541). This is about being more reactive to inconsistencies between wikidata & WDQS.
Please let me know if you think that the problem you encountered is unrelated to the two items above, Thanks! DCausse (WMF) (talk) 14:46, 13 December 2021 (UTC)[reply]
Possibly T279541 solves it, but I don't really see how I could determine that.
In the meantime, it was updated on WDQS. Is there some information that I should gather when reporting this? The icon on WDQS was green.
@DCausse (WMF): --- Jura 12:47, 16 December 2021 (UTC)[reply]
As reported elsewhere, the icon is sadly not a good indication, esp. if only one server is lagging. The exact time at which the problem was encountered will help us find a correlation with problematic servers, if any. If the problem persists even after several hours, it might be an inconsistency; reporting the affected Wikidata item will help us debug the issue. Thanks! DCausse (WMF) (talk) 15:35, 16 December 2021 (UTC)
Is there a way for me to determine which server it may be?
@DCausse (WMF): --- Jura 17:52, 16 December 2021 (UTC)[reply]
From the query service perspective there is no way to know whether the results seen come from a lagging server; this is something we may address at some point and is tracked under phab:T278246.
The only way for us is to correlate reports with the lag reported in https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1. DCausse (WMF) (talk) 10:48, 17 December 2021 (UTC)[reply]
Ok, so I guess it's worth reporting if it can't be found there or if it starts getting overly long.
Looking at [1] now, it seems I could probably have found it there. --- Jura 13:56, 17 December 2021 (UTC)[reply]
Added it to the notice for this page: [2]. --- Jura 11:21, 18 December 2021 (UTC)[reply]

make sitelink redirect status queryable (December 11, 2021)

To work on items without any claims, and possibly for other queries, it would be helpful if one could directly query whether a sitelink is a redirect, so that such sitelinks can be excluded.

This would be similar to the automatic deletion of any sitelink (when it's deleted on a client wiki) or the update (when it's being moved on a client wiki).

An update on a client wiki could add or remove this with an automatic badge.

Scope of the question: there are currently 495,000 redirects in sitelinks, and a feature allowing them hasn't even been implemented.

This is independent of any other development about redirects (should they be possible, how to implement them, what (manual) badges to use to qualify them, etc).

@MisterSynergy, Epìdosis, DCausse (WMF): --- Jura 18:46, 11 December 2021 (UTC)[reply]

It would IMO be sufficient to make the manual redirect badges sitelink to redirect (Q70893996) and intentional sitelink to redirect (Q70894304) actually usable without any hacks. I would be able to create a bot that syncs these with the existing ~0.5M redirects in no time, and that keeps it synced on a daily basis or so. Unfortunately, we still cannot save redirect sitelinks without a temporary deactivation of the redirect on the client wiki (which some client wiki communities consider as vandalism if done systematically). —MisterSynergy (talk) 19:00, 11 December 2021 (UTC)[reply]
I surely agree with MisterSynergy, having the possibility to add redirect-sitelinks with no further problem would be very very useful to work on these items. --Epìdosis 19:07, 11 December 2021 (UTC)[reply]
I don't see an advantage of having to check several badges and having to wait for a bot to resync them.
Wikidata's key assumption that there is a page on a client wiki at the sitelink is currently broken. --- Jura 19:09, 11 December 2021 (UTC)[reply]
@DCausse (WMF), Lydia Pintscher (WMDE): which way do you prefer? Neither seems to require big developments and both approaches would lead to an improvement in data quality. --- Jura 13:12, 4 January 2022 (UTC)[reply]
From what I understand, what you suggest does not change the mw:Wikibase/Indexing/RDF_Dump_Format, so this is already a good thing. As for how changes are propagated between client wikis and Wikidata, I'm not knowledgeable enough to judge whether it's easily feasible/fixable or not. I think that the scope of the problem you raise is broader than just WDQS and probably deserves to be discussed on a page not dedicated solely to WDQS. DCausse (WMF) (talk) 09:28, 5 January 2022 (UTC)
I asked Lydia for input as well.
The relevant part of mw:Wikibase/Indexing/RDF_Dump_Format#Sitelinks is that wikibase:badge will be present for these. The idea is to find a pragmatic solution that doesn't require a lot of development. --- Jura 10:12, 5 January 2022 (UTC)[reply]

delete triples from WDQS: ?a rdf:type schema:Article

SELECT (COUNT(*) as ?count) 
{
  [] rdf:type schema:Article
}

The above counts some 80,000,000 triples. These are part of the sitelink mapping on items. Sample:


SELECT * { <https://en.wikipedia.org/wiki/Wikidata> ?a ?b }

I don't think they serve any real purpose. I suggest no longer exporting them to WDQS going forward and deleting the existing ones, after announcing the change, obviously. --- Jura 20:06, 22 December 2021 (UTC)

I also see little value to these particular triples but changing the RDF dump format is something that takes time & effort (communication/tests/migration). On the other hand this seems a very valid candidate for deletion if we ever enact the Wikidata:SPARQL_query_service/Blazegraph_failure_playbook, please feel free to forward your suggestion to its talk page, thanks! DCausse (WMF) (talk) 10:08, 3 January 2022 (UTC)[reply]
I think it's a cleanup that can be done even without Blazegraph about to fail. --- Jura 00:37, 4 January 2022 (UTC)[reply]

index "street address" (P6375) strings

It would be interesting to be able to search for street address (P6375)-values, e.g. Special:Search/Getreidegasse Salzburg should find Q37970995. --- Jura 19:11, 28 December 2021 (UTC)[reply]

Thanks for the feedback, I agree that the current status quo about textual data in the search index (in general) is not ideal. We created phab:T240334 a while ago but given the lack of hardware resources we did not investigate further. But thanks to phab:T265621 (new dedicated search cluster for wikidata & commons) we might reconsider this and index more textual content. DCausse (WMF) (talk) 09:53, 3 January 2022 (UTC)[reply]
What would be the relative impact? We do seem to have a large number of statements with P6375, but for buildings (sample Q37970995 above), much would otherwise need to be added to the description and/or as alias. --- Jura 00:35, 4 January 2022 (UTC)
Impact is hard to evaluate beforehand. The two main things we will look at are the space requirement (this is where having a dedicated search cluster will help) and how it affects precision: by improving recall we will certainly degrade precision a bit, and we should make sure it's not degraded too much. DCausse (WMF) (talk) 09:07, 5 January 2022 (UTC)
I suppose, in the meantime, we would have to copy the building addresses to aliases. --- Jura 10:14, 5 January 2022 (UTC)[reply]
@DCausse (WMF) where can I find the list of string properties that are currently indexed? --- Jura 11:59, 5 January 2022 (UTC)[reply]
The properties indexed (and searchable via haswbstatement) are the ones with the following data types:
  • string
  • external-id
  • url
  • wikibase-item
  • wikibase-property
  • wikibase-lexeme
  • wikibase-form
  • wikibase-sense
Minus these properties (that are explicitly excluded):
The intent here was to index only properties that do not contain natural language (only codes, IDs and the like). It is very likely that switching from P969 to P6375 made this particular property unsearchable.
The purpose of phab:T240334 is specifically to make these other datatypes searchable. DCausse (WMF) (talk) 12:36, 6 January 2022 (UTC)[reply]
It may be that addresses were indexed when we used Property:P969 (string datatype). This was however lost when the content was moved to Property:P6375 (monolingual text datatype). --- Jura 12:48, 5 January 2022 (UTC)[reply]
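The datatype rule described in this thread can be sketched as a small lookup. This is only an illustration under stated assumptions: the set below is copied from the list above, the explicitly excluded properties are not modelled because that list is missing here, `monolingualtext` is assumed to be the datatype identifier for P6375, and `is_statement_searchable` and `haswbstatement_query` are hypothetical helper names, not part of any Wikidata tool.

```python
# Datatypes listed in this thread as indexed and searchable via haswbstatement.
INDEXED_DATATYPES = frozenset({
    "string",
    "external-id",
    "url",
    "wikibase-item",
    "wikibase-property",
    "wikibase-lexeme",
    "wikibase-form",
    "wikibase-sense",
})


def is_statement_searchable(datatype: str) -> bool:
    """Return True if statements of this datatype are indexed per the list above.

    Assumption: the explicitly excluded properties are ignored, because the
    exclusion list is not available in this thread.
    """
    return datatype in INDEXED_DATATYPES


def haswbstatement_query(property_id: str, value: str) -> str:
    """Build a simple haswbstatement search term (hypothetical helper).

    Values containing spaces may need extra quoting; that is not handled here.
    """
    return f"haswbstatement:{property_id}={value}"


if __name__ == "__main__":
    # P969 was a string property, P6375 is monolingual text (per the thread).
    print(is_statement_searchable("string"))
    print(is_statement_searchable("monolingualtext"))
    print(haswbstatement_query("P31", "Q5"))
```

Under these assumptions the lookup reproduces the observation in the thread: the string datatype is covered, the monolingual text datatype is not.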

SPARQL queries with MWAPI EntitySearch do not use the continue mechanism


The SPARQL API does multiple queries to the MWAPI using the continue mechanism by default. However, that is not the case with the MWAPI EntitySearch.

For instance, this query which uses the MWAPI EntitySearch always returns at most 50 results (depending on the mwapi:search parameter):

SELECT ?item ?itemLabel WHERE {
SERVICE wikibase:mwapi {
  bd:serviceParam wikibase:endpoint "www.wikidata.org";
  wikibase:api "EntitySearch";
  mwapi:search "York"; 
  mwapi:language "en".
  ?item wikibase:apiOutputItem mwapi:item.
}  
SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
}

A query to MWAPI EntitySearch returns a maximum of 50 results, but action=wbsearchentities supports continue, so when querying it outside of a SPARQL query, one can obtain n times 50 results.

However, there seems to be no automatic continuation for the MWAPI EntitySearch inside of a Wikidata SPARQL query, judging from the behavior.

Is that a bug or is that intended behavior? In case it's a bug: Can it be fixed?

Breslibomy (talk) 12:46, 5 January 2022 (UTC)[reply]

Indeed, this is unfortunately a known limitation of the current implementation, please see phab:T229291. DCausse (WMF) (talk) 13:04, 6 January 2022 (UTC)[reply]
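Until that limitation is lifted, more than 50 results can be collected outside SPARQL by following the continue offset of action=wbsearchentities directly, as the question already notes. The following is a minimal sketch, not an official client: `search_entities` and `fetch_json` are hypothetical names, the User-Agent string is a placeholder, and the sketch assumes the response carries the next offset in a `search-continue` field.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator

API_URL = "https://www.wikidata.org/w/api.php"


def fetch_json(params: Dict[str, Any]) -> Dict[str, Any]:
    """Perform one GET request against the Wikidata API and decode the JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace with something identifying your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "continue-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def search_entities(
    term: str,
    language: str = "en",
    max_results: int = 200,
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]] = fetch_json,
) -> Iterator[str]:
    """Yield entity IDs for a search term, following the continue offset."""
    offset = 0
    yielded = 0
    while yielded < max_results:
        data = fetch({
            "action": "wbsearchentities",
            "format": "json",
            "search": term,
            "language": language,
            "limit": 50,
            "continue": offset,
        })
        for hit in data.get("search", []):
            yield hit["id"]
            yielded += 1
            if yielded >= max_results:
                return
        # Assumption: no "search-continue" key means there are no more pages.
        if "search-continue" not in data:
            return
        offset = data["search-continue"]


if __name__ == "__main__":
    for entity_id in search_entities("York", max_results=120):
        print(entity_id)
```

The `fetch` parameter is injectable so the paging logic can be exercised without network access; the collected IDs could then be passed to a SPARQL query through a VALUES clause.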

A question about the background map

When I run a query with #defaultView:Map I get a background map with a lot of rendering errors in it. That name labels are partly missing or hidden I can live with. But when lakes like Mälaren (Q184492), Vänern (Q173596) and Vättern (Q188195) are rendered as land and not water it gets problematic. Especially when I work with lighthouses and need to see the shoreline. Remarkably Hjälmaren (Q211425) and smaller lakes show up just fine. All lakes are rendered just fine on openstreetmap.org, so there should not be any modeling error. I also note that Lake Huron (Q1383) and Lake Erie (Q5492) are also rendered correctly, so the problem is not related to size. Is there perhaps a limit on how many nodes a polygon can contain to render correctly? /ℇsquilo 12:31, 11 February 2022 (UTC)[reply]

This query shows the problem for the 3 Swedish lakes not shown as water, and Hjälmaren (Q211425), which is OK:
#defaultView:Map
SELECT ?lake ?coordinates
WHERE
{
  VALUES ?lake { wd:Q184492 wd:Q173596 wd:Q188195 wd:Q211425 }
  ?lake wdt:P625 ?coordinates
}
Compare with https://www.openstreetmap.org/#map=7/59.590/15.276 --Dipsacus fullonum (talk) 10:13, 13 February 2022 (UTC)[reply]
Tickets have been raised for a number of map issues, such as T288897 Wikimedia map tiles don't show some natural features (e.g. lakes) after zoom 10, T240755 Victoria lake is missing in our maps which points to current action on the issue at T218097 OSM DB degradation during sync as a result of missing features, as well as my own modest and mostly unheeded T289101 Bring WMF map tile feature sets into line with OSM default feature sets. 11 January 2022 seems to be the most recent WMF activity on a couple of those threads indicating there has been (and we hope, still is) active work ongoing. --Tagishsimon (talk) 11:13, 13 February 2022 (UTC)[reply]

Documentation for wikibase:identifiers

A new predicate, wikibase:identifiers, was created in 2017 (according to phab:T144476), but it was never documented in mw:Wikibase/Indexing/RDF Dump Format. I have now added a description with the edit mw:Special:Diff/5073859. Please check if my description is correct. --Dipsacus fullonum (talk) 09:09, 17 February 2022 (UTC)

Wikidata Query Service erroneously formats/fills partial dates into full dates

When querying some places whose population has a partial date as its point in time, the results display a full date instead of the partial date that was entered. For example, Ruinen (Q1007156) has a population of 1,624 with point in time 1830 (census). But when running the query, it displays 1 January 1830 instead of just 1830. How can this be fixed?

SELECT ?place ?placeLabel ?populationLabel ?populationDate WHERE {
  ?place wdt:P131 wd:Q835108;
    p:P1082 ?place_statement.
  ?place_statement ps:P1082 ?population;
    pq:P585 ?populationDate.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY (?place) (?populationDate)

Sanglahi86 (talk) 16:19, 23 February 2022 (UTC)[reply]

@Sanglahi86: Mainly by interrogating WDQS for the date precision associated with the date - iirc 9 is year, 10 is month and 11 is day, and then utilising that to do something like BIND(YEAR(?date) as ?year)
SELECT ?place ?placeLabel ?populationLabel ?populationDate ?precision ?date WHERE {
  ?place wdt:P131 wd:Q835108;
    p:P1082 ?place_statement.
  ?place_statement ps:P1082 ?population;
    pqv:P585 [
                wikibase:timePrecision ?precision;
                wikibase:timeValue ?populationDate ].
  BIND(IF(?precision=9,YEAR(?populationDate),"") as ?date)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY (?place) (?populationDate)
--Tagishsimon (talk) 16:47, 23 February 2022 (UTC)[reply]
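The same precision-based approach can also be applied client-side, after fetching ?populationDate and ?precision with the query above. A minimal sketch, assuming the precision codes mentioned in the reply (9 = year, 10 = month, 11 = day) and time values shaped like `1830-01-01T00:00:00Z`; `format_wikidata_time` is a hypothetical helper name, and any other precision is returned unchanged.

```python
def format_wikidata_time(value: str, precision: int) -> str:
    """Trim a WDQS time value to the precision that was actually entered.

    Assumes precision codes 9 (year), 10 (month) and 11 (day), as recalled
    in the reply above; other precisions return the value untouched.
    """
    # Keep a leading minus sign (BCE years) out of the date split below.
    sign = "-" if value.startswith("-") else ""
    body = value.lstrip("+-")
    date_part = body.split("T", 1)[0]
    year, month, day = date_part.split("-")
    if precision == 9:
        return f"{sign}{year}"
    if precision == 10:
        return f"{sign}{year}-{month}"
    if precision == 11:
        return f"{sign}{year}-{month}-{day}"
    return value


if __name__ == "__main__":
    print(format_wikidata_time("1830-01-01T00:00:00Z", 9))
    print(format_wikidata_time("1947-08-01T00:00:00Z", 10))
    print(format_wikidata_time("2011-05-09T00:00:00Z", 11))
```

Splitting the string instead of parsing it as a date avoids problems with years that standard date libraries reject; the example values other than the 1830 census year are made up for illustration.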

Query Service has occasional issues (20:04, 6 March 2022 (UTC))

I keep getting servers that lag, but Grafana doesn't show it. Can you disconnect the borked server?

@DCausse (WMF): --- Jura 20:04, 6 March 2022 (UTC)[reply]

Grafana showed it. https://twitter.com/Tagishsimon/status/1500534060095000580 ... wdqs1006 is down to ~17 hours right now, from greater than 24 hours, in the last four hours. Should sort itself out in the next 8 hours. wdqs1007 is the other problem child, at ~10 hours. --Tagishsimon (talk) 20:10, 6 March 2022 (UTC)[reply]
Seems I looked at the wrong chart (or the right chart the wrong way). Anyways, it's (now) visible on the one linked on the top of this page. --- Jura 20:32, 6 March 2022 (UTC)[reply]
Should be healthy again. https://grafana.wikimedia.org/d/TUJ0V-0Zk/wikidata-alerts?orgId=1&from=now-6h&to=now&refresh=1m is also a useful view. Sjoerd de Bruin (talk) 21:29, 6 March 2022 (UTC)[reply]
Too many WDQS machines went down during the weekend, causing an outage. Having some machines killed by heavy queries/load is something we expect, but sadly they sometimes go unnoticed for too long and expose these lag issues to users when the machines get back online. Our current plan to mitigate this is to avoid having machines down for too long, so that they no longer expose such high lag. DCausse (WMF) (talk) 12:06, 8 March 2022 (UTC)

wikibase:isSomeValue SPARQL function no longer working?

Is it just me or is the wikibase:isSomeValue SPARQL function no longer working recently? If I run the query below, I get records where the ?coord is the .well-known IRI when I specifically wanted to filter those records out. —seav (talk) 01:07, 8 March 2022 (UTC)[reply]

SELECT ?marker ?coord WHERE {
  ?marker wdt:P31 wd:Q21562164 ;
          p:P625 ?coordStatement .
  ?coordStatement ps:P625 ?coord .
  FILTER NOT EXISTS { ?coordStatement pq:P582 ?endTime }
  FILTER (!wikibase:isSomeValue(?coord)) .
}
Thanks for the report, it does seem that the server configuration was mistakenly changed recently. This should be resolved soon (please see attached phabricator ticket). DCausse (WMF) (talk) 11:48, 8 March 2022 (UTC)
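While the function is unavailable, unknown values can be filtered out of downloaded results on the client side. A minimal sketch, assuming that unknown values appear as IRIs containing `/.well-known/genid/` (the report above only says "the .well-known IRI", so the exact path is an assumption); `is_unknown_value` and `drop_unknown_values` are hypothetical helper names.

```python
from typing import Dict, Iterable, List

# Assumed marker for "unknown value" IRIs in WDQS results.
UNKNOWN_VALUE_MARKER = "/.well-known/genid/"


def is_unknown_value(value: str) -> bool:
    """Return True if the value looks like an unknown-value IRI."""
    return UNKNOWN_VALUE_MARKER in value


def drop_unknown_values(
    rows: Iterable[Dict[str, str]], key: str
) -> List[Dict[str, str]]:
    """Keep only rows whose `key` column is not an unknown-value IRI."""
    return [row for row in rows if not is_unknown_value(row[key])]


if __name__ == "__main__":
    # Made-up sample rows, shaped like the ?marker / ?coord columns above.
    sample = [
        {"marker": "Q1", "coord": "Point(121.0 14.6)"},
        {"marker": "Q2", "coord": "http://www.wikidata.org/.well-known/genid/abc123"},
    ]
    for row in drop_unknown_values(sample, "coord"):
        print(row["marker"], row["coord"])
```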

Label service seems to have problems with 'values' statements.

The testcase results in 36 results instead of the expected 10.

SELECT * WHERE {
  VALUES ?val { 1 1 1 1 2  2 9 9 9 9 }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}

Don't know if it is a bug, but it looks kinda bug-like, ran under the sofa when I tried to whack it with a newspaper. Infrastruktur (talk) 06:11, 8 May 2022 (UTC)[reply]

Another testcase. Expected rows 2, returned 4. Infrastruktur (talk) 05:40, 9 May 2022 (UTC)[reply]

SELECT * WHERE {
  VALUES ?item { wd:Q42 wd:Q42 }
  ?item wdt:P569 ?dob.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
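One arithmetic observation about the two test cases: both reported row counts equal the sum of the squared multiplicities of the input values (4² + 2² + 4² = 36, and 2² = 4). That is what one would see if every input row were joined against every row holding the same value. The sketch below only reproduces that arithmetic; it is not a confirmed diagnosis of the label service, and `rows_if_self_joined` is a hypothetical helper name.

```python
from collections import Counter
from typing import Hashable, Iterable


def rows_if_self_joined(values: Iterable[Hashable]) -> int:
    """Row count if each row were paired with every row sharing its value.

    A value that occurs n times contributes n * n rows.
    """
    return sum(count * count for count in Counter(values).values())


if __name__ == "__main__":
    first_case = [1, 1, 1, 1, 2, 2, 9, 9, 9, 9]
    second_case = ["wd:Q42", "wd:Q42"]
    print(rows_if_self_joined(first_case), "rows, expected", len(first_case))
    print(rows_if_self_joined(second_case), "rows, expected", len(second_case))
```

If this reading is right, deduplicating the VALUES list before adding the label service should bring the row count back in line, at the cost of losing intentional duplicates.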