Wikidata:Request a query/Archive/2023/09


USA box

Hello, what's wrong with this query? I wanted to list things that are located inside the continental USA (going by their coordinates) and that either have no P17 at all (missing data) or have a P17 other than the USA (wrong data somehow). Thanks for the help!

#defaultView:Map
SELECT ?item ?itemLabel ?coords WITH {   SELECT *   WHERE {
    VALUES ?country { wd:Q30 } # change your country here, check that the bounding box only covers its mainland
    ?country p:P1332 [ a wikibase:BestRank; psv:P1332 [ wikibase:geoLatitude ?nmp_lat ] ].
    ?country p:P1333 [ a wikibase:BestRank; psv:P1333 [ wikibase:geoLatitude ?smp_lat ] ].
    ?country p:P1334 [ a wikibase:BestRank; psv:P1334 [ wikibase:geoLongitude ?emp_long ] ].
    ?country p:P1335 [ a wikibase:BestRank; psv:P1335 [ wikibase:geoLongitude ?wmp_long ] ].
  } } AS %a  WITH {   SELECT distinct ?item ?coords
  WHERE { 
    INCLUDE %a 
    ?item wdt:Q30 ?country;
      wdt:P31/wdt:P279* wd:Q811979#architectural thing
            ;wdt:P625 ?coords;#its coordinates
      p:P625 [ a wikibase:BestRank; psv:P625 ?coord_vn ] .
    ?coord_vn wikibase:geoLatitude ?lat. hint:Prior hint:rangeSafe true.
    ?coord_vn wikibase:geoLongitude ?long. hint:Prior hint:rangeSafe true.
    FILTER(?lat >  ?nmp_lat||  ?lat < ?smp_lat || ?long > ?emp_long ||wdt:P3999) ?dispar. }# we don't want the ones that have disappeared
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .} }

Bouzinac💬✒️💛 18:53, 1 September 2023 (UTC)

Bouzinac: Please take a look at https://w.wiki/7P9T. With its deficiencies fixed, it works for smaller countries, but not for a country as large as the United States of America. You will need to reduce the complexity of the SPARQL query. --Csisc (talk) 14:59, 2 September 2023 (UTC)
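
For anyone following up: two things stand out in the query as posted. `wdt:Q30` is an item ID used where a property predicate is expected, and the final FILTER is not well-formed. Below is a minimal sketch of the stated goal, not a fix of the original. It assumes the `wikibase:box` service with a hand-picked, hypothetical test box (roughly Kansas) instead of the country's extreme-point statements, and it has not been run against the live endpoint. `FILTER NOT EXISTS` covers both cases asked for, since an item with no P17 and an item whose P17 is another country both lack a `P17 = Q30` statement.

```sparql
#defaultView:Map
SELECT ?item ?itemLabel ?coords WHERE {
  # Hypothetical small test box (roughly Kansas); WKT points are "Point(longitude latitude)"
  SERVICE wikibase:box {
    ?item wdt:P625 ?coords .
    bd:serviceParam wikibase:cornerSouthWest "Point(-102.0 37.0)"^^geo:wktLiteral .
    bd:serviceParam wikibase:cornerNorthEast "Point(-94.6 40.0)"^^geo:wktLiteral .
  }
  ?item wdt:P31/wdt:P279* wd:Q811979 .           # same class as in the question
  FILTER NOT EXISTS { ?item wdt:P17 wd:Q30 . }   # no P17 at all, or P17 other than the USA
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```

Starting with a small box and widening it is one way to follow the advice above about reducing complexity. A box covering the whole contiguous USA would also catch parts of Canada and Mexico and may well time out.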

MINUS oddity

Here's an odd one picked up by Vincent Tep; the way in which a MINUS clause is constructed affects the results set. Why?

In the three queries linked below, the minus statements are constructed as follows

# 3 individual minus statements
MINUS {?item wdt:P131* wd:Q217411 . } #Beyoglu
MINUS {?item wdt:P131* wd:Q673073 . } #Eyüpsultan
MINUS {?item wdt:P131* wd:Q326339 . } #Üsküdar

#single minus statement not using VALUES 
MINUS {?item wdt:P131* (wd:Q217411 wd:Q673073 wd:Q326339) }

#single minus statement using VALUES 
 VALUES ?minus {
    wd:Q217411 #Beyoglu
    wd:Q673073 #Eyüpsultan
    wd:Q326339 #Üsküdar
    }
  MINUS {?item wdt:P131* ?minus }

The links show a query using DISTINCT; Blazegraph's explain for that query; and then the same query not using DISTINCT

Part of the issue revolves around e.g. Hoşyar Kadın Fountain (Q107008652), which has three P131 statements, one of which, "Istanbul", is not one of the values acted on by the MINUS statement; that points to a cause of the difference between the 46-row and the 64-row result sets. And clearly the Optimized ASTs for the three queries are fairly different, but it's not very clear what's going on, especially in the two 64-row results.

Any further insights welcome.

Finally, things do not improve much if FILTER NOT EXISTS is used instead of MINUS ... now one of the queries times out, the other two get 46 and 64 rows as before. --Tagishsimon (talk) 07:26, 2 September 2023 (UTC)

MINUS opens a separate scope, so it behaves differently from FILTER (NOT) EXISTS and also has different performance tradeoffs as a result. In the first snippet, you remove three different sets from the original set. In the second snippet you're actually using an RDF collection; I doubt it is valid to use one in a BGP like that. In the third snippet the VALUES keyword is outside the scope of the MINUS keyword, so the variable is not bound inside it. Just put the VALUES block inside the MINUS and it should work as expected. --Infrastruktur (talk) 09:41, 2 September 2023 (UTC)
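
The last suggestion can be sketched as follows. The thread only links to the full queries, so the outer pattern here (class and area) is a hypothetical placeholder; only the three district QIDs come from the snippets above.

```sparql
SELECT DISTINCT ?item WHERE {
  # Hypothetical outer pattern: the original query is linked, not quoted
  ?item wdt:P31 wd:Q483453 ;       # assumed class (fountain)
        wdt:P131* wd:Q406 .        # assumed area (Istanbul)
  MINUS {
    # VALUES is inside MINUS, so ?minus is bound in the same scope as the path below
    VALUES ?minus {
      wd:Q217411   # Beyoglu
      wd:Q673073   # Eyüpsultan
      wd:Q326339   # Üsküdar
    }
    ?item wdt:P131* ?minus .
  }
}
```

With this form the MINUS block shares only `?item` with the outer pattern, which is the same situation as in the three separate MINUS statements.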

average amount of values of a property

For instance, some items have 10 values for Fandom article ID (P6262); most items probably have one. But how many values does Fandom article ID (P6262) have on average? novalue and somevalue statements should not be counted. Thanks in advance 😃 – Shisma (talk) 09:34, 26 August 2023 (UTC)

Looks like you don't get novalues unless you specifically ask for them, so it's sufficient to filter away somevalues. Here is the average: https://w.wiki/7LGS, and the distribution: https://w.wiki/7LGc. Infrastruktur (talk) 20:23, 26 August 2023 (UTC)
@Infrastruktur thank you. that looks great! – Shisma (talk) 19:12, 4 September 2023 (UTC)
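
The linked queries are not quoted in the thread. A minimal sketch of how such an average could be computed, assuming the `wikibase:isSomeValue` function to drop "unknown value" statements and counting best-rank (`wdt:`) values only:

```sparql
SELECT (AVG(?n) AS ?average) WHERE {
  {
    # Inner query: number of P6262 values per item
    SELECT ?item (COUNT(?value) AS ?n) WHERE {
      ?item wdt:P6262 ?value .
      FILTER(!wikibase:isSomeValue(?value))   # drop "unknown value" statements
    }
    GROUP BY ?item
  }
}
```

Items with no P6262 value at all do not appear in the inner query, so the average is taken over items that use the property at least once.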

If Possible show Multiple places in Map View

Hi, I want to view items such as church building (Q16970), mosque (Q32815) and Hindu temple (Q842402) together in Map view. Sriveenkat (talk) 14:32, 29 August 2023 (UTC)

@Sriveenkat: My solution:
SELECT ?item ?itemLabel ?layer ?layerLabel ?coord WHERE {
  VALUES ?layer {wd:Q16970 wd:Q32815 wd:Q842402}
  ?item wdt:P17 wd:Q219. # Bulgaria. Change country as needed.
  ?item wdt:P31 ?layer.
  ?item wdt:P625 ?coord.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
#defaultView:Map
I had to restrict the query to just one country because there are over 200,000 church building (Q16970), mosque (Q32815) and Hindu temple (Q842402) items in the world, and that is too many to get labels for without timing out.
Please note that items in subclasses of church building (Q16970), mosque (Q32815) and Hindu temple (Q842402) (for example, hermitages, basilicas or cathedrals) aren't shown by this query.
Does this query work for you?--Pere prlpz (talk) 20:50, 31 August 2023 (UTC)
@Pere prlpz Thank you, it works well 💯. Is it possible to define the colours? Thanks for your support Sriveenkat (talk) 01:14, 1 September 2023 (UTC)
@Sriveenkat: As far as I know you can use any variable to define colours (you just need to name that variable "layer"), but you can't pick which colours are used. In my experience, the palette is selected at random every time the query is executed, so you can't even get a consistent palette for a given query (e.g. the same colours won't always carry the same meaning).
If you want more control over the results you may want to try mw:Help:Extension:Kartographer/Getting started, which I think now can accept a query as external data, but I've never used it and the query thing is a very new feature that might not be well documented.--Pere prlpz (talk) 09:10, 1 September 2023 (UTC)
You can set the colour using the ?rgb variable and specifying the colour using hex codes; add ?rgb to the SELECT statement and define the colours in the VALUES statement like so:
SELECT ?item ?itemLabel ?layer ?layerLabel ?coord ?rgb WHERE {
VALUES (?layer ?rgb) { (wd:Q16970 "ff0000") (wd:Q32815 "00ff00") (wd:Q842402 "0000ff") }
Piecesofuk (talk) 15:19, 1 September 2023 (UTC)
@Piecesofuk Thanks Sriveenkat (talk) 00:16, 16 September 2023 (UTC)
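
Putting the two replies together, the complete query might look like the sketch below. It simply merges the `?rgb` fragment into the earlier Bulgaria query; the colour codes are the ones given above.

```sparql
#defaultView:Map
SELECT ?item ?itemLabel ?layer ?layerLabel ?coord ?rgb WHERE {
  VALUES (?layer ?rgb) {
    (wd:Q16970  "ff0000")   # church building
    (wd:Q32815  "00ff00")   # mosque
    (wd:Q842402 "0000ff")   # Hindu temple
  }
  ?item wdt:P17 wd:Q219 ;   # Bulgaria, as in the earlier query; change as needed
        wdt:P31 ?layer ;
        wdt:P625 ?coord .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```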

What is the simplest way of finding all items with a particular property (with any value except novalue) that also have a sitelink to the English language Wikipedia. Thanks. — Martin (MSGJ · talk) 15:22, 11 September 2023 (UTC)

@MSGJ: Seems to be as below. I use P718 as the example since I know there are <novalue> statements for this ... by & large I've convinced myself that items with (only) <novalue> P718 statements are not being picked up.
SELECT distinct ?item 
WHERE 
{
  ?item wdt:P718 [].                            # The property of interest
  ?article schema:about ?item ;                 # has a sitelink
  schema:isPartOf <https://en.wikipedia.org/> . # sitelink is to EN wikipedia
}
--Tagishsimon (talk) 10:07, 12 September 2023 (UTC)
Thank you — Martin (MSGJ · talk) 13:29, 13 September 2023 (UTC)
When running with a well-used property like Library of Congress authority ID (P244) the query is timing out. Is there anything that can be done to prevent this?
SELECT distinct ?item 
WHERE 
{
  ?item wdt:P244 [].                            # The property of interest
  ?article schema:about ?item ;                 # has a sitelink
  schema:isPartOf <https://en.wikipedia.org/> . # sitelink is to EN wikipedia
}
— Martin (MSGJ · talk) 18:59, 13 September 2023 (UTC)
@MSGJ: My inclination would be to slice the P244 dataset and look at batches of ~200k or somesuch amount. Increment the offset to get the next batch. There are 1.5 million uses of P244.
SELECT distinct ?item 
WHERE 
{
  SERVICE bd:slice {
    ?item wdt:P244 [] .                         # The property of interest
    bd:serviceParam bd:slice.offset 0 .         # Start at item number (not to be confused with QID)
    bd:serviceParam bd:slice.limit 200000 .     # List this many items
  }
  ?article schema:about ?item ;                 # has a sitelink
  schema:isPartOf <https://en.wikipedia.org/> . # sitelink is to EN wikipedia
}
--Tagishsimon (talk) 17:38, 15 September 2023 (UTC)

Hi! I need your help. This query returns a link to a Tabakalera URL. How could I replace the URL with a linked term? In HTML it would be something like <a href="document.html">Term</a>. Thank you

#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?TabakaleraLabel ?TabakaleraURL ?lugar ?coord
 
WHERE
{
    ?item wdt:P31 wd:Q5.
    ?item wdt:P1344 wd:Q9081343.
    ?item wdt:P10069  ?Tabakalera.
      ?item wdt:P19 ?lugar. 
      ?lugar wdt:P625 ?coord.

  
 OPTIONAL{?item wdt:P569 ?birthdate .} # P569: date of birth
   BIND(year(?birthdate) as ?year)
  FILTER(?year > 1800)
  wd:P10069 wdt:P1630 ?formatterurl .
 BIND(IRI(REPLACE(?Tabakalera, '^(.+)$', ?formatterurl)) AS ?TabakaleraURL).


        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } 
}

Lmerice (talk) 19:57, 15 September 2023 (UTC)

digital preservation

Maggie Masenya Magretiah (talk) 10:13, 4 September 2023 (UTC)

@Magretiah: What's the question?--Pere prlpz (talk) 16:00, 16 September 2023 (UTC)

Items sorted by number of edits/revisions

Hey!

So far, if I wanted to sort English Wikipedia articles by number of edits/revisions, I have had to look at either their page information or XTools, or call the API with something like https://en.wikipedia.org/w/rest.php/v1/page/foo/history/counts/edits. In all of these cases I have had to look at the stats separately for each article because none of them seems to provide a comparison function.

I would love to have a single SPARQL query that lets me rank a few given enwiki articles (which I can obviously provide as Wikidata objects) by number of edits, which I assume to be a good metric for public interest in the concept behind the article. I myself don't understand the mwapi service and its many functions well enough to come up with a solution myself.

I know ranking articles by number of edits didn't use to be possible but I thought it couldn't hurt to ask whether it is now.

Best, Weasel (talk) 22:29, 7 September 2023 (UTC)

@Weasel: I don't know whether a query could access that information (I don't think so), and I don't think Wikidata can help you here, because the information you need is all in Wikipedia, not Wikidata.
In fact English Wikipedia has a special page with what you want: en:Special:MostRevisions. Wikipedia:Statistics has links to related pages.--Pere prlpz (talk) 15:50, 16 September 2023 (UTC)

Create a map pinpointing abandoned buildings (Q106765618) in São Paulo city (Q174)

I tried creating this query a few weeks ago but it has been a while since I created my own queries on Wikidata. Would love some help, thanks! Tet (talk) 18:04, 12 September 2023 (UTC)

I am not entirely sure if former building or structure (Q96084375) is better for search queries! Tet (talk) 18:10, 12 September 2023 (UTC)

@Tet: Here are both, in separate queries but worldwide:

#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?coord
 
WHERE
{
    ?item wdt:P31 wd:Q106765618.
      ?item wdt:P625 ?coord.
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],pt,en". } 
}
#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?coord
 
WHERE
{
    ?item wdt:P31 wd:Q96084375.
      ?item wdt:P625 ?coord.
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],pt,en". } 
}

You can see that there aren't any abandoned building or structure (Q106765618) or former building or structure (Q96084375) in Sao Paulo, which can mean that:

  • All buildings in Sao Paulo are in perfect working order and none of them is abandoned.
  • Or that nobody has updated the abandoned state of the buildings.
  • Or that the status has been recorded using other properties like state of conservation (P5816).
  • Or that abandoned buildings in Sao Paulo lack coordinates.

I suggest first doing some research by looking at items for buildings in Sao Paulo that you know are abandoned.--Pere prlpz (talk) 15:37, 16 September 2023 (UTC)
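
For when such items do exist, the two worldwide queries above could be narrowed to the city roughly as sketched here. Restricting by located in the administrative territorial entity (P131) is an assumption about how the buildings are modelled, and given the findings above the query may currently return nothing.

```sparql
#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?class ?classLabel ?coord WHERE {
  VALUES ?class { wd:Q106765618 wd:Q96084375 }   # the two classes discussed above
  ?item wdt:P31 ?class ;
        wdt:P131* wd:Q174 ;      # assumed: located, directly or transitively, in São Paulo
        wdt:P625 ?coord .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],pt,en" . }
}
```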

Looking for candidate duplicate items

So, for example, Pak Dong-uk (Q79404675) has Olympedia people ID (P8286) = 56280, and links to one Wiki article (en.wp). It seems reasonable that any other Wikidata items with Olympedia people ID (P8286) = 56280 would be candidates for merging with Pak Dong-uk (Q79404675). So a warm-up query here would be to return Wikidata items with Olympedia people ID (P8286) = 56280 and with at least one Wikimedia link, so long as there are at least two such items. If there's only one (i.e. Pak Dong-uk (Q79404675) itself), return nothing.

More generally, Olympedia people ID (P8286) has property constraint (P2302): distinct-values constraint (Q21502410), and it is not the only property that signifies a unique ID. It seems reasonable that all Wikidata items sharing the same value for a property with a distinct-values constraint (Q21502410) would be candidates for merging.

The full query I am thinking of is to return all sets of multiple (>1) Wikidata Q-items, where each item has the same value for at least one P-property that itself has a distinct values constraint. I.e. all sets of multiple Q-items for athletes with the same Olympic ID, of all films with the same IMDb ID (P345), etc.

And finally, all the stuff above except for exceptions (exception to constraint (P2303)) to the P-value in question's distinct values constraint. E.g. see property constraints of IMDb ID (P345), which says among other things that Shiva (Q7499225) and Siva (Q7532265) are allowed to have the same IMDb ID (P345). So those two should be excluded from the results. It Is Me Here (talk) 16:11, 15 September 2023 (UTC)

@It Is Me Here: Property talk:P8286 contains a template for each constraint, with a link to a constraint violation report and to a couple of SPARQL queries for that constraint. I think those are the queries you need.
Btw., at the moment the only violation of the unique value constraint involves the article and the category for the same athlete, so we don't seem likely to find duplicates using this property, although it may work with other properties.--Pere prlpz (talk) 15:26, 16 September 2023 (UTC)
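
The warm-up query from the question can be sketched as a self-join on the identifier value. This is an illustration rather than the query linked from Property talk:P8286; the `LIMIT` and the variable names are arbitrary choices, and the sitelink condition and the exception to constraint (P2303) handling from the question are left out.

```sparql
SELECT DISTINCT ?item1 ?item2 ?value WHERE {
  ?item1 wdt:P8286 ?value .
  ?item2 wdt:P8286 ?value .
  FILTER(STR(?item1) < STR(?item2))   # report each pair once and skip self-pairs
}
LIMIT 100
```

Swapping `wdt:P8286` for another property with a distinct-values constraint gives the same check for that property, though heavily used properties may need batching to avoid timeouts.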

P276 = only countries

Hello, I'd like a list of items having P276 = a country (not a town, a precise place, etc) and having no P17. Thanks Bouzinac💬✒️💛 05:20, 6 September 2023 (UTC)

@Bouzinac: If I'm not missing anything, it seems that there are no items with location (P276) pointing to a sovereign state (Q3624078) or subclass of it:

SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel
 
WHERE
{
  ?item wdt:P276 ?country.
  ?country wdt:P31/wdt:P279? wd:Q3624078.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,ca". } 
}

--Pere prlpz (talk) 15:58, 16 September 2023 (UTC)

@Bouzinac: Sorry, my query had a fatal typo, now fixed, that yielded no results. Now it finds over 21000 results. I had to restrict the subclass tree to avoid timing out, although other optimisations should be possible.--Pere prlpz (talk) 19:26, 17 September 2023 (UTC)

Hello, how about this (not very good) query https://w.wiki/7UDs . I would like this query to cover more countries and to be restricted to physical things (house, road, etc.) rather than concepts (war, idea, etc.) Bouzinac💬✒️💛 16:27, 16 September 2023 (UTC)

After having worked with lists of placenames in Wikipedia, I can tell you that restricting to "any physical stuff" is very difficult, because the class trees for most physical things are too large and often mixed with things that aren't physical.
A possible workaround is to select items with coordinates, and/or filter out (with MINUS) the items you don't want.--Pere prlpz (talk) 19:30, 17 September 2023 (UTC)
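
A minimal sketch combining the original request (no P17) with the coordinate workaround suggested above. It is not the query behind the w.wiki link; restricting `?country` to direct instances of sovereign state (Q3624078) and adding `LIMIT 1000` are assumptions made here to keep it cheap.

```sparql
SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel WHERE {
  ?item wdt:P276 ?country .
  ?country wdt:P31 wd:Q3624078 .             # direct instances of sovereign state only
  ?item wdt:P625 [] .                        # workaround: keep only items that have coordinates
  FILTER NOT EXISTS { ?item wdt:P17 [] . }   # no country (P17) statement
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
LIMIT 1000
```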

list of all rappers

I'm trying to do a basic query that returns all entities where occupation is rapper (wdt:P106 wd:Q2252262). I get back almost 8000 results but I noticed that one entry (for example) -- Q-Tip (Q42025) -- is missing, despite the fact that rapper / Q2252262 is indeed one of the listed occupations on his wikidata page. Query is as follows:

SELECT DISTINCT ?rapper ?rapperLabel WHERE {
  {
    ?rapper wdt:P106 wd:Q2252262;
      rdfs:label ?rapperLabel.
    FILTER((LANG(?rapperLabel)) = "en")
  }
}
ORDER BY (?rapperLabel)

Not sure why his page wouldn't be included in the results, or how to modify the query to ensure that it is. Any help would be greatly appreciated. Slieb17 (talk) 21:36, 18 September 2023 (UTC)

@Slieb17: As far as I know, prefix wdt means "truthy" statements, that is the values of maximum rank for the property. If there is no value of preferred rank, all normal rank values are included, but if there is at least one value with preferred rank only preferred values are returned. The preferred occupation for Q-Tip (Q42025) is singer (Q177220), therefore this value is used, not rapper (Q2252262), and Q-Tip is not included in your query.
I don't know if singer should be Q-Tip's preferred occupation or if this is an error to be fixed in his item.
A query including all values (or all non deprecated values) is possible but a bit more complex, or at least less usual.--Pere prlpz (talk) 22:24, 18 September 2023 (UTC)

@Slieb17:: After a bit of research I think I got it:

SELECT DISTINCT ?rapper ?rapperLabel WHERE {

    ?rapper p:P106 ?statement. 
    ?statement ps:P106 wd:Q2252262.

SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}

ORDER BY (?rapperLabel)

Now Q-Tip and others are included regardless of rank.--Pere prlpz (talk) 22:47, 18 September 2023 (UTC)
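
The "all non deprecated values" variant mentioned earlier can be sketched by also reading each statement's rank. This is an illustrative extension of the query above, not something posted in the thread.

```sparql
SELECT DISTINCT ?rapper ?rapperLabel WHERE {
  ?rapper p:P106 ?statement .
  ?statement ps:P106 wd:Q2252262 ;
             wikibase:rank ?rank .
  FILTER(?rank != wikibase:DeprecatedRank)   # keep normal and preferred statements
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?rapperLabel
```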

Fantastic - can't thank you enough, for both the explanation AND the working solution. Truly appreciated. Slieb17 (talk) 23:34, 18 September 2023 (UTC)

Missing Tamil descriptions for articles available in the Tamil Wikipedia

I want to add the missing Tamil descriptions for American actresses who have an article in the Tamil Wikipedia. Thank you Sriveenkat (talk) 07:25, 11 September 2023 (UTC)

Do you need a query of all American actresses with Tamil Wikipedia article without Tamil language description to find those articles and write descriptions by hand? Or do you want to learn how to automatically put the same Tamil description (e.g. "American actress" in Tamil) in all those items?--Pere prlpz (talk) 15:41, 16 September 2023 (UTC)
@Pere prlpz I want to automatically add the same Tamil description. Example: "American actress", translated as "அமெரிக்க நடிகை". Thanks. Sriveenkat (talk) 03:44, 17 September 2023 (UTC)
@Sriveenkat: PetScan (with SPARQL) can add properties but I think it can't add descriptions. Therefore I suggest:
  • Using a query to get all American actresses, with their Tamil description if it exists, and export results in a csv file (or any other format).
  • Import the data into a spreadsheet (like Excel or LibreOffice Calc), filter out actresses that already have a Tamil description and create QuickStatements instructions to add the Tamil description to the remaining ones. I would do this part using R or Python instead of Excel because I'm used to them and have scripts for similar jobs, but a spreadsheet can get the work done quite easily.
  • Paste the instructions to QuickStatements and run them.
I wouldn't restrict that to only actresses with article in Tamil Wikipedia because it's not a lot of additional work to upload some thousands of descriptions instead of some hundreds.
Just in case it may be useful, I leave here a query to get American actresses and their Tamil description.--Pere prlpz (talk) 17:58, 17 September 2023 (UTC)
@Pere prlpz Thanks. I am very much obliged to you. Thanks again Sriveenkat (talk) 18:42, 17 September 2023 (UTC)
@Pere prlpz Thank you so much. Adding Tamil descriptions for American actresses is now ✓ Done. Would it be possible to create a tutorial video or article about adding labels and descriptions using spreadsheets and QuickStatements, following your workflow? That would be really useful for beginners. Thanks again Sriveenkat (talk) 03:45, 19 September 2023 (UTC)
@Sriveenkat: I think I've never done that with Excel because I usually use R or sometimes Python for tasks like this one. If Excel worked for you, you can do the tutorial. However, there are already pages and videos on how to use QuickStatements with Excel or Google Spreadsheets. Googling for it yields a handful of interesting resources.--Pere prlpz (talk) 08:36, 19 September 2023 (UTC)
SELECT ?item ?nameen ?nameta ?descr
WHERE {
  ?item wdt:P31 wd:Q5.
  ?item wdt:P27 wd:Q30.
  ?item wdt:P106/wdt:P279* wd:Q33999.
  ?item wdt:P21 wd:Q6581072.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
  ?item rdfs:label ?nameen.
}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "ta" .
  ?item rdfs:label ?nameta.
  ?item schema:description ?descr.
}
}
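
If only the actresses from the original request are wanted (those with a Tamil Wikipedia article and no Tamil description yet), the filtering step can also be pushed into the query. A minimal sketch, assuming a direct occupation match on actor (Q33999) to keep it light:

```sparql
SELECT ?item ?itemLabel ?article WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P27 wd:Q30 ;
        wdt:P106 wd:Q33999 ;        # assumed: direct occupation match only
        wdt:P21 wd:Q6581072 .
  ?article schema:about ?item ;
           schema:isPartOf <https://ta.wikipedia.org/> .   # has a Tamil Wikipedia article
  FILTER NOT EXISTS {
    ?item schema:description ?desc .
    FILTER(LANG(?desc) = "ta")      # no Tamil description yet
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "ta,en" . }
}
```

The resulting item list can then go straight into the spreadsheet step without further filtering.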

Get all parent classes from a list of classes

I currently have a query that gets all classes an item has:

SELECT DISTINCT ?class ?classLabel WHERE {
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
	wd:${instance} wdt:P31/wdt:P279* ?class.
}

I'd like to skip the instance of (P31) part and query a list of classes instead. For example, I want to get all parent classes of television character (Q15773317), comics character (Q1114461) and literary character (Q3658341) in a single query. How do I do that? – Shisma (talk) 11:30, 16 September 2023 (UTC)

And all classes should be in the ?class variable. The classes requested as well as those in the query. –Shisma (talk) 11:34, 16 September 2023 (UTC)

SELECT DISTINCT ?class ?classLabel WHERE {
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
	{ wd:Q3658341 wdt:P279* ?class. }
    UNION 
    { BIND (wd:Q3658341 as ?class) }
    UNION
	{ wd:Q15773317 wdt:P279* ?class. }
    UNION 
    { BIND (wd:Q15773317 as ?class) }
    UNION
	{ wd:Q3658341 wdt:P279* ?class. }
    UNION 
    { BIND (wd:Q3658341 as ?class) }
}

Is this it, or is there a better way to do this –Shisma (talk) 11:38, 16 September 2023 (UTC)

You just need to use VALUES instead of repeating the same query for each value:
SELECT DISTINCT ?class ?classLabel WHERE {
	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  VALUES ?targetClass {wd:Q3658341 wd:Q15773317 wd:Q365834}
	?targetClass wdt:P279* ?class. 
}
Does that work for you?--Pere prlpz (talk) 15:10, 16 September 2023 (UTC)
unfortunately, ?class should also include literary character (Q3658341), television character (Q15773317) and Q365834.– Shisma (talk) 15:36, 16 September 2023 (UTC)
@Shisma: You just need to add as many classes as you want in the "VALUES ?targetClass {wd:Q3658341 wd:Q15773317 wd:Q365834}" line. Now it works with three classes but it would work fine with a handful more.--Pere prlpz (talk) 08:46, 19 September 2023 (UTC)
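
On the point about the requested classes themselves appearing in ?class: in SPARQL 1.1 the `*` path operator also matches a path of zero steps, so each `?targetClass` should already come back as a `?class`. A minimal sketch using the three classes named in the original question (the QIDs are taken from the question, not from the replies):

```sparql
SELECT DISTINCT ?class ?classLabel WHERE {
  VALUES ?targetClass {
    wd:Q15773317   # television character
    wd:Q1114461    # comics character
    wd:Q3658341    # literary character
  }
  ?targetClass wdt:P279* ?class .   # * allows zero steps, so each ?targetClass is also a ?class
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```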

sets of mythological Greek characters

Hi all, I want to clean up our items which are supposed to be instance of (P31) set of mythological Greek characters (Q26214208).

There are a lot of items not properly classified. We can find them with a query which lists all Wikidata items that

  1. do not have the statement instance of (P31) set of mythological Greek characters (Q26214208) AND
  2. have a sitelink to any page from w:en:Category:Set index articles on Greek mythology

I couldn't manage it myself using the Query Builder. Can you help?

Thanks, and have great day, Jonathan Groß (talk) 18:15, 17 September 2023 (UTC)

This is a start, but not exactly what I wanted: https://w.wiki/7Udd Jonathan Groß (talk) 18:52, 17 September 2023 (UTC)

@Jonathan Groß:: Dealing with categories with a query is possible, but I think it's easier to use PetScan, which deals primarily with categories and can combine them with queries.--Pere prlpz (talk) 19:08, 17 September 2023 (UTC)
I've now tried Petscan, but I couldn't do it. Problem is, I want to combine statements from two different wikis. Maybe I'll just go through the category one by one ... Jonathan Groß (talk) 06:10, 18 September 2023 (UTC)
@Jonathan Groß:: Your question only mentions English Wikipedia. If you mean English Wikipedia and Wikidata, PetScan can deal with that. If you mean two different Wikipedias, you can use PetScan to download data from each one and combine them using any program that can manage data. I would do it using R, but Excel or any spreadsheet can work.--Pere prlpz (talk) 08:42, 19 September 2023 (UTC)
@Pere prlpz:: I want to combine info from Wikidata and enwiki only, as stated above, but I can't figure out how to make Petscan do that. To me it seems I have to choose between headers ("Categories", "Page properties" etc.) but can only run a query from one of those headers, not from many put together with different properties. Or am I missing something? Could you perhaps show me how to do it? Jonathan Groß (talk) 09:05, 19 September 2023 (UTC)
@Jonathan Groß:: You can put conditions under all of those headers at the same time. To combine Wikipedia with Wikidata you can use simple options (like items having or not having some properties) or you can use a query in the "Other sources" tab.
For example, here are saxophonists (according to Wikidata) whose article in Catalan Wikipedia is not in category:Saxofonistes nor its subcategories combining categories in cawiki with a query.--Pere prlpz (talk) 10:11, 19 September 2023 (UTC)
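
For reference, the two conditions from the original question can also be sketched directly in SPARQL, assuming the query service's MWAPI `Generator` interface with the `categorymembers` generator. This has not been run here, it only looks at direct members of the category, and the service caps how many pages it returns, so PetScan may still be the more robust route.

```sparql
SELECT ?item ?itemLabel WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" ;
                    wikibase:api "Generator" ;
                    mwapi:generator "categorymembers" ;
                    mwapi:gcmtitle "Category:Set index articles on Greek mythology" ;
                    mwapi:gcmlimit "max" .
    ?item wikibase:apiOutputItem mwapi:item .   # Wikidata item linked to each category member
  }
  FILTER NOT EXISTS { ?item wdt:P31 wd:Q26214208 . }   # not yet classified as such a set
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```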

Displaying all the participants in a battle

Hello! I have this complex query:

#defaultView:Map{"hide":["?rgb"],"?rgb"}
SELECT 
?veteran 
?veteranLabel 
?veteran_
?layer 
?aita
?aitaLabel 
?aita_
?jaiolekua_aita_
?ama
?amaLabel
?ama_ 
?jaiolekua_ama_
?aitona
?aitonaLabel 
?aitona_ 
?jaiolekua_aitona_ 
?gatazkaLabel
?coord
?rgb 
WITH { SELECT ?veteran ?jaiolekua ?heriotza ?ama ?aita ?aitona WHERE {
?veteran wdt:P241 wd:Q11218.
?veteran wdt:P172 wd:Q126756.
?veteran wdt:P19 ?jaiolekua.
?veteran wdt:P20 ?heriotza.
?veteran wdt:P25 ?ama.
?veteran wdt:P22 ?aita.
OPTIONAL{?aita wdt:P22 ?aitona.}  
} } as %i
with { select ?veteran ?veteran_ ?layer ?ama ?jaiolekua_ama_ ?coord ?rgb WHERE {
INCLUDE %i
?ama wdt:P19 ?jaiolekua_ama.
?jaiolekua_ama wdt:P625 ?coord.
?veteran rdfs:label ?veteranL . FILTER(LANG(?veteranL)="eu")
BIND(CONCAT("Beteranoa: ",?veteranL) as ?veteran_)
?jaiolekua_ama rdfs:label ?jaiolekua_amaL . FILTER(LANG(?jaiolekua_amaL)="eu")
BIND(CONCAT("Amaren jaiolekua: ",?jaiolekua_amaL) as ?jaiolekua_ama_)
BIND("3366CC" as ?rgb)
BIND("Ama" as ?layer)
} } as %j
with { select ?veteran ?veteran_ ?layer ?jaiolekua_aita_ ?aita ?coord ?rgb WHERE {
INCLUDE %i
?aita wdt:P19 ?jaiolekua_aita.
?jaiolekua_aita wdt:P625 ?coord.
?veteran rdfs:label ?veteranL . FILTER(LANG(?veteranL)="eu")
BIND(CONCAT("Beteranoa: ",?veteranL) as ?veteran_)
?jaiolekua_aita rdfs:label ?jaiolekua_aitaL . FILTER(LANG(?jaiolekua_aitaL)="eu")
BIND(CONCAT("Aitaren jaiolekua: ",?jaiolekua_aitaL) as ?jaiolekua_aita_)
BIND("FFCC33" as ?rgb)
BIND("Aita" as ?layer)
} } as %k
with { select ?veteran ?veteran_ ?heriotza ?ama_ ?aita_ ?coord ?layer ?rgb WHERE {
INCLUDE %i
?heriotza wdt:P625 ?coord.
?veteran rdfs:label ?veteranL . FILTER(LANG(?veteranL)="eu")
BIND(CONCAT("Beteranoa: ",?veteranL) as ?veteran_)
?aita rdfs:label ?aitaL . FILTER(LANG(?aitaL)="eu")
BIND(CONCAT("Aita: ",?aitaL) as ?aita_)
?ama rdfs:label ?amaL . FILTER(LANG(?amaL)="eu")
BIND(CONCAT("Ama: ",?amaL) as ?ama_)
BIND("b32425" as ?rgb)
BIND("Heriotza lekua" as ?layer)
} } as %l
with { select ?veteran ?veteran_ ?jaiolekua ?heriotza ?ama_ ?aita_ ?coord ?layer ?rgb WHERE {  
INCLUDE %i
?jaiolekua wdt:P625 ?coord.
?veteran rdfs:label ?veteranL . FILTER(LANG(?veteranL)="eu")
BIND(CONCAT("Beteranoa: ",?veteranL) as ?veteran_)
?aita rdfs:label ?aitaL . FILTER(LANG(?aitaL)="eu")
BIND(CONCAT("Aita: ",?aitaL) as ?aita_)
?ama rdfs:label ?amaL . FILTER(LANG(?amaL)="eu")
BIND(CONCAT("Ama: ",?amaL) as ?ama_)
BIND("00af89" as ?rgb)  
BIND("Jaiolekua" as ?layer)
} } as %m
with { select ?veteran ?gatazka ?coord ?layer ?rgb WHERE {  
INCLUDE %i
?veteran wdt:P607 ?gatazka.
?gatazka wdt:P625 ?coord.
BIND("72777d" as ?rgb)
BIND("Gudua" as ?layer)
} } as %n
with { select ?veteran ?veteran_ ?layer ?jaiolekua_aitona_ ?aitona_ ?coord ?rgb WHERE {
INCLUDE %i

?veteran wdt:P22 ?aita.
?aita wdt:P22 ?aitona. 
  
?aitona wdt:P19 ?jaiolekua_aitona.
?jaiolekua_aitona wdt:P625 ?coord.
?veteran rdfs:label ?veteranL . FILTER(LANG(?veteranL)="eu")
BIND(CONCAT("Beteranoa: ",?veteranL) as ?veteran_)
?jaiolekua_aitona rdfs:label ?jaiolekua_aitonaL . FILTER(LANG(?jaiolekua_aitonaL)="eu")
BIND(CONCAT("Aitonaren jaiolekua: ",?jaiolekua_aitonaL) as ?jaiolekua_aitona_)
BIND("FFCC00" as ?rgb)
BIND("Aitona" as ?layer)
} } as %o
WHERE {
{INCLUDE %j}
UNION
{INCLUDE %k}
UNION
{INCLUDE %l}
UNION
{INCLUDE %m}
UNION
{INCLUDE %n}
UNION
{INCLUDE %o}
SERVICE wikibase:label { bd:serviceParam wikibase:language "eu,en". }
}

It works great, but when more than one of the displayed soldiers was in the same battle, the map shows only one point with one card, instead of showing all the points and offering a way to select between them. Is there a way to show all the soldiers in that battle? Thanks! Theklan (talk) 18:48, 20 September 2023 (UTC)

Musicians, born between specified years

Query musicians, who were born, e.g., between 1800 and 1900. Osmomysl8 (talk) 21:08, 20 September 2023 (UTC)

SELECT ?a ?aLabel ?birth_date WHERE {
  ?a wdt:P31 wd:Q5 . #instance of human
  ?a wdt:P106/wdt:P279 wd:Q639669 . #occupation a subclass of musician
  ?a p:P569/psv:P569 ?birth_date_node .
  ?birth_date_node wikibase:timeValue ?birth_date .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
  FILTER(year(?birth_date) > 1800 && year(?birth_date) < 1900).
}
ORDER by ASC(?birth_date)
This could do. Theklan (talk) 08:29, 21 September 2023 (UTC)
Yay, this works! Thanks! Osmomysl8 (talk) 15:15, 21 September 2023 (UTC)

How to retrieve the city of a building with sparql?

Hello,

For my first SPARQL request, I tried to retrieve some information about the event "journées européennes des amateurs de nœuds" (Q112064134). This item groups the individual sessions, each of which is a separate item (from 2001 to 2023).

The result is correct, but I would like to improve it with a city column.

SELECT DISTINCT ?session ?label_fr ?location ?city ?start ?end
WHERE {
  ?item wdt:P393 ?session .
  ?item wdt:P31 wd:Q112064134 .
  ?item rdfs:label ?label_fr filter(lang(?label_fr) = "fr") .
  ?item wdt:P276 ?site .
  OPTIONAL { ?site rdfs:label ?location filter(lang(?location) = "fr") }
  ?item wdt:P580 ?start .
  ?item wdt:P582 ?end .
}
ORDER BY ?label_fr

But the location of the event is sometimes the city itself and sometimes a specific location in the city. Does anyone know how to retrieve this information?

Jpgibert (talk) 12:24, 22 September 2023 (UTC)

@Jpgibert: Probably something along these lines ... look up the P276 or P131 property paths until you find something that is a human settlement:
SELECT DISTINCT ?session ?itemLabel ?cityLabel ?site ?siteLabel ?start ?end
WHERE {
  ?item wdt:P393 ?session .
  ?item wdt:P31 wd:Q112064134 .
  ?item wdt:P276 ?site .
  OPTIONAL {?site wdt:P276*/wdt:P131* ?city. hint:Prior hint:gearing "forward". 
            ?city wdt:P31/wdt:P279* wd:Q486972. hint:Prior hint:gearing "forward".}
  ?item wdt:P580 ?start .
  ?item wdt:P582 ?end .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr". }

}
ORDER BY ?itemLabel
Try it!
--Tagishsimon (talk) 13:44, 23 September 2023 (UTC)
@Tagishsimon: Many thanks for your help. Jpgibert (talk) 15:45, 23 September 2023 (UTC)

Bilingual labels

I'm trying to build a list of physical places in New Zealand and both their en and mi language labels, as the first step to improve mi language labels. I've tried without success queries such as:

SELECT DISTINCT ?item ?itemLabel ?labelen ?labelmi WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,mi". }
  {
    SELECT DISTINCT ?item WHERE {
      ?item p:P31 ?statement0.
      ?statement0 (ps:P31/(wdt:P279*)) wd:Q2221906.
      ?item p:P17 ?statement1.
      ?statement1 (ps:P17/(wdt:P279*)) wd:Q664.
      ?item rdfs:label ?labelen.
      ?itme rdfs:label ?labelmi.
      FILTER(LANG(?labelen) = "en").
      FILTER(LANG(?labelmi) = "mi").
    }
    LIMIT 100
  }
}
Try it!

121.99.219.62 05:44, 23 September 2023 (UTC)

SELECT DISTINCT ?item ?labelen ?labelmi WHERE {
    ?item wdt:P17 wd:Q664; wdt:P31/wdt:P279* wd:Q58416391.
    optional{?item rdfs:label ?labelen. FILTER(LANG(?labelen) = "en")}
    # optional{?item rdfs:label ?labelmi. FILTER(LANG(?labelmi) = "mi")}
    filter not exists{?item rdfs:label ?labelmi. FILTER(LANG(?labelmi) = "mi")}
} limit 100
Try it!

This simple query gives you a list of spatial entities in NZ with English but without Maori labels. If you replace the "filter not exists" line with the commented-out "optional" labelmi line, you get a list with existing English and Maori labels. DL2204 (talk) 09:38, 23 September 2023 (UTC)
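Note: the variant described above can be sketched as follows. It is a minimal sketch that reuses the class (Q58416391) and country (Q664) from the reply and simply makes both label lookups optional, so either column may be empty. Two things in the original attempt are also worth checking: the variable is spelled ?itme on one line, and the inner subquery only projects ?item, so ?labelen and ?labelmi never reach the outer SELECT.

```sparql
# Sketch: spatial entities in New Zealand with their en and mi labels, where present.
SELECT DISTINCT ?item ?labelen ?labelmi WHERE {
  ?item wdt:P17 wd:Q664 ;
        wdt:P31/wdt:P279* wd:Q58416391 .
  # Each label is fetched in its own OPTIONAL so a missing label leaves a blank cell
  OPTIONAL { ?item rdfs:label ?labelen . FILTER(LANG(?labelen) = "en") }
  OPTIONAL { ?item rdfs:label ?labelmi . FILTER(LANG(?labelmi) = "mi") }
}
LIMIT 100
```

Rows with an empty ?labelmi are the candidates for new mi labels; rows with both columns filled can be used to review existing ones.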

All articles on the English Wikipedia about locations in the US with the inception year 2023

If possible please help me create a query that would show all articles about places in the United States, that have both latitude and longitude data on Wikidata, as well as having 2023 as the inception year (P571). (I plan on using it to add prominent new establishments and/or their Wikipedia articles to the relevant Wikivoyage articles). Thanks in advance. ויקיג'אנקי (talk) 22:43, 23 September 2023 (UTC)

@ויקיג'אנקי: Something like this - which does not deal with date precision, so 1 January 2023 means, generally, just 2023:
SELECT DISTINCT ?item ?itemLabel ?date ?coords ?article (GROUP_CONCAT(?P31Label;separator="; ") as ?type)
WHERE 
{
  ?item wdt:P571 ?date . hint:Prior hint:runFirst true.
  ?article schema:about ?item ;
  schema:isPartOf <https://en.wikipedia.org/>.
  ?item wdt:P17 wd:Q30 .
  ?item wdt:P625 ?coords .
  filter(?date >= "2023-01-01T00:00:00Z"^^xsd:dateTime && ?date < "2024-01-01T00:00:00Z"^^xsd:dateTime)
  OPTIONAL {?item wdt:P31/rdfs:label ?P31Label. filter(lang(?P31Label)="en")}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} group by ?item ?itemLabel ?date ?coords ?article order by ?itemLabel
Try it!
--Tagishsimon (talk) 23:04, 23 September 2023 (UTC)
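Note on the date precision caveat: a minimal sketch of one way to expose the precision, assuming the full value node (psv:P571) is read instead of the simple wdt:P571 value. In the Wikibase RDF model a precision of 9 means year, 10 month and 11 day, so the extra column shows whether "1 January 2023" is a real day or just "2023". Unlike wdt:, the p:/psv: path also matches non-best-rank statements, and the query may be slower than the one above.

```sparql
SELECT DISTINCT ?item ?itemLabel ?date ?precision ?coords ?article
WHERE
{
  # Read the full time value node to get at the precision
  ?item p:P571/psv:P571 ?dateNode .
  ?dateNode wikibase:timeValue ?date ;
            wikibase:timePrecision ?precision .   # 9 = year, 10 = month, 11 = day
  FILTER(?date >= "2023-01-01T00:00:00Z"^^xsd:dateTime && ?date < "2024-01-01T00:00:00Z"^^xsd:dateTime)
  ?item wdt:P17 wd:Q30 ;
        wdt:P625 ?coords .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?itemLabel
```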

Historical figure with the most ancient known birth date

Hello, I would like to query which historical figure has the most ancient known birth date. For example, Alexander the Great (Q8409) was born on 20 July 356 BCE, but is there any other instance of human born earlier whose exact birth date we know? Nungalpiriggal (talk) 11:29, 22 September 2023 (UTC)

Iteration by iteration, I've managed (with ChatGPT) to get a query which gives the result (sometimes; at other times it still times out):
SELECT ?person ?dateOfBirth
WHERE {
  ?person wdt:P31 wd:Q5.  # Select only items that are humans
  ?person wdt:P569 ?dateOfBirth.  # Get the date of birth
  FILTER (?dateOfBirth <= "-1000-01-01T00:00:00Z"^^xsd:dateTime).
  ?person p:P569/psv:P569 [wikibase:timePrecision "9"^^xsd:integer].  # Filter by precision level "year"
}
ORDER BY ASC(?dateOfBirth)
LIMIT 10  # Limit the number of results to 10 items
Try it!

This is for dates with precision "year", for the precision "date" one has to change 9 to 11 (but this always times out...). So we get different candidates for your question. Moreover some values are strange, I am working with them. --Infovarius (talk) 21:51, 25 September 2023 (UTC)

Thank you for your suggestions; however, with the "date" precision it always times out. I have an additional question: why, if I execute your query with a filter for the exact birth date of Alexander the Great (FILTER (?dateOfBirth = "-355-07-20"^^xsd:dateTime)), do I not get any result, when I should get Q8409? Nungalpiriggal (talk) 12:44, 28 September 2023 (UTC)
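Note on the equality filter: a likely cause, offered as an assumption and not a confirmed diagnosis, is the form of the literal. An xsd:dateTime needs a year of at least four digits and a time part, so "-355-07-20" may never compare equal to a stored value. The sketch below sidesteps the guesswork by printing the value exactly as the query service holds it for Q8409; that lexical form can then be copied into a FILTER.

```sparql
# Inspect how the date of birth of Alexander the Great (Q8409) is stored.
SELECT ?person ?dateOfBirth (STR(?dateOfBirth) AS ?lexicalForm) WHERE {
  VALUES ?person { wd:Q8409 }
  ?person wdt:P569 ?dateOfBirth .
}
# An equality test would then use the full form reported in ?lexicalForm,
# along the lines of (year digits are an assumption, check the output first):
#   FILTER (?dateOfBirth = "-0355-07-20T00:00:00Z"^^xsd:dateTime)
```

The Wikibase RDF format documents that BCE years follow XSD 1.1 numbering, where year 0000 is 1 BCE, which is why 356 BCE is expected to appear as -0355 and not -0356.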

Show Only France Lawyers

I want a query that shows only French lawyers. I mean that the items must contain only the lawyer statement in the occupation property. Sriveenkat (talk) 12:28, 27 September 2023 (UTC)

@Sriveenkat: Two answers; I hope one or other is what you wanted. Note that neither query looks at the subclass tree of lawyers, just at P106=Q40348.
French citizens who are lawyers and have no other occupation
SELECT DISTINCT ?item ?itemLabel 
WHERE 
{
  ?item wdt:P27 wd:Q142.
  ?item wdt:P106 wd:Q40348.
  filter not exists {?item wdt:P106 ?something. filter (?something != wd:Q40348) }
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr,en". }
}
Try it!
French citizens who are lawyers, including those who do have other occupation(s)
SELECT DISTINCT ?item ?itemLabel 
WHERE 
{
  ?item wdt:P27 wd:Q142.
  ?item wdt:P106 wd:Q40348. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr,en". }
}
Try it!
--Tagishsimon (talk) 12:51, 27 September 2023 (UTC)
@Tagishsimon Both are Nice Queries Thanks for your support 😊 Sriveenkat (talk) 15:07, 27 September 2023 (UTC)

picking up parameters from the infobox template in kowiki

I currently have a query https://w.wiki/7aVz which picks up instances of fortresses in Korea that do not have co-ordinates. I am hoping for a query which picks up these Qitems, links to the corresponding kowiki article and extracts the co-ordinates from the infobox (when these exist). MargaretRDonald (talk) 22:49, 27 September 2023 (UTC)

@MargaretRDonald: Don't know how to do that, but I've added coords for all fortresses in your report to WD, by hand, from ko wiki, where ko wiki had them. About 67 minutes work ;) hth. --Tagishsimon (talk) 00:03, 28 September 2023 (UTC)
@Tagishsimon: Thank you so much for your 67 minutes of hard work (216 items now reduced to 26) MargaretRDonald (talk) 00:21, 28 September 2023 (UTC)
@MargaretRDonald: You may be looking for https://pltools.toolforge.org/harvesttemplates/ although I've never used it and I can't tell if Tagishsimon could have saved some time with it.--Pere prlpz (talk) 17:06, 28 September 2023 (UTC)
Thank you @Pere prlpz: I will give it a try. MargaretRDonald (talk) 18:43, 28 September 2023 (UTC)

How to list all properties for which the statement has startdate and/or enddate?

I would like to know if there is a way to list all properties for which the statement contains a start date and/or end date. For example, take the person Q33977 (Jules Verne). It has P551 (residence) statements that contain a start and end date. Similarly, P69 (educated at) has a start date and end date. How can I find the list of properties and their labels (P551, P69, ... etc.)? 2001:610:450:40:0:0:2:A 16:36, 28 September 2023 (UTC)

A lot of properties will be present in at least one item with start or end date. Do you want all of them? And do you want just the property or do you also need the item?--Pere prlpz (talk) 17:10, 28 September 2023 (UTC)
Hello. Thanks for your reply. Yes, I totally understand the scalability issue. I think I can limit the entity (i.e. ?item) to a single type. Can you list all properties which would have start and enddate for items of Human (Q5)?
Apart from that, I wonder if such a list could be generated from the ontology side (not via SPARQL), or via API, because I can see other similar use cases.
Many thanks! 2001:610:450:40:0:0:2:A 12:00, 29 September 2023 (UTC)
For the Jules Verne example, something like the query below. See https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Properties for more details about how the ?statement to ?predicate to ?property business is working in this query.
SELECT ?item ?itemLabel ?statement ?property ?propertyLabel ?start_date ?end_date
WHERE 
{
  VALUES ?item {wd:Q33977}
  {?statement pq:P580 ?start_date.}
  UNION
  {?statement pq:P582 ?end_date.}
  ?item ?predicate ?statement .
  ?property wikibase:claim ?predicate .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} order by ?propertyLabel ?statement
Try it!
--Tagishsimon (talk) 17:37, 28 September 2023 (UTC)
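Note: if only the list of properties and their labels is wanted, as in the original question, the query above can be reduced to something like the sketch below. It keeps the same ?item to ?predicate to ?statement to ?property pattern and drops the per-statement columns; the blank nodes [] simply mean "any start or end date".

```sparql
SELECT DISTINCT ?property ?propertyLabel
WHERE
{
  VALUES ?item { wd:Q33977 }              # Jules Verne, from the question
  ?item ?predicate ?statement .
  ?property wikibase:claim ?predicate .   # map the p: predicate back to its property entity
  { ?statement pq:P580 [] }               # statement has a start time qualifier
  UNION
  { ?statement pq:P582 [] }               # or an end time qualifier
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?propertyLabel
```

Replacing the VALUES line with a broad pattern such as all instances of Q5 is likely to time out, so a small set of items is the safer starting point.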

Occupations and Subclasses

I'm trying to get a list of all musical occupations. The query below gets me most of the way there (I think) by anchoring off MUSICAL PROFESSION (Q66715801). For example, it successfully returns "jazz bassist" (Q64353861) which is a subclass of "bassist" (Q584301) and "jazz musician" (Q15981151). However, the results do NOT include "jazz pianist" (Q61947380). When I look at that entry, it too is a subclass of "jazz musician" just like "jazz bassist," but it's an instance of PROFESSION (Q28640) instead of MUSICAL PROFESSION, so it's not being included.

Is there a way of getting instances of both MUSICAL PROFESSION *AND* PROFESSION if and where they intersect as in the case of "jazz pianist"? Alternatively, is there a way to recursively get all subclasses of any result that gets returned from the query below?

Thanks in advance!

SELECT ?occupation ?occupationLabel ?subclassOf ?subclassOfLabel
WHERE {
  ?occupation wdt:P31/wdt:P279* wd:Q66715801
  OPTIONAL {
    ?occupation wdt:P279 ?subclassOf.
    ?subclassOf rdfs:label ?subclassOfLabel.
    FILTER (LANG(?subclassOfLabel) = "en")
  }
  FILTER (LANG(?occupationLabel) = "en")
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "en". 
    ?occupation rdfs:label ?occupationLabel.    
  }
}
ORDER BY ?occupationLabel
Try it!
@Slieb17: The clause FILTER (LANG(?occupationLabel) = "en") seems to be causing the problem, though I'm not sure why. I've revised the handling of labels somewhat; possibly this is closer to your needs? As I understand it, the P31/P279* should be doing what you're alluding to in your final paragraph.
SELECT ?occupation ?occupationLabel ?subclassOf ?subclassOfLabel
WHERE {
  ?occupation wdt:P31/wdt:P279* wd:Q66715801
  OPTIONAL {
    ?occupation wdt:P279 ?subclassOf.
#    ?subclassOf rdfs:label ?subclassOfLabel.
#    FILTER (LANG(?subclassOfLabel) = "en")
  }
  #FILTER (LANG(?occupationLabel) = "en")
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "en". 
    ?occupation rdfs:label ?occupationLabel.
    ?subclassOf rdfs:label ?subclassOfLabel.
  }
}
ORDER BY ?occupationLabel
Try it!
--Tagishsimon (talk) 02:17, 29 September 2023 (UTC)
Thanks for trying, but Q61947380 ("jazz pianist") still does not appear in the results, so I'm not entirely sure what changed as a result of the modifications you made. Slieb17 (talk) 02:46, 29 September 2023 (UTC)
@Slieb17: Oops, yes. I was confusing myself. Try this.
SELECT DISTINCT ?occupation ?occupationLabel ?subclassOf ?subclassOfLabel
WHERE {
  ?occupation wdt:P279*/wdt:P31/wdt:P279* wd:Q66715801
  OPTIONAL {
    ?occupation wdt:P279 ?subclassOf.
#    ?subclassOf rdfs:label ?subclassOfLabel.
#    FILTER (LANG(?subclassOfLabel) = "en")
  }
  #FILTER (LANG(?occupationLabel) = "en")
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "en". 
    ?occupation rdfs:label ?occupationLabel.
    ?subclassOf rdfs:label ?subclassOfLabel.
  }
}
ORDER BY ?occupationLabel
Try it!
--Tagishsimon (talk) 03:01, 29 September 2023 (UTC)
Looks like that did the trick - thank you *SO* much... Slieb17 (talk) 03:13, 29 September 2023 (UTC)
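Note on why the extra leading wdt:P279* helps: "jazz pianist" (Q61947380) is not itself an instance of musical profession, but, as described in the question, it is a subclass of a class that is. A minimal sketch that makes the intermediate class visible, using only the QIDs from this thread (the variable name ?viaClass is made up for illustration):

```sparql
SELECT ?occupation ?occupationLabel ?viaClass ?viaClassLabel WHERE {
  VALUES ?occupation { wd:Q61947380 }          # jazz pianist
  ?occupation wdt:P279* ?viaClass .            # the item itself or any of its superclasses
  ?viaClass wdt:P31/wdt:P279* wd:Q66715801 .   # that class is an instance of musical profession
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Each row shows one superclass through which the occupation reaches Q66715801, which is the route the final query's P279*/P31/P279* path follows.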

Get all STEM articles on wikipedia

By STEM, I mean articles (roughly) about the concepts in STEM(eg, that you'd traditionally learn in university), rather than articles about biography or sociology of the impact of the concepts.

Eg, I tried this - https://w.wiki/7YWM

I think a very large part of the problem is that wikipedia categories are poorly described by wikidata, eg - https://www.wikidata.org/wiki/Q6544657 If I was able to get the categories, then getting the pages would be much easier. Wikiqrdl (talk) 22:03, 27 September 2023 (UTC)

SELECT DISTINCT ?item WHERE {
  ?item p:P31 ?statement0.
  ?statement0 (ps:P31/(wdt:P279*)) wd:Q336.
  ?item p:P361 ?statement1.
  ?statement1 (ps:P361/(wdt:P279*)) wd:Q8434.
}
LIMIT 100
Try it!
I don't think 'a very large part of the problem is that wikipedia categories are poorly described by wikidata'. It's not the job of wikipedia to document or ape the category structure of a wikipedia. It is possible to use MWAPI services to compile lists of the categories and articles within a parent category, and so the structure of EN wiki's category tree, and the contents of individual categories, are available via the Wikidata query service. I think the main problems are that your ask is a) vast and b) ill-defined, and that c) there is real complexity in getting the articles you want - "concepts in STEM that you'd traditionally learn in university" - whilst excluding everything you do not want, such as "sociology of the impact of the concepts". There is not a simple, magic bullet solution to accessing all such articles, either through wikipedia categories, or through wikidata item coding; rather, it would take laborious analysis of the different strands of STEM on WP or WD to enable your ask to be answered. --Tagishsimon (talk) 22:47, 27 September 2023 (UTC)
Not sure that's a good faith response, especially as I did say roughly. Also, university courses are very standardized. A BSc largely means the same thing at a large collection of different universities. Assuming you actually meant 'not the job of wikidata to document or ape the category structure of a wikipedia', I don't think that's the ask or what I'm referring to.
But anyways, all that aside, the simple fact exists that these classes and instances exist in wikidata, they just don't point to anything. There are 'academic disciplines'. There is the concept of 'science education'. If you were to say the 'not the job of wikidata to catalog abstract knowledge' then I guess I'd have to just accept and move on. But I would say that's deeply unfortunate, as cataloging knowledge is far more interesting than cataloging things that simply exist. Wikiqrdl (talk) 03:10, 28 September 2023 (UTC)
Hmmmm, I found this - https://www.wikidata.org/wiki/Q4671286 Not sure why the unhelpful response, but it happens sometimes I suppose. Wikiqrdl (talk) 03:19, 28 September 2023 (UTC)
The funny thing in all of this is that while you're not 'aping' (a really unfortunate choice of words, for real) categories, you are most definitely duplicating the effort. eg: https://www.wikidata.org/wiki/Q431 and https://www.wikidata.org/wiki/Q6544657 Why you don't want to connect these (easy enough to do with a bit of AI automation) when it would be so incredibly helpful for querying wikipedia is frankly beyond me. Wikiqrdl (talk) 03:25, 28 September 2023 (UTC)
So, here's the thing, wikidata does catalog knowledge, but in a very shallow, incomplete and somewhat confusing manner. It's mostly done by 'has part(s)' I believe in a forward direction, but quickly stops as I probe to any reasonable depth. Was that something wikidata started but gave up on? Wikiqrdl (talk) 03:50, 28 September 2023 (UTC)
I think right now you are in the 'little knowledge is a dangerous thing' territory. I don't think you have a clue about wikidata, and you're confused as to why you cannot map wikipedia categories to it. No, WD does not mostly use 'has parts' to encode knowledge. Mostly it uses a combination of instance of (P31) and subclass of (P279), instances and class trees, to provide an ontology; and then all manner of additional properties, as appropriate for particular domains, to add information to classes and/or instances. --Tagishsimon (talk) 05:12, 28 September 2023 (UTC)
To use the example you found, the class tree of zoology - https://w.wiki/7adX ... does not depend on 'has part'. --Tagishsimon (talk) 05:20, 28 September 2023 (UTC)
Was that query supposed to be an example of something? "Catering for office workers" how does that relate to zoology?
Clearly, there is some animosity here between wikidata and wikipedia, especially around categorization and you're taking it out on me. I'm just trying to figure out the knowledge graph on WP and I thought the wikidata folks might have some good ideas. Yikes.. Wikiqrdl (talk) 14:13, 30 September 2023 (UTC)
You're not "just trying to figure out the knowledge graph on WP". You're making no attempt whatsoever to "figure out the knowledge graph on WP". "Was that query supposed to be an example of something?" You evidence a lack of good faith, and that being the case, there's not much point in entertaining the conversation any further. --Tagishsimon (talk) 16:08, 30 September 2023 (UTC)

"Articles (roughly) about the concepts in STEM(eg, that you'd traditionally learn in university), rather than articles about biography or sociology of the impact of the concepts" seems a rather fuzzy statement to be transformed on anything a machine can understand. However, you can use Petscan to see articles in any Wikipedia category of your choice excluding biographies.--Pere prlpz (talk) 17:12, 28 September 2023 (UTC)

  • Look at studied in (P2579) in relation to subclasses of interested sciences. --Infovarius (talk) 21:31, 29 September 2023 (UTC)
    Yes, I see a lot of very helpful tags in wikidata as if they were trying to do exactly what I'm talking about (mapping the knowledge graph), but my sense is that the tree is pretty shallow. My guess is that either a) wikidata doesn't want to flesh out the tree or b) it's still early days. I am guessing the latter and you know that isn't a crime. Sometimes these things take time. Wikiqrdl (talk) 14:19, 30 September 2023 (UTC)
    The other issue is that I think wikidata has decided they don't want to have anything to do with wp categories, so none of them are getting tagged. I can certainly see arguments on both sides here, but the biggest issue will be the lack of leveraging what already exists. Wikiqrdl (talk) 14:32, 30 September 2023 (UTC)
@Pere prlpz STEM knowledge is fairly well defined, as well defined as most anything else in wikidata. Certainly just using modern embeddings you could probably tag with about 90% accuracy of the categories on WP. Wikiqrdl (talk) 14:35, 30 September 2023 (UTC)
@Wikiqrdl: If you are requesting a query, please define "STEM knowledge" concretely enough that we can help you build that query.
If you are not requesting a query but this thread is a debate about how Wikidata should be, then you can ignore my answers.--Pere prlpz (talk) 15:57, 30 September 2023 (UTC)
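Note on the MWAPI route mentioned earlier in this thread: a minimal sketch of listing the members of one English Wikipedia category from the query service, together with their Wikidata items. The category title "Category:Zoology" is only an illustrative assumption; the sketch lists direct members and does not walk subcategories.

```sparql
SELECT ?item ?itemLabel ?title WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" ;
                    wikibase:api "Generator" ;
                    mwapi:generator "categorymembers" ;
                    mwapi:gcmtitle "Category:Zoology" ;   # assumed example category
                    mwapi:gcmlimit "max" .
    ?title wikibase:apiOutput mwapi:title .       # page title on the wiki
    ?item wikibase:apiOutputItem mwapi:item .     # linked Wikidata item, if any
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

The ?item values can then be filtered with ordinary triple patterns, for example to exclude instances of human (Q5).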

All cities with more than 300,000 inhabitants that don't have an article in the Hebrew Wikivoyage

Please help me create a query that would show all the cities around the world that have more than 300,000 inhabitants that still don't have an article in the Hebrew Wikivoyage. Thanks in advance. ויקיג'אנקי (talk) 08:41, 29 September 2023 (UTC)

@ויקיג'אנקי: Something like this. There's a few oddities in the report, but I'm guessing you can cope with these:
SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel (MAX(?pop) as ?population)
WHERE 
{
  ?item wdt:P31/wdt:P279* wd:Q515.
  ?item wdt:P1082 ?pop.
  filter (?pop > 300000)
  FILTER NOT EXISTS {
  ?article schema:about ?item ;
  schema:isPartOf <https://he.wikivoyage.org/> . }
  OPTIONAL {?item wdt:P17 ?country . 
           ?country wdt:P463 wd:Q1065 .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],he,en". }
} GROUP BY ?item ?itemLabel ?country ?countryLabel ORDER BY ?countryLabel ?itemLabel
Try it!
--Tagishsimon (talk) 21:27, 29 September 2023 (UTC)
This query also includes cities that already have an article on the Hebrew Wikivoyage (for example, on the first page of results the list includes Amsterdam, Copenhagen, Rotterdam, Tirana, Kabul, Melbourne, Sydney, Vienna, Dhaka, and many others). I would guess that it didn't exclude the cities that already have an article in the Hebrew Wikivoyage. ויקיג'אנקי (talk) 09:47, 30 September 2023 (UTC)
@ויקיג'אנקי: Try again; the schema:isPartOf line was a bit mangled. --Tagishsimon (talk) 15:37, 30 September 2023 (UTC)
@Tagishsimon: It keeps giving me "Query timeout limit reached". Any way to fix that? ויקיג'אנקי (talk) 21:46, 30 September 2023 (UTC)


@ויקיג'אנקי: This might work...
SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel ?population
WITH {SELECT ?item (MAX(?pop) as ?population)
WHERE 
{
  ?item wdt:P31/wdt:P279* wd:Q515.
  ?item wdt:P1082 ?pop.
  filter (?pop > 300000)
  } GROUP BY ?item } as %i
WHERE
{
  INCLUDE %i
  FILTER NOT EXISTS {
  ?article schema:about ?item ;
  schema:isPartOf <https://he.wikivoyage.org/> . }

  OPTIONAL {?item wdt:P17 ?country . 
           ?country wdt:P463 wd:Q1065 .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],he,en". }
} ORDER BY ?countryLabel ?itemLabel
Try it!
--Tagishsimon (talk) 22:01, 30 September 2023 (UTC)

Overcategorisation in P279 (finding items with both a class and its superclass added)

I'm not sure how this problem can be described better in English, but here it is:

Is there a simple solution to find such items using SPARQL? Maybe there is some solution to this published here, but I'm not sure how to properly describe this problem and I couldn't find any info on this. Wostr (talk) 18:17, 29 September 2023 (UTC)


@Wostr: Your description is clear: item is described as a subclass of Class A and Class B, where Class A is itself a subclass of Class B; so the item might better be described only as a subclass of Class A. I'm sure that's a *very* common pattern.
Illustratively, some SPARQL below which easily finds candidates. Probably not very efficient SPARQL, and there are many items, and then many P279s per item.
One issue is that the problem may in part arise from source A specifying that the item is a subclass of A, and source B specifying it is a subclass of B; and so rather than deleting 'is a subclass of B', the better approach, to preserve the source reference, would be to use rank to specify 'is a subclass of A' as a preferred statement, and 'is a subclass of B' as a normal statement. Another issue would be getting other users to understand why P279s were ranked.
SELECT DISTINCT ?item ?itemLabel ?P279 ?P279Label ?P279_2 ?P279_2Label
WHERE {
   SERVICE bd:slice {
     ?item wdt:P31 wd:Q113145171.
      bd:serviceParam bd:slice.offset 0 . # Start at item number (not to be confused with QID)
      bd:serviceParam bd:slice.limit 100 . # List this many items
  }
  
  ?item wdt:P279 ?P279.
  ?item wdt:P279 ?P279_2.
 
  filter(str(?P279) != str(?P279_2))
  
  ?P279 wdt:P279* ?P279_2 . hint:Prior hint:gearing "forward".
         
  SERVICE wikibase:label {     bd:serviceParam wikibase:language "en".   }
}
Try it!
--Tagishsimon (talk) 21:19, 29 September 2023 (UTC)
Thank you! It's not meant for any automatic removal of classes, only manual curation, but it will help me a lot :) Wostr (talk) 23:08, 29 September 2023 (UTC)
Good; well, as you probably deduced, play with the offset and limit values to slice through the set of Q113145171 items. --Tagishsimon (talk) 23:14, 29 September 2023 (UTC)
Yep, I got this, thanks. Wostr (talk) 00:58, 30 September 2023 (UTC)