Wikidata:Bot requests/Archive/2021/10



request to populate items with relevant information from snwiki (2021-10-10)

Request date: 10 October 2021, by: Capmo

Task description

There are currently 1815 articles in Category:Mazita eVanhu of the Shona Wikipedia. All of them seem to have an item at Wikidata, but the only information provided at item creation was a link to the article. These items should be populated with relevant information as was done here.

Discussion


Request process

Accepted by (Edoderoo (talk) 18:11, 17 October 2021 (UTC)) and under process
Task completed, see source code. About 15 items could not get a description, because they are actually merge candidates with other items that have the same label/description. (05:28, 18 October 2021 (UTC)) The properties could have been added easily with PetScan; for descriptions we used to have descriptioner, but that tool died recently for lack of maintenance. Edoderoo (talk) 08:33, 19 October 2021 (UTC)

Great! Thank you. —capmo (talk) 09:32, 19 October 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 11:46, 28 October 2021 (UTC)

request to delete wrong references (2021-10-27)

Request date: 27 October 2021, by: LutiV

Link to discussions justifying the request
Task description

Please delete the wrong references from title (P1476) and genre (P136): the property Archivio Storico Ricordi person ID (P8290) was used instead of the property Archivio Storico Ricordi opera ID (P8732). The list is: https://w.wiki/4DFS

Licence of data to import (if relevant)
Discussion

I would use the following query:

SELECT ?item ?st ?ref
WHERE {
  ?item wdt:P8732 ?id .
  ?item ?p ?st . ?st prov:wasDerivedFrom ?ref . ?ref pr:P8290 ?id .
}

For all references listed, Archivio Storico Ricordi person ID (P8290) should be substituted with Archivio Storico Ricordi opera ID (P8732). --Epìdosis 15:10, 30 October 2021 (UTC)
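The substitution described above can be sketched in Python. This is only an illustration, assuming each reference is available as a dict in the usual Wikibase JSON shape (`snaks` keyed by property ID, plus an optional `snaks-order` list); the function name `fix_reference` is hypothetical, and nothing here writes back to Wikidata.

```python
import copy

WRONG_PROP = "P8290"  # Archivio Storico Ricordi person ID
RIGHT_PROP = "P8732"  # Archivio Storico Ricordi opera ID


def fix_reference(reference):
    """Return a copy of a reference with its P8290 snaks moved to P8732.

    Assumes the Wikibase JSON layout for references; the stored "hash"
    is left untouched so a caller can still identify the original.
    """
    fixed = copy.deepcopy(reference)
    snaks = fixed.get("snaks", {})
    if WRONG_PROP not in snaks:
        return fixed  # nothing to substitute

    moved = snaks.pop(WRONG_PROP)
    for snak in moved:
        snak["property"] = RIGHT_PROP
    snaks.setdefault(RIGHT_PROP, []).extend(moved)

    order = fixed.get("snaks-order")
    if order is not None:
        renamed = (RIGHT_PROP if p == WRONG_PROP else p for p in order)
        fixed["snaks-order"] = list(dict.fromkeys(renamed))  # drop duplicates
    return fixed
```

The identifier value itself is kept as it is; only the property it is attached to changes.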

Request process

Accepted by (MisterSynergy (talk) 22:52, 31 October 2021 (UTC)) and under process
Task completed (22:52, 31 October 2021 (UTC))

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. MisterSynergy (talk) 22:52, 31 October 2021 (UTC)

Request to add publication numbers to NGA lighthouse ID (P3563) (2021-09-30)

Request date: 30 September 2021, by: MSGJ

Link to discussions justifying the request
Task description

There are more than 1000 violations of the formatting constraint listed at Wikidata:Database reports/Constraint violations/P3563#"Format" violations. In most cases this is because the publication number is missing from the identifier. This is a 3-digit number (110, 111, 112, 113, 114, 115 or 116). Without this number the identifier is not unique and so almost useless.

The publication number depends on the area of the world (see map). It could be deduced from coordinate location (P625) but I think this may be too difficult. It is probably easier to map each country (P17) to the relevant number. I can help to generate this list. (There will be a few exceptions, for example overseas territories of some countries, but I can track down and fix these.)

Thanks for considering this task — Martin (MSGJ · talk) 20:22, 30 September 2021 (UTC)

Example

Sunosaki Lighthouse (Q1087248) was in the list of violations. NGA lighthouse ID (P3563) was 5004 and it was missing its publication number. Its country (P17) is Japan (Q17) which means it is in area 112. So NGA lighthouse ID (P3563) is changed to 112-5004 and now everything is good.
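The country-based fix in this example could be sketched as follows. This is a minimal illustration, not the script that was eventually used: the lookup table `PUBLICATION_BY_COUNTRY` is hypothetical and holds only the Japan entry taken from the example, and the check for an existing prefix assumes complete identifiers start with one of the seven publication numbers followed by a hyphen.

```python
import re

# Hypothetical country (P17) QID -> publication number table.
# Only the Japan entry comes from the example; the rest would have
# to be compiled by hand from the publication areas.
PUBLICATION_BY_COUNTRY = {
    "Q17": "112",  # Japan
}

# Assumption: a complete ID starts with 110..116 followed by a hyphen.
HAS_PUBLICATION = re.compile(r"^11[0-6]-")


def add_publication_number(lighthouse_id, country_qid):
    """Prefix the ID with its publication number, or return None if unknown."""
    if HAS_PUBLICATION.match(lighthouse_id):
        return lighthouse_id  # already complete
    publication = PUBLICATION_BY_COUNTRY.get(country_qid)
    if publication is None:
        return None  # country not mapped yet: leave for manual review
    return f"{publication}-{lighthouse_id}"


print(add_publication_number("5004", "Q17"))  # 112-5004
```

Returning `None` for unmapped countries keeps exceptions such as overseas territories out of the automated batch.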

Discussion

Hi @MSGJ:, I would like to have the list (i.e. country QID => code). Ammarpad (talk) 14:41, 3 October 2021 (UTC)

@Ammarpad: I do not yet have a list, but this is copied from en:List of lights#United States:
  • PUB. 110 - Greenland, the East Coasts of North and South America (excluding Continental U.S.A. except the East Coast of Florida) and the West Indies
  • PUB. 111 - The West Coasts of North and South America (Excluding Continental U.S.A. and Hawaii), Australia, Tasmania, New Zealand, and the Islands of the North and South Pacific Oceans
  • PUB. 112 - Western Pacific and Indian Oceans Including the Persian Gulf and Red Sea
  • PUB. 113 - The West Coasts of Europe and Africa, the Mediterranean Sea, Black Sea and Azovskoye More (Sea of Azov)
  • PUB. 114 - British Isles, English Channel and North Sea
  • PUB. 115 - Norway, Iceland and Arctic Ocean
  • PUB. 116 - Baltic Sea with Kattegat, Belts and Sound and Gulf of Bothnia

If you could help me generate a query for the items with missing publication numbers, sorted by Country, then I will start making the list based on the above. Thanks — Martin (MSGJ · talk) 10:00, 4 October 2021 (UTC)

Actually I think I can handle this myself using QuickStatements. Thanks anyway — Martin (MSGJ · talk) 22:12, 5 October 2021 (UTC)
(Belated response) I am not sure how to write SPARQL queries, if that's what you wanted. But I created this (User:AmmarBot/P3563) with custom script. Is that what you're looking for? Ammarpad (talk) 07:33, 7 October 2021 (UTC)
Request process

Accademia delle Scienze di Torino multiple references (updated)

Request date: 30 October 2021, by: Epìdosis

Link to discussions justifying the request
Task description

Given the following query:

SELECT DISTINCT ?item
WHERE {
  ?item wdt:P8153 ?ast .
  ?item p:P570 ?statement.
  ?reference1 pr:P248 wd:Q2822396.
  ?reference2 pr:P248 wd:Q2822396.
  ?statement prov:wasDerivedFrom ?reference1.
  ?statement prov:wasDerivedFrom ?reference2.
  FILTER (?reference1 != ?reference2)
}

In many items there are multiple references to date of death (P570) referring to Academy of Sciences of Turin (Q2822396)=Accademia delle Scienze di Torino ID (P8153). Cases:

  1. three references: maintain the first (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+subject named as (P1810)), delete the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the third (stated in (P248)+retrieved (P813)) transferring the retrieved (P813) to the first
    1. three references bis: if the first is stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+subject named as (P1810)+retrieved (P813), the second and the third get simply deleted
    2. three references ter: if there is a reference with reference URL (P854) containing a string "accademiadellescienze", it should be deleted; maintain the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the third (stated in (P248)+retrieved (P813)) transferring the retrieved (P813) to the first
  2. two references: maintain the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the third (stated in (P248)+retrieved (P813)) transferring the retrieved (P813) to the first

Repeat the above query substituting date of birth (P569) for date of death (P570). Cases:

  1. two references: maintain the first (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+subject named as (P1810)), delete the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+retrieved (P813)) transferring the retrieved (P813) to the first
    1. two references bis: if the first is stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+subject named as (P1810)+retrieved (P813), the second gets simply deleted
    2. two references ter: if there is a reference with reference URL (P854) containing a string "accademiadellescienze", it should be deleted; maintain the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+retrieved (P813))

Important: in all cases stated in (P248)Academy of Sciences of Turin (Q2822396) should become stated in (P248)www.accademiadellescienze.it (Q107212659).
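The common pattern in the cases above (keep the richest reference, move retrieved (P813) onto it, drop the rest, and switch the stated in (P248) value) can be sketched in Python. This is a simplified illustration only: references are modelled as flat dicts of property ID to value rather than full Wikibase JSON, the function name is hypothetical, and it does not distinguish every sub-case listed in the request.

```python
OLD_SOURCE = "Q2822396"    # Academy of Sciences of Turin
NEW_SOURCE = "Q107212659"  # www.accademiadellescienze.it


def merge_academy_references(references):
    """Collapse duplicate Academy references into one; keep unrelated ones.

    Each reference is a simplified {property_id: value} dict (an assumption
    made for this sketch, not the real Wikibase serialization).
    """
    others, academy, retrieved = [], [], None
    for ref in references:
        if "accademiadellescienze" in ref.get("P854", ""):
            continue  # reference URL (P854) variant is deleted
        if ref.get("P248") in (OLD_SOURCE, NEW_SOURCE):
            academy.append(ref)
            retrieved = retrieved or ref.get("P813")
        else:
            others.append(ref)
    if not academy:
        return others

    # Prefer a reference with subject named as (P1810), then one with the
    # ID (P8153); max() keeps the earliest reference on ties.
    keep = max(academy, key=lambda r: ("P1810" in r, "P8153" in r))
    merged = dict(keep)
    merged["P248"] = NEW_SOURCE
    if retrieved and "P813" not in merged:
        merged["P813"] = retrieved
    return others + [merged]
```

Unrelated references are returned first and the merged Academy reference last, so the original ordering is not preserved.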

Discussion

@Ladsgroup: as his bot is probably ready for doing this; the previous request was archived despite not being solved. --Epìdosis 15:03, 30 October 2021 (UTC)

Request process

request to add YOB and/or YOD to TP descriptions (2021-09-01)

Request date: 1 September 2021, by: Jura1

Task description

Many TP imported items have a description in the form "Peerage person ID=\d*". These were added when these items didn't include more information.

In the meantime, some of these items include date of birth (P569) and/or date of death (P570). To make it easier to identify them, the years from these dates should be added to the description.

  • Sample edit: [1].
  • Query to find items (currently 28776):
SELECT DISTINCT ?item ?itemLabel ?d
{
  hint:Query hint:optimizer "None".
  ?item wdt:P4638 [] .
  ?item (wdt:P569|wdt:P570) [] .
  ?item schema:description ?d . 
  FILTER( lang(?d) && regex (?d, "^Peerage person ID=\\d+$") ) 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Thanks --- Jura 12:44, 1 September 2021 (UTC)
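As a rough illustration of the requested change, the sketch below appends years to a placeholder description. The output format (years in parentheses, `?` for a missing year) is an assumption made for this example and is not taken from the sample edit; the function name is hypothetical.

```python
import re

# Same pattern as in the query above.
PLACEHOLDER = re.compile(r"^Peerage person ID=\d+$")


def describe_with_years(description, birth_year=None, death_year=None):
    """Append known years to a placeholder description; leave others alone."""
    if not PLACEHOLDER.match(description):
        return description  # hand-written description: do not touch
    if birth_year is None and death_year is None:
        return description  # nothing to add
    born = str(birth_year) if birth_year is not None else "?"
    died = str(death_year) if death_year is not None else "?"
    # Assumed format; the real bot job may use a different one.
    return f"{description} ({born}-{died})"
```

Descriptions that no longer match the placeholder pattern are returned unchanged, so manual improvements are not overwritten.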

Discussion

I had a look at some of them, but the data is quite messy, due to a source that is messy too. I do not see a good reason to have this data in Wikidata; the only reason it got imported is because it was there. Any effort in describing this will not make it less of a mess. Edoderoo (talk) 20:09, 19 September 2021 (UTC)

Also, adding dates to descriptions adds data duplication, causes problems when the dates are found to be wrong later and doesn’t really help that much for TP entries from my experience. --Emu (talk) 08:13, 29 October 2021 (UTC)

Labels and descriptions are meant to duplicate information also available as statements. If you have a better strategy for TP, let's hear it. --- Jura 12:42, 29 October 2021 (UTC)
There’s a reason why dates aren’t mentioned in WD:D. As for a better strategy: Just leave it as it is. Your claim that different descriptions would “make it easier to identify them” begs the question: Why? To whom? And will those people try to add information to those items or does it just add aesthetic value? --Emu (talk) 16:54, 29 October 2021 (UTC)
Can you point me to the explanation with the reason (for English descriptions)? It's a standard element for descriptions in Dutch.
Maybe you can also explain what the TP ID should be doing instead?
It's fairly common that years (not dates) of birth/death are used to disambiguate. At least in databases .. --- Jura 18:47, 29 October 2021 (UTC)
I don’t know the full backstory of English descriptions in Wikidata. What I do know is that descriptions with YOB and/or YOD do often cause a lot of work when dates are corrected.
I’m not sure what you mean by your second question. If you refer to the descriptions Peerage person ID=: Well, they are there. There’s little use in deleting them so why bother.
And yeah, they are. You don’t need descriptions for that. --Emu (talk) 20:38, 29 October 2021 (UTC)
@Emu To whom? For example to people who work with Mix'n'Match to pair external databases to these items. I definitely see merit in having dates of birth and death in description, in absence of a better descriptor. Vojtěch Dostál (talk) 19:55, 29 October 2021 (UTC)
@Vojtěch Dostál: I don’t quite follow – M’n’M has autodescriptions that include life dates.
Don’t get me wrong, if you two are really keen on this bot job, go ahead. I just don’t see the need. --Emu (talk) 20:38, 29 October 2021 (UTC) 
M'n'M also lets you view the manual descriptions. Other use cases include looking up specific names in the search bar. Anyway, I cannot currently retrieve the dates I need for this: I need only statements with precision >= 9 and non-deprecated rank, which is too demanding for the query service and times out:
SELECT distinct ?item (sample(?dob) as ?birth) (sample(?dod) as ?death) WHERE {
  ?item wdt:P4638 [] .
  optional {?item p:P569 [ps:P569 ?dob ; psv:P569/wikibase:timePrecision ?prec1 ; wikibase:rank ?rank1 ] filter(?prec1 > 8 && ?rank1 != wikibase:DeprecatedRank ).}
  optional {?item p:P570 [ps:P570 ?dod ; psv:P570/wikibase:timePrecision ?prec2 ; wikibase:rank ?rank2 ] filter(?prec2 > 8 && ?rank2 != wikibase:DeprecatedRank ).}
} group by ?item
Vojtěch Dostál (talk) 15:29, 31 October 2021 (UTC)
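One way around the timeout would be to apply the same two conditions client-side, on entity JSON fetched item by item. The sketch below is only an illustration under that assumption: it expects statements in the standard Wikibase JSON shape (`rank`, `mainsnak.datavalue.value.time` and `.precision`), and the function name `usable_year` is hypothetical.

```python
import re


def usable_year(statements, min_precision=9):
    """Return the year of the first non-deprecated statement with at least
    year precision (9), or None if there is no such statement."""
    for statement in statements:
        if statement.get("rank") == "deprecated":
            continue
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" carry no date
        value = snak["datavalue"]["value"]
        if value["precision"] < min_precision:
            continue  # decade, century, etc. are too coarse
        # Wikibase time strings look like "+1850-03-12T00:00:00Z".
        match = re.match(r"^([+-])(\d+)-", value["time"])
        if match:
            year = int(match.group(2))
            return -year if match.group(1) == "-" else year
    return None
```

It would be called once with an item's date of birth (P569) statements and once with its date of death (P570) statements.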
Request process

request to deprecate ethnic group statements only sourced with P143 (2021-10-23)

Request date: 23 October 2021, by: Fralambert

Link to discussions justifying the request
Task description
Hi, since ethnic group (P172) is a highly contentious subject, the property already mandates a source, and imported from Wikimedia project (P143) is not a reliable source, it would be fine if a bot set a deprecated rank when the statement in ethnic group (P172) uses only imported from Wikimedia project (P143) as a source. The bot could also add reason for deprecated rank (P2241) with source known to be unreliable (Q22979588) as a qualifier. We could also simply remove statements with this source, but they are likely to come back, so deprecating them would be best. --Fralambert (talk) 15:12, 23 October 2021 (UTC)
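The rule requested above can be sketched in Python as follows. This is a minimal illustration on statement dicts in the Wikibase JSON shape, not the process that was actually run; the function names are hypothetical, and "uses only P143" is read strictly here (every reference consists solely of P143 snaks), which is an assumption.

```python
import copy

IMPORTED_FROM = "P143"    # imported from Wikimedia project
REASON_PROP = "P2241"     # reason for deprecated rank
UNRELIABLE = "Q22979588"  # source known to be unreliable


def only_imported_from(statement):
    """True if the statement has references and all of them hold only P143."""
    refs = statement.get("references", [])
    return bool(refs) and all(
        set(ref.get("snaks", {})) == {IMPORTED_FROM} for ref in refs
    )


def deprecate_if_wiki_sourced(statement):
    """Return a deprecated copy with the P2241 qualifier, or the input unchanged."""
    if not only_imported_from(statement):
        return statement
    updated = copy.deepcopy(statement)
    updated["rank"] = "deprecated"
    reasons = updated.setdefault("qualifiers", {}).setdefault(REASON_PROP, [])
    already = any(
        q.get("datavalue", {}).get("value", {}).get("id") == UNRELIABLE
        for q in reasons
    )
    if not already:
        reasons.append({
            "snaktype": "value",
            "property": REASON_PROP,
            "datavalue": {
                "value": {"entity-type": "item", "id": UNRELIABLE},
                "type": "wikibase-entityid",
            },
        })
    return updated
```

Unreferenced statements and statements with at least one other kind of reference are left as they are.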
Licence of data to import (if relevant)
Discussion
Request process

Accepted by Vojtěch Dostál (talk) 17:20, 2 October 2022 (UTC) and under process


✓ Done using the following queries:

It would be good if someone turned this into a regular job (e.g. once every month) because we already see new items popping up in the queries. I currently do not operate automated bots like that. @BrokenSegue, MisterSynergy: maybe you'd be interested. Vojtěch Dostál (talk) 21:10, 7 October 2022 (UTC)

Hi! @BrokenSegue, MisterSynergy, Vojtěch Dostál, ChristianKl, Fralambert, Hsarrazin: I think that in the majority of cases (more than 50%), ethnicity is not a contentious topic.

@Vladimir Alexiev: Sorry, I think we were not pinged about your message. IMO, the proof is that a reliable source states the ethnicity. As for the removal from wdtruthy statements, that is expected behavior (we would not want our users to consider these statements as reliable, hence the requirement for the reference) Vojtěch Dostál (talk) 09:56, 22 November 2022 (UTC)
Let me repeat myself: What do you consider a reliable source about ethnicity? Even the BG citizen register (ESGRAON) is not that, since it registers a birth in BG, but not what ethnicity the person self-identifies with. Vladimir Alexiev (talk) 13:03, 25 November 2022 (UTC)
Do you have any analysis or evidence of how many of the statements you removed were actually WRONG? Vladimir Alexiev (talk) 13:04, 25 November 2022 (UTC)
For most of the Wikis, we can't use a source from another Wiki as reliable. Why is this different for Wikidata? We should probably put an end to exporting data from other Wikis that makes this one less reliable. As for finding an external reliable source, this is possible. Like for this judge, or this singer. Fralambert (talk) 20:15, 25 November 2022 (UTC)