Shortcut: WD:RBOT

Wikidata:Bot requests

If you have a bot request, add a new section using the button and state exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the relevant WikiProject's talk page. Please refer to previous discussions justifying the task in your request.

For botflag requests, see Wikidata:Requests for permissions.

Tools available to all users that can be used to accomplish the work without the need for a bot:

  1. PetScan for creating items from Wikimedia pages and/or adding same statements to items
  2. QuickStatements for creating items and/or adding different statements to items
  3. Harvest Templates for importing statements from Wikimedia projects
  4. OpenRefine to import any type of data from tabular sources
  5. WikibaseJS-cli to write shell scripts to create and edit items in batch
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2021/05.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 2 days.

Import Treccani IDs

Request date: 6 February 2019, by: Epìdosis

Task description

At the moment we have four identifiers referring to http://www.treccani.it/: Treccani's Dizionario biografico degli italiani ID (P1986), Treccani ID (P3365), Treccani's Enciclopedia Italiana ID (P4223) and Treccani's Dizionario di Storia ID (P6404). Each article in these works has, in the right-hand column "ALTRI RISULTATI PER", links to the articles on the same topic in the other works (e.g. Ugolino della Gherardesca (Q706003) has Treccani ID (P3365) conte-ugolino, and http://www.treccani.it/enciclopedia/conte-ugolino/ also links to the Enciclopedia Italiana (Treccani's Enciclopedia Italiana ID (P4223)) and the Dizionario di Storia (Treccani's Dizionario di Storia ID (P6404))). These cases are extremely frequent: many items have Treccani's Dizionario biografico degli italiani ID (P1986) but not Treccani ID (P3365) or Treccani's Enciclopedia Italiana ID (P4223); others have Treccani ID (P3365) but not Treccani's Enciclopedia Italiana ID (P4223); and nearly no item has the recently created Treccani's Dizionario di Storia ID (P6404).

My request is: check each value of these identifiers in order to obtain values for the other three identifiers through the column "ALTRI RISULTATI PER".
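A rough Python sketch of the harvesting step: fetch a Treccani page and collect the links under the "ALTRI RISULTATI PER" heading. The HTML structure assumed here, and the per-work URL rules needed to map each link to P1986/P3365/P4223/P6404, are assumptions to be verified against the live site.

import requests
from bs4 import BeautifulSoup

def altri_risultati(url):
    # Locate the "ALTRI RISULTATI PER" heading in the page ...
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    heading = soup.find(string=lambda s: s and "ALTRI RISULTATI" in s)
    if heading is None:
        return []
    # ... and collect the first links that follow it. Mapping each link to
    # one of the four identifier properties still needs per-work rules.
    return [a["href"] for a in
            heading.find_parent().find_all_next("a", href=True, limit=10)]

altri_risultati("http://www.treccani.it/enciclopedia/conte-ugolino/")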

Discussion

Fix local dialing code (P473) wrongly inserted

Request date: 7 November 2019, by: Andyrom75

Task description

Several entities have a wrong value for local dialing code (P473) according to the format as a regular expression (P1793) specified on it: [\d\- ]+; as the property clarifies, characters such as ,/;()+ are excluded.

Two typical, easily identified kinds of wrong values are the following:

  1. local dialing code (P473) that includes the country calling code (P474) at the beginning
  2. local dialing code (P473) that includes the "optional" zero at the beginning
  • Case 1 can be detected by looking for "+"; when present, the prefix should be compared with the relevant country calling code (P474) and, if it matches, removed
  • Case 2 can be detected by looking for "(" and ")" with zeros inside; if matched, the parenthesized zero should be removed (see the sketch below)
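A minimal Python sketch of the two checks, assuming the item's country calling code (P474) is available as a string such as "+39"; the function name is hypothetical.

import re

def clean_local_dialing_code(value, country_code):
    # Case 1: strip a leading country calling code such as "+39".
    if value.startswith("+"):
        if country_code and value.startswith(country_code):
            value = value[len(country_code):].lstrip(" -")
        else:
            return None  # "+" present but not matching P474: needs review
    # Case 2: strip an "optional zero" written as "(0)".
    value = re.sub(r"\(0+\)\s*", "", value).strip()
    # Only return values that now satisfy the P1793 pattern [\d\- ]+.
    return value if re.fullmatch(r"[\d\- ]+", value) else None

assert clean_local_dialing_code("+39 02", "+39") == "02"
assert clean_local_dialing_code("(0) 30", "+49") == "30"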
Discussion
Request process

weekly import of new articles (periodic data import)

To keep Wikidata from getting stale, it would be interesting to import new papers on a weekly basis, maybe with a one-week delay, for repositories where this can be done.

@Daniel Mietchen: --- Jura 12:16, 7 August 2019 (UTC)

I'd certainly like to see this tested, e.g. for these two use cases:
  1. https://www.ncbi.nlm.nih.gov/pubmed/?term=zika
  2. all of PubMed Central, i.e. articles having a PMCID (P932), which point to a full text available from PMC.
--Daniel Mietchen (talk) 03:08, 23 August 2019 (UTC)
The disadvantage of skipping some might be that one wouldn't know if it's complete or not. --- Jura 17:00, 25 August 2019 (UTC)
  • Still good to have. --- Jura 21:19, 26 March 2020 (UTC)
@Jura1: Is it fine to import all articles weekly? It seems there will be about 31K articles every week. --Kanashimi (talk) 10:16, 23 June 2020 (UTC)
  • Yes, I still think that would be useful. --- Jura 11:58, 23 June 2020 (UTC)

@Daniel Mietchen: This seems like a WikiCite request; where do you think we should discuss this? I'm inclined to close the request here and discuss implementation elsewhere (somewhere it may attract more attention). Vojtěch Dostál (talk) 14:08, 9 November 2020 (UTC)

  • @GZWDer: might be doing some of it. If it's something that can or should be done, I don't think the request should be removed from here. --- Jura 14:20, 9 November 2020 (UTC)

Bringing SourceMD fully back online would be good. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:05, 11 November 2020 (UTC)

@Jura1, Daniel Mietchen, Pigsonthewing, Vojtěch Dostál: Maybe I can do this. I just wonder if we should import all articles. --Kanashimi (talk) 00:29, 18 November 2020 (UTC)

@Kanashimi: We definitely should :) If we want to "create a large open bibliographic database within Wikidata" (see Wikidata:WikiProject Source MetaData), we don't have a choice. Vojtěch Dostál (talk) 19:51, 22 November 2020 (UTC)
I think "new" is different from "all". Anyways, the above suggests a one week delay, but maybe the optimal one is different. --- Jura 08:26, 24 November 2020 (UTC)

@Jura1, Daniel Mietchen, Pigsonthewing, Vojtěch Dostál: Hi, I have made a new request at Wikidata:Requests for permissions/Bot/Cewbot 4. Please give me some suggestions, thank you. --Kanashimi (talk) 03:54, 27 November 2020 (UTC)

@Daniel Mietchen, Jura1: Is there a model entity so I can know what to import? --Kanashimi (talk) 04:51, 27 November 2020 (UTC)

Periodic update of identifiers' alphabetic sorting

Request date: 6 April 2020, by: Epìdosis

Link to discussions justifying the request
Task description

An admin-bot (as MediaWiki:Wikibase-SortedProperties is protected) should periodically (e.g. every week):

  • consult the list of all properties
  • choose only the properties with "external-ID" datatype
  • exclude all the properties which are present in one of the sections before "IDs with type "external-id" - alphabetical order"
  • edit the section "IDs with type "external-id" - alphabetical order", inserting a list of all the remaining properties in the following format (a sketch of the job follows below):

* Pnumber (English label)
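A minimal Python sketch of the periodic job, assuming `already_listed` holds the property IDs already present in the manually curated sections of MediaWiki:Wikibase-SortedProperties; posting the result would still need an admin-bot account.

import requests

QUERY = """
SELECT ?p ?pLabel WHERE {
  ?p wikibase:propertyType wikibase:ExternalId .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

def alphabetical_section(already_listed):
    r = requests.get("https://query.wikidata.org/sparql",
                     params={"format": "json", "query": QUERY}, timeout=60)
    rows = r.json()["results"]["bindings"]
    rows.sort(key=lambda row: row["pLabel"]["value"].casefold())
    lines = []
    for row in rows:
        pid = row["p"]["value"].rsplit("/", 1)[-1]  # e.g. "P1986"
        if pid not in already_listed:
            lines.append("* %s (%s)" % (pid, row["pLabel"]["value"]))
    return "\n".join(lines)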

Discussion


Request process

fix ALLCAPS of items imported from MIC

Request date: 5 October 2020, by: Vladimir Alexiev

Link to discussions justifying the request

https://www.wikidata.org/wiki/Topic:Vv1zfojnvvo11oj8 initiated by @Jura1:

Task description

I have imported a bunch of items with MIC market code (P7534) (stock exchanges and the like), see https://editgroups.toolforge.org/b/OR/ab49ffaac2/.

Some of them come with ALLCAPS names or descriptions, so they are listed at https://www.wikidata.org/wiki/Wikidata:Database_reports/Complex_constraint_violations/P7534.

Can someone help with fixing the names and descriptions to "Title Case"? (I thought descriptions should be in "Sentence case", but very often they also contain the exchange name.)

Please note that prepositions should be in lower case, e.g. "BOLSA DE COMERCIO DE SANTA FE" should become "Bolsa de Comercio de Santa Fe". A sketch of such a fix follows.
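A minimal Python sketch, assuming a dictionary of ordinary lowercase words is available to decide which ALL-CAPS tokens are plain words rather than acronyms (a constraint raised further below); the stoplist of prepositions/articles is illustrative, not complete.

LOWERCASE_WORDS = {"de", "del", "la", "of", "and", "the", "y", "e", "da"}

def fix_allcaps(label, dictionary):
    words = []
    for i, w in enumerate(label.split()):
        lower = w.lower()
        if not w.isupper() or lower not in dictionary:
            words.append(w)          # keep acronyms such as NASDAQ as-is
        elif i > 0 and lower in LOWERCASE_WORDS:
            words.append(lower)      # prepositions/articles stay lower case
        else:
            words.append(w.capitalize())
    return " ".join(words)

# fix_allcaps("BOLSA DE COMERCIO DE SANTA FE", spanish_words)
# -> "Bolsa de Comercio de Santa Fe"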

Licence of data to import (if relevant)
Discussion
  • The linked constraint page lists 10 items with all-caps labels and 10 items with all-caps descriptions. You should fix this small number by hand as writing a bot for this would take hours and the correct (automatic) handling of prepositions is difficult. --Pyfisch (talk) 10:14, 5 October 2020 (UTC)
  • 10 is just a selection. There are many more. --- Jura 10:18, 14 October 2020 (UTC)

The task is more difficult because there are many acronyms that must be left as-is (e.g. APA, OTF, OTP, NASDAQ, STOXX), so the bot should only change (capitalize) usual words found in a dictionary --Vladimir Alexiev (talk) 02:49, 11 December 2020 (UTC)

Request process

Year-qualifier for "students count" (P2196) values

SELECT DISTINCT ?item ?itemLabel ?sl
{
  ?item wdt:P2196 ?value .
  FILTER NOT EXISTS { ?item p:P2196 / pq:P585 [] }
  ?item wikibase:sitelinks ?sl .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


The items above have student counts, but no point in time (P585) qualifier (currently 2440 of 56228 items with the property). It would be good to find a way to add the year to these.

I noticed some come from dewiki infoboxes which don't include a year either. --- Jura 06:34, 29 November 2020 (UTC)

So @Jura1: where can we get the years from? Also, the way the query is written doesn't show the full extent of the problem: if a university has one claim with a year and 100 without, you won't count it --Vladimir Alexiev (talk) 02:55, 11 December 2020 (UTC)

I'm not sure about a possible source. Maybe another language Wikipedia infobox, Wikipedia article text or an external source.
Agree that more statements might need completion, but the above are most in need of it. --- Jura 07:16, 11 December 2020 (UTC)

Modules and templates: sync with source

To make a template work on Wikidata, I imported a series of modules from frwiki. See list at Topic:Vyn971t0krpm0n8m.

It would be helpful if there was a way to have these periodically checked for updates (daily or weekly?) and, if there is an update, re-imported from source.

Ideally, I suppose the modules here would be semi or fully protected.

Maybe we could define a badge for use on sitelinks to indicate the source and automatically sync copies. Sample: at Q14920430 the source is frwiki, and the copy at wikidatawiki would be updated automatically. --- Jura 08:33, 29 November 2020 (UTC)

I mentioned that last point at Wikidata:Project_chat#New_badges_for_templates_and_modules. --- Jura 08:53, 29 November 2020 (UTC)

National Football League identifier

Request date: 3 December 2020, by: Sismarinho

NFL.com ID (former scheme) (P3539)
  • (Originally in French.) Hello, there is a problem with the NFL identifiers on many entries since the architecture of the NFL site changed. It is now the person's name: for example, for Bronko Nagurski (Q927663) it is bronko-nagurski. Can a bot handle this request?
Task description
Licence of data to import (if relevant)
Discussion
  • Please propose a new property for the new scheme. Once done, this could be filled by bot or in some other way. --- Jura 07:45, 3 December 2020 (UTC)
Request process

Move publisher from qualifier to reference

Request date: 3 December 2020, by: 4ing

Link to discussions justifying the request
Task description

For several items of municipality of the Netherlands (Q2039348), publisher (P123) has erroneously been added as a qualifier to population (P1082); it should be moved to the reference. In addition, some references include point in time (P585), which should be converted to a qualifier. And finally, the most recent value for population (P1082) should be set to preferred rank. See Vlissingen (Q10084) (relevant version) as an example.

Discussion


Request process

Preferred rank for areas of German municipalities

Request date: 3 December 2020, by: 4ing

Link to discussions justifying the request
Task description

Most items for urban municipality of Germany (Q42744322) and municipality of Germany (Q262166) have two values for area (P2046): one imported from other Wikimedia projects or DBpedia (Q465), and one with DESTATIS (Q764739) as reference (incl. reference URL (P854), title (P1476), archive URL (P1065), etc.). The latter also has point in time (P585) as a qualifier. This latter value should be given preferred rank if not already done manually. Example: Borkum (Q25082).
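A minimal pywikibot sketch of the rank change, assuming a value counts as the DESTATIS one when any of its references has stated in (P248) = DESTATIS (Q764739), and that Claim.changeRank behaves as in recent pywikibot versions; the function name is hypothetical.

import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

def prefer_destatis_area(qid):
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    for claim in item.claims.get("P2046", []):
        targets = [c.getTarget() for src in claim.sources
                   for c in src.get("P248", [])]
        if any(t and t.id == "Q764739" for t in targets) \
                and claim.rank == "normal":
            claim.changeRank("preferred")  # saves the new rank immediately

prefer_destatis_area("Q25082")  # Borkum, the example above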

Discussion


Request process

Replace imported from Wikimedia project (P143)=Minor Planet Center (Q522039)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q522039 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 48000 statements --- Jura 09:57, 6 December 2020 (UTC)
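A hedged pywikibot sketch of the replacement: it rewrites such references to stated in (P248) pointing at NEW_TARGET, a placeholder for whichever item (e.g. the MPC database rather than the organization) is eventually chosen. Removing and re-adding the remaining snaks of a reference may need adjustment depending on the pywikibot version; the same sketch applies to the similar sections below with different QIDs.

import pywikibot

NEW_TARGET = "Q..."  # placeholder: the catalogue/database item to cite

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

def fix_sources(claim):
    for source in list(claim.sources):
        p143 = source.get("P143", [])
        if not any(c.getTarget() and c.getTarget().id == "Q522039"
                   for c in p143):
            continue
        # Remove the whole old reference ...
        claim.removeSources([c for snaks in source.values() for c in snaks])
        # ... and re-add it with stated in (P248) instead of P143,
        # keeping the other snaks (retrieved, reference URL, ...).
        stated_in = pywikibot.Claim(repo, "P248", is_reference=True)
        stated_in.setTarget(pywikibot.ItemPage(repo, NEW_TARGET))
        kept = [c for pid, snaks in source.items() if pid != "P143"
                for c in snaks]
        claim.addSources([stated_in] + kept)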


Replace imported from Wikimedia project (P143)=Historic England (Q19604421)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q19604421 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 4300 statements --- Jura 09:57, 6 December 2020 (UTC)

Replace imported from Wikimedia project (P143)=Terrassa Museum (Q4894452)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q4894452 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,ca". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 5100 statements --- Jura 09:57, 6 December 2020 (UTC)

@ESM: Can you fix your batch from 4 years ago? I'm looking at Q27651146 and in addition to the above, I see the following problems:

  • there's neither link nor image. If I cannot SEE this chair, what is the value of having this record in WD?
  • stupid title "Cadira inv 15283". How can I determine if I'm interested in this item from its number alone?
  • it IS a chair, it does not DEPICT a chair
  • made of Tissue? Doubt that very much since the creator is not Dr Hannibal

We don't need dumps of defective museum catalog exports in WD! Let's not turn the Sum of All Paintings idea into Sum of All Junk. Fix your stuff or I request a bot to delete all these items --Vladimir Alexiev (talk) 03:11, 11 December 2020 (UTC)

@Vladimir Alexiev: I'd have appreciated you being a bit less harsh. Will try mending things when I have time to do so. Feel free to do whatever you please with the data you don't like. --ESM (talk) 08:53, 14 December 2020 (UTC)
  • @ESM, Vladimir Alexiev: the item is fairly complete (I wish more were like that). There is no requirement that there be a link to a website, but a link to the publication used to source this could be helpful (see @MisterSynergy:'s comment below). "material used"="tissue" could use an "applies to part" qualifier and another item for "tissue" (currently: Q40397). depicts (P180) isn't useful if instance of (P31)="chair" is correct. I can help fix them. --- Jura 12:27, 14 December 2020 (UTC)
  • removed 29 depicts statements found with [1]. Also, about the labels: I'm not really convinced that "cadira" would be better than the current "Cadira inv 15283". --- Jura 12:36, 14 December 2020 (UTC)
  • @ESM: Sorry, guess I had a bad day.
  • @Jura1: title="chair"@en, descr="Terrassa museum, inventory number 15283" --Vladimir Alexiev (talk) 20:53, 15 December 2020 (UTC)

I can offer to make the edits as I already have bot code available that moves from imported from Wikimedia project (P143) to stated in (P248) and also changes the value item from Terrassa Museum (Q4894452) to something else. However, I do not want to create museum catalog items by myself as that often requires expertise that I do not have. So if you can set up such a museum catalog item, I can make the replacements in all the items efficiently—a task that would otherwise be quite difficult. —MisterSynergy (talk) 10:26, 14 December 2020 (UTC)

@MisterSynergy: Thank you very much for your offer, and please excuse my long silence. Feel free to move all the references that use imported from Wikimedia project (P143) to stated in (P248). I'm not sure if we should change the value from Terrassa Museum (Q4894452) to something else like "Catalog of Museu de Terrassa's collection", which would have instance of (P31) = collection catalog (Q5146094). @Jura1: what do you think about this? The data (in this case and similar ones you spotted too) come from a database dump of the museum's collections management system, so I'm afraid there's no URL or similar tangible element to point to as a source.
On the other hand, I'm aware of mistakes such as using tissue (Q40397) instead of woven fabric (Q1314278). They originate in translation mistakes, since both words in my language are written the same way, and I failed to check the Qs before uploading the statements. I'm sorry about that and would like to mend it, even though I'm struggling to find the time to do so. --ESM (talk) 17:18, 15 February 2021 (UTC)

Replace imported from Wikimedia project (P143)=Municipal Institute of Museums of Reus (Q23687366)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q23687366 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 5800 statements --- Jura 09:57, 6 December 2020 (UTC)

  • @ESM: similar to the previous one, would you make an item we could use as value for stated in (P248)? --- Jura 13:25, 18 December 2020 (UTC)

Replace imported from Wikimedia project (P143)=Landesfilmsammlung Baden-Württemberg (Q24469969), Haus des Dokumentarfilms (Q1590879)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q24469969 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q1590879 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The two queries currently find ca. 7800 statements --- Jura 09:57, 6 December 2020 (UTC)


Replace imported from Wikimedia project (P143)=Museu d'Art Jaume Morera (Q5476145)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q5476145 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 7800 statements --- Jura 09:57, 6 December 2020 (UTC)


Replace imported from Wikimedia project (P143)=Istat (Q214195)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q214195 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 8000 statements --- Jura 09:57, 6 December 2020 (UTC)

Replace imported from Wikimedia project (P143)=Historical Commission of the Bavarian Academy of Sciences (Q1419226)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q1419226 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 8400 statements --- Jura 09:57, 6 December 2020 (UTC)

Replace imported from Wikimedia project (P143)=Natural History Museum (Q309388)

SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?st
WHERE
{
  ?st prov:wasDerivedFrom/pr:P143 wd:Q309388 .
  hint:Prior hint:rangeSafe true .
  ?item ?p ?st .
  ?prop wikibase:claim ?p ; wikibase:statementProperty ?ps .
  ?st ?ps ?value 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100


As imported from Wikimedia project (P143) should only be used with WMF projects, the above should be replaced with some other property, e.g. "stated in" or "publisher", and possibly a different value. The query currently finds ca. 23000 statements --- Jura 09:57, 6 December 2020 (UTC)


replace reference URL (P854) = Petscan

SELECT *
WHERE
{
  hint:Query hint:optimizer "None".
  ?ref pr:P854 ?value .
  FILTER( REGEX( STR( ?value ), "petscan" )  )
  ?statement prov:wasDerivedFrom ?ref;
}
LIMIT 200


reference URL (P854) could be replaced with Wikimedia import URL (P4656). --- Jura 12:35, 6 December 2020 (UTC)

Sample edit [2]. Not sure how to do it with wikibase-cli --- Jura 13:40, 18 December 2020 (UTC)

Create person items from Wikisource entries (matr. Oxonienses)

SELECT ?item ?itemLabel ?itemDescription
{
	?item wdt:P1433 wd:Q19036877 . 
	FILTER NOT EXISTS { ?item wdt:P921 [] }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}


For the above, it could be interesting to create an item for each person without main subject (P921) (currently 27400).

Sample: Everest, Robert (Q94820208) with s:Everest,_Robert → Robert Everest (Q104057081).

for info: @Miraclepine, Charles Matthews: --- Jura 19:20, 9 December 2020 (UTC)

Of course I have wondered about this. I think the proportion of people there who are not really notable would be at least 50%. That isn't a definitive argument, but I regard having to sift through numerous country vicars to do a disambiguation run as fairly undesirable. It would be rather better to have them in mix'n'match by some device. That would correspond to what has gone in with Cambridge alumni. Charles Matthews (talk) 19:39, 9 December 2020 (UTC)
  • I tried to find a way to use Mix'n'match for Wikisource entries, but people seemed to think that it's not desirable.
Maybe some filtering should be done beforehand, but it shouldn't be too complex to identify duplicates based on YOB and name once the items are created. --- Jura 19:44, 9 December 2020 (UTC)
  • Maybe ORCID is the better comparison in terms of notability. --- Jura 13:30, 18 December 2020 (UTC)

Cleaning of streaming media service URLs

Request date: 12 December 2020, by: Swicher

I'm not sure if this is the best place to propose it, but when reviewing the URLs of a query with this script:

import requests
from concurrent.futures import ThreadPoolExecutor

# Checks the link of an item, if it is down then saves it in the variable "novalid"
def check_url_item(item):
    # Some sites may return error if a browser useragent is not indicated
    useragent = 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77'
    item_url = item["url"]["value"]
    print("Checking %s" % item_url, end="\r")
    req = requests.head(item_url, headers = {'User-Agent': useragent}, allow_redirects = True, timeout = 30)
    if req.status_code == 404:
        print("The url %s in the element %s returned error" % (item_url, item["item"]["value"]))
        novalid.append(item)

base_query = """SELECT DISTINCT ?item ?url ?value
{
%s
  BIND(IF(ISBLANK(?dbvalue), "", ?dbvalue) AS ?value)
  BIND(REPLACE(?dbvalue, '(^.*)', ?url_format) AS ?url)
}"""
union_template = """  {{
    ?item p:{0} ?statement .
    OPTIONAL {{ ?statement ps:{0} ?dbvalue }}
    wd:{0} wdt:P1630 ?url_format.
  }}"""
properties = [
    "P2942", #Dailymotion channel
    "P6466", #Hulu movies
    "P6467", #Hulu series
]
# Items with links that return errors will be saved here
novalid = []

query = base_query % "\n  UNION\n".join([union_template.format(prop) for prop in properties])
req = requests.get('https://query.wikidata.org/sparql', params = {'format': 'json', 'query': query})
data = req.json()

# Schedule and run 25 checks concurrently; list() forces completion and
# surfaces any exceptions raised in the worker threads
with ThreadPoolExecutor(max_workers=25) as check_pool:
    list(check_pool.map(check_url_item, data["results"]["bindings"]))

I have noticed that almost half are invalid. I don't know whether in these cases it is better to delete or archive them, but a bot should perform this task periodically, since the catalogs of streaming services tend to be very changeable (probably many of these broken links are due to movies/series whose license was not renewed). Unfortunately I could only include Hulu and Dailymotion, since the rest of the services have the following problems:

For those sites it is necessary to perform a more specialized check than a HEAD request (like using youtube-dl (Q28401317) for YouTube).

In the case of Hulu I have also noticed that some items can have two valid values in Hulu movie ID (P6466) and Hulu series ID (P6467) (see for example The Tower of Druaga (Q32256)) so you should take that into account when cleaning links.

Request process

Add info about subject to items with generic title "obituary"

SELECT DISTINCT ?item ?itemLabel ?itemDescription ?pubvenueLabel
{
	?item wdt:P31 wd:Q13442814 . 
    { ?item rdfs:label "OBITUARY"@en } UNION { ?item rdfs:label "Obituary"@en }
	FILTER NOT EXISTS { ?item wdt:P921 [] }
    OPTIONAL { ?item wdt:P1433 ?pubvenue }
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}


It seems we have more than 2500 items where we lack info about the person (P921).

It would be helpful if the name and possibly the lifespan of the person could be added to the item's description or elsewhere. --- Jura 11:33, 18 December 2020 (UTC)

Trailing space (" ") in labels

Somehow I thought it wasn't possible, but the Italian label at "Diva" had, and at Ariodante (Q22813616) still has, a trailing space. The edits are from 2015/2016. I think it would be good to clean this up. Not sure what would be the most efficient way to find them. --- Jura 17:48, 19 December 2020 (UTC)


Bulk create items for given names in Russian (Cyrillic script)

SELECT ?item ?itemLabel ?itemDescription ?itemAltLabel
{
	?item wdt:P31/ wdt:P279? wd:Q202444 . 
	?item wdt:P282 wd:Q8209 . 
	?item wdt:P407 wd:Q7737 .
	FILTER NOT EXISTS { ?item wdt:P282 wd:Q8229 }  
	SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,en" }
}


Sample item: Q104431130

Currently the above only finds some 310 items (or 286 if one excludes the ones that incorrectly mix them with Latin script given name items).

A few more might be available, but incomplete.

I think it would be interesting to have a more complete dataset available. --- Jura 23:03, 22 December 2020 (UTC)

Replace pr:P1343 with pr:P248

SELECT ?item ?itemLabel ?value ?valueLabel ?statement
WHERE
{
	{
		SELECT DISTINCT ?item ?value ?statement
		WHERE
		{
			?ref pr:P1343 ?value .
			?statement prov:wasDerivedFrom ?ref .
			?item ?p ?statement .
		}
	} .
	FILTER( ?item NOT IN ( wd:Q4115189, wd:Q13406268, wd:Q15397819 ) ) .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}


The query finds some 3570 references using described by source (P1343). As Epìdosis noted on Property_talk:P1343#Use_in_references, stated in (P248) would generally be the property to use. --- Jura 08:29, 23 December 2020 (UTC)

Thanks Jura. I had reported the problem directly to @Ladsgroup: who had fixed some of them with his bot, but evidently there are some more needing intervention. --Epìdosis 09:16, 23 December 2020 (UTC)
Yeah, I have been cleaning it up for a while now, I just need to constantly re-run it. Hopefully, it'll be done soon Amir (talk) 06:23, 24 December 2020 (UTC)
If new ones come up in bulk, the relevant user (or bot operator) should be advised.
Trying to figure out where they came from, I found https://www.wikidata.org/w/index.php?title=Q102075911 but the redirect was deleted. @Epìdosis: Why that?
Anyways, maybe Krbot could autofix occasional ones going forward. @Ivan_A._Krestinin: what do you think? --- Jura 18:50, 28 December 2020 (UTC)
Regarding Q102075911: according to Wikidata:Requests for comment/Redirect vs. deletion, "Deleting is however appropriate if an item has not been existed longer than 24 hours and if it's clear that it's not in use elsewhere."
I surely support the autofix and I thank again @Ladsgroup: for the cleaning! --Epìdosis 20:04, 28 December 2020 (UTC)
Very strange that RFC. Makes me wonder how User:Stryn determined the "consensus". --- Jura 08:58, 29 December 2020 (UTC)

Ontario public school contact info

Request date: 27 December 2020, by: Jtm-lis

Link to discussions justifying the request
Task description

https://www.wikidata.org/wiki/Wikidata:Dataset_Imports/_Ontario_public_school_contact_information

Licence of data to import (if relevant)
Discussion

Untitled requests about Wikinews

Request process

Request date: 29 December 2020, by: NMaia

Link to discussions justifying the request
Task description

For Wikinews article (Q17633526) entries, it would also be useful to add:

Licence of data to import (if relevant)
Discussion

NMaia (talk) 14:14, 29 December 2020 (UTC)

Request process

Malformed entries related to "Swiss National Sound Archives ID" (P6770)

SELECT ?item ?itemLabel ?qid 
{
  ?item wdt:P6770 ?value ; wdt:P31 wd:Q5 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en"  }
  BIND(xsd:integer ( strafter(str(?item), "Q")) as ?qid) 
  FILTER( ?qid > 	64000000 )
}


The above finds approx. 1400 items created some time ago. Many invert the family name and given names.

Samples: Q65032428, Q65035029. Some may have been partially fixed since 2019.

@AlessioMela: --- Jura 22:54, 29 December 2020 (UTC)

Unfortunately, there is no way for a bot to know whether the firstname/lastname is twisted or not. I can write a script to turn them all around, but everything that got fixed manually since 2019 will get twisted again. Theoretically, the script could browse through the history, parsing it to see if the en label got edited since the item was created; if someone can write a script like that, it might make sense to fix this by bot. Edoderoo (talk) 14:47, 30 December 2020 (UTC)
For the two samples above, it's fairly obvious that they are inverted.
It seems that P6770 generally provides a fairly straightforward format: "SURNAME, Given name". Maybe it could be checked against this or some other source.
@AlessioMela: can you check the data you have and fix it from that? --- Jura 11:15, 31 December 2020 (UTC)
Hi all, yes I can confirm the problem. As Edoderoo said, we can't act automatically. Even in the raw data there wasn't anything better. I think the major part of the items with P6770 have the correct name-familyname order. Unfortunately there was a batch of inverted names that I didn't recognize during the initial test and during the bot run. --AlessioMela (talk) 14:55, 31 December 2020 (UTC)
@AlessioMela:, if you have the data available, can you add named as (P1810) qualifiers to P6770? If not, can you try to identify the batch with inverted names? There are just too many that are the wrong way round. --- Jura 14:59, 31 December 2020 (UTC)
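A small Python sketch of the check suggested above, assuming the source data is available as "SURNAME, Given name" strings keyed by the P6770 value; the helper names are hypothetical.

def split_entry(entry):
    surname, given = [p.strip() for p in entry.split(",", 1)]
    return surname, given

def looks_inverted(label, entry):
    # True if the item label currently reads "Surname Givenname".
    surname, given = split_entry(entry)
    return label.casefold() == f"{surname} {given}".casefold()

def corrected_label(entry):
    surname, given = split_entry(entry)
    return f"{given} {surname.title()}"

# looks_inverted("Rossi Mario", "ROSSI, Mario") -> True
# corrected_label("ROSSI, Mario")               -> "Mario Rossi"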

Accademia delle Scienze di Torino multiple references

Request date: 30 December 2020, by: Epìdosis

Link to discussions justifying the request
Task description

Given the following query:

SELECT DISTINCT ?item
WHERE {
  ?item wdt:P8153 ?ast .
  ?item p:P570 ?statement.
  ?reference1 pr:P248 wd:Q2822396.
  ?reference2 pr:P248 wd:Q2822396.
  ?statement prov:wasDerivedFrom ?reference1.
  ?statement prov:wasDerivedFrom ?reference2.
  FILTER (?reference1 != ?reference2)
}


In many items there are multiple references to date of death (P570) referring to Academy of Sciences of Turin (Q2822396)=Accademia delle Scienze di Torino ID (P8153). Cases:

  1. three references: maintain the first (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+named as (P1810)), delete the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the third (stated in (P248)+retrieved (P813)), transferring the retrieved (P813) to the maintained reference
    1. three references bis: if the first is stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+named as (P1810)+retrieved (P813), the second and the third simply get deleted
    2. three references ter: if there is a reference with reference URL (P854) containing the string "accademiadellescienze", it should be deleted; maintain the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the third (stated in (P248)+retrieved (P813)), transferring the retrieved (P813) to the maintained reference
  2. two references: maintain the first (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)), delete the second (stated in (P248)+retrieved (P813)), transferring the retrieved (P813) to the maintained reference

Repeat the above query substituting date of birth (P569) for date of death (P570). Cases:

  1. two references: maintain the first (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+named as (P1810)), delete the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+retrieved (P813)), transferring the retrieved (P813) to the first
    1. two references bis: if the first is stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+named as (P1810)+retrieved (P813), the second simply gets deleted
    2. two references ter: if there is a reference with reference URL (P854) containing the string "accademiadellescienze", it should be deleted; maintain the second (stated in (P248)+Accademia delle Scienze di Torino ID (P8153)+retrieved (P813))
Discussion

@Ladsgroup: as his bot is probably ready for doing this. --Epìdosis 11:56, 30 December 2020 (UTC)

Request process

request to import podcast identifiers (2021-01-03)

Request date: 3 January 2021, by: Sdkb

Link to discussions justifying the request
Task description

Several properties for podcast identifiers have recently been created (see e.g. Castbox show ID (P9005)), which are being used by the new w:Template:Podcast platform links on Wikipedia. I was told to come here to get help importing the identifiers for a bunch of podcast items.

Licence of data to import (if relevant)
Discussion


Request process

request to remove "(täsmennyssivu)" and "(sukunimi)" from labels (2021-01-07)

Request date: 7 January 2021, by: 87.95.206.253

Link to discussions justifying the request
Task description

Please remove (täsmennyssivu) and (sukunimi) from Finnish-language primary labels. Either remove them completely or move them to the alias ("also known as") sections.
Reason: "(täsmennyssivu)" is fi-wiki's equivalent of en-wiki's "(disambiguation)", and "(sukunimi)" of "(surname)". Those shouldn't be added to primary labels. I've notified two bot operators concerning additions of "(täsmennyssivu)" to labels: special:diff/1336656747 and Topic:W131u7lu86s1qpxl. Thanks!
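A minimal pywikibot sketch of the cleanup; the function name and edit summaries are placeholders.

import re
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

SUFFIX = re.compile(r"\s*\((täsmennyssivu|sukunimi)\)$")

def clean_fi_label(item):
    item.get()
    label = item.labels.get("fi")
    if not label or not SUFFIX.search(label):
        return
    cleaned = SUFFIX.sub("", label)
    aliases = item.aliases.get("fi", [])
    if label not in aliases:
        aliases.append(label)  # keep the old form as an alias
    item.editLabels({"fi": cleaned}, summary="remove Finnish disambiguator")
    item.editAliases({"fi": aliases}, summary="move old label to aliases")

# clean_fi_label(pywikibot.ItemPage(repo, "Q123456"))  # hypothetical QID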

Licence of data to import (if relevant)
Discussion


Request process

Fix social media account inconsistencies (2021-01-09)

Request date: 9 January 2021, by: AntisocialRyan

Task description
Discussion

As the person who created Q87406427 for exactly this purpose, I believe it should be end cause (P1534) since an account being suspended does not make the statement invalid or deprecated. It just has an end time for when it stopped being true (if my Twitter account is suspended tomorrow, it was still my Twitter account from the point of creation until then). --SilentSpike (talk) 22:28, 4 February 2021 (UTC)

Request process

Admin bot for deletion of 100k non-notable items

Request date: 19 January 2021, by: Epìdosis

Link to discussions justifying the request
Task description

Bot-deletion of the following items:

Discussion

I ping @Ladsgroup: and @MisterSynergy: as I know they have admin-bots. Thanks in advance, --Epìdosis 16:50, 19 January 2021 (UTC)

  • Not sure whether we should delete those at all. You can find plenty of similar datasets with *very* limited obvious use for our project. I'd say the items do meet the notability policy, so a much clearer consensus should be reached IMO, and it should be clear how this consensus relates to other similar situations. We could probably delete millions of items for the same reason, but do we want to? —MisterSynergy (talk) 17:02, 19 January 2021 (UTC)
    @MisterSynergy: I know that the situation is unclear and that it regards tens of thousands of items, maybe more. For this reason I have opened a general discussion (the third link above) and I've waited for a week with no feedback, whilst in the Project chat (the first link above) there seemed to be wide consensus for deletion. If you want to notify the discussion in other pages, or open a RfC or whatever I obviously support this, as I perfectly agree about the necessity to reach a clear conclusion on this point. Thanks as always, --Epìdosis 17:18, 19 January 2021 (UTC)
    Well, I don't find it that clear. There are quite some complaints about the bot operator not having their import approved as a separate bot task, but User:GZWDer is right that this is not explicitly required anywhere. In general, we have never managed to cleanly define batch editing and its distinction from bot editing in our policies, and we never managed to update the bot policy in a way that suits Wikidata. It is still based on experiences made with bots in Wikipedias before Wikidata was launched, although this project relies on automated (bot) editing *much* more than Wikipedias do.
    To me, the discussions linked above seem to be fueled by the aversion to User:GZWDer that many users seem to feel. To be quite honest, I also do not like their behavior in most cases, as they aggressively edit in the gray area of our policies, and they are not very open to input from other users. This is genuinely a problem in a collaborative project, but as long as they do not clearly violate policies, which is not the case here in my opinion, I do not see a reason to suppress their contributions; please also mind that in my opinion Wikidata:Deletion policy does not allow making use of the deletion tool here (I do consider these items notable according to WD:N).
    So, I think we should instead try to improve the policies so that this does not happen again, rather than set a precedent for a sympathy-based use of the deletion tool. —MisterSynergy (talk) 20:01, 19 January 2021 (UTC)
    @MisterSynergy: OK, I perfectly agree about the need to update our bot editing policy in order to avoid post factum discussions about large amounts of edits. However, my point isn't about GZWDer's conduct, but about the fact that, according to my interpretation, these items should be deleted because they don't respect WD:N. The discussion I opened was an attempt to find consensus about whether or not they respect WD:N, and no user contested my interpretation that they didn't respect it, so I have also edited Help:Sources accordingly. I think that two separate discussions would then be useful: one about bot policies; the other, already open but deserted as of now, about the possibility for encyclopedia articles not having Wikisource sitelinks to fit WD:N (about which I am personally skeptical). --Epìdosis 21:45, 19 January 2021 (UTC)
    With the DOI claims, there is no doubt that they do meet the notability requirements. There also seem to be valid references on all (or at least most) of the claims. Notability is not an issue here; a deletion based on a not-notable claim would be completely at odds with our standard practice. —MisterSynergy (talk) 21:58, 19 January 2021 (UTC)
    @MisterSynergy: I partially disagree about their notability for one reason: as in these cases the DOI (P356) in fact coincides with the respective identifiers present in the items of the subjects of these articles, I agree with what @Bovlb: said in the Project chat: "Unless we're going to get a lot more information on these items, it seems to me that this sort of import would be better embodied in an identifier property." In general, my position is that the notability of items containing DOI (P356) can be taken for sure unless the DOI (P356) overlaps with an existing Wikidata property. I would prefer having a brief discussion somewhere about whether any item having DOI (P356) is notable, in order to finally add a statement DOI (P356) instance of (P31) Wikidata property for an identifier that suggests notability (Q62589316) and reach a general conclusion about this point. --Epìdosis 22:14, 19 January 2021 (UTC) P.S. As I'm not an expert on copyright, just a little confirmation: importing this sort of bibliographic metadata is CC0-compliant, isn't it?
    Well, the Benezit ID (P2843) identifier and the DOI are not identical. The identifier property was poorly managed in the past on Wikidata, but apparently mistakes/poor decisions have been made on the side of the external database as well which sort of contributed to the mess on the property page. The DOIs identify the encyclopedia articles, and the Benezit ID (P2843) identifiers identify the persons described in the articles. Of course, the URLs should *not* be identical, but Benezit ID (P2843) unfortunately uses DOI urls since March 2020 (Special:Diff/1130357608). The formatter URL should instead point to the URL which the DOI resolves to—unfortunately the identifiers would have to be changed as well then (i.e. rather make a new property and reset Benezit ID (P2843) to the pre-March 2020 state).
    The amount of content which is available in the items is not a relevant factor. There is no rule that there should be a "lot more information" available about something in order to be admissible here. —MisterSynergy (talk) 23:21, 19 January 2021 (UTC)

OK, thanks @MisterSynergy: for all the answers. As of now, it is quite clear that this problem certainly needs further discussion on other pages, and I'm now convinced that the deletion is probably not to be performed anyway, as these items are notable because of DOI (P356) with little doubt. We can close, at least for now, this bot request. Thanks again and good night, --Epìdosis 23:29, 19 January 2021 (UTC)

Request process

request to fix labels of humans - disambiguator (2021-01-24)

English labels for humans shouldn't end with a ")".

The following finds some 175 of them, all with "politician" in the label.

SELECT *
{
  hint:Query hint:optimizer "None".
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "search" .
    bd:serviceParam mwapi:gsrsearch 'inlabel:politician@en haswbstatement:P31=Q5' .
    bd:serviceParam mwapi:gsrlimit "max" .    
    bd:serviceParam mwapi:gsrnamespace "0" .    
    ?item wikibase:apiOutputItem mwapi:title  .    
  }
  ?item rdfs:label ?l.
  FILTER(REGEX(?l, "\\)$") && lang(?l)="en").  
}


The usual fix would be to remove the disambiguator or make the label into an alias. The same can probably be done for other occupations/languages. --- Jura 16:10, 24 January 2021 (UTC)

@Jura1: I can code and run this, but would need a generic query to start from that would retrieve all entries regardless of occupation. No problem with stepping it over offsets if it takes a while to run. I think it would also need a bot request approval, since I don't think it falls under any of pi bot's other tasks. Thanks. Mike Peel (talk) 19:49, 24 March 2021 (UTC)
SELECT *
WITH
{
  SELECT ?value (count(*) as ?ct)
  {
    ?item wdt:P106 ?value
  }
  GROUP BY ?value    
  ORDER BY DESC(?ct) 
  OFFSET 0        
  LIMIT 50
}
AS %value
WHERE
{
  INCLUDE %value 
  hint:Query hint:optimizer "None".  
  ?value rdfs:label ?v . FILTER( lang(?v) = "en" ) 
  BIND( CONCAT( 'inlabel:"',?v,'@en" haswbstatement:P31=Q5') as ?search)
  { 
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "search" .
    bd:serviceParam mwapi:gsrsearch ?search .
    bd:serviceParam mwapi:gsrlimit "max" .    
    bd:serviceParam mwapi:gsrnamespace "0" .    
    ?item wikibase:apiOutputItem mwapi:title  .    
  }
  }
  ?item rdfs:label ?l.
  FILTER(REGEX(?l, "\\)$") && lang(?l)="en").  
}


request to correct P248 values in references (2021-02-04)

Request date: 4 February 2021, by: Trade

Link to discussions justifying the request
Task description
Replace stated in (P248) > Unterhaltungssoftware Selbstkontrolle (Q157754) with stated in (P248) > USK Classification Database (Q105272106)
Replace stated in (P248) > Entertainment Software Rating Board (Q191458) with stated in (P248) > ESRB Rating Database (Q105295303)
Replace stated in (P248) > Pan European Game Information (Q192916) with stated in (P248) > PEGI Rating Database (Q105296817)
Replace stated in (P248) > Australian Classification Board (Q874754) with stated in (P248) > Australian Classification database (Q105296839)
Replace stated in (P248) > Australian Classification (Q26708073) with stated in (P248) > Australian Classification database (Q105296839)
Replace stated in (P248) > British Board of Film Classification (Q861670) with stated in (P248) > BBFC database (Q105296939)

--Trade (talk) 12:51, 4 February 2021 (UTC)

Discussion

@Trade: As far as I can understand, the involved property is not stated as (P1932) but stated in (P248), in references:

Is it correct? --Epìdosis 13:09, 4 February 2021 (UTC)

Yes @Epìdosis:--Trade (talk) 13:10, 4 February 2021 (UTC)
OK, title edited. --Epìdosis 13:12, 4 February 2021 (UTC)
Request process

reference URL (P854) → Holocaust.cz person ID (P9109) (2021-02-05)

Request date: 5 February 2021, by: Daniel Baránek

Task description

After introducing Holocaust.cz person ID (P9109), reference URL (P854) in references can be replaced by this new identifier. The result of the edits should look like this. There are 285,282 references. You can see all references, their reference URL (P854) value and the value for Holocaust.cz person ID (P9109) here:

SELECT ?ref ?url ?id WHERE {
  ?ref prov:wasDerivedFrom [ pr:P248 wd:Q104074149 ; pr:P854 ?url ].
  BIND (REPLACE(STR(?url),"^.*/([0-9]+)[-/].*$","$1") as ?id)
  }

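A small Python sketch mirroring the REPLACE() in the query above; the example URL is a made-up shape, not a real record. Actual edits would then swap the reference URL (P854) snak for a Holocaust.cz person ID (P9109) snak.

import re

ID_RE = re.compile(r"^.*/([0-9]+)[-/].*$")

def holocaust_cz_id(url):
    m = ID_RE.match(url)
    return m.group(1) if m else None

assert holocaust_cz_id(
    "https://www.holocaust.cz/databaze-obeti/obet/127030-jmeno-prijmeni/"
) == "127030"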

Discussion


Request process

request to mass change stated in (P248) in references after new restrictions on property (2021-02-05)

Request date: 5 February 2021, by: Sapfan ; major update 14 March 2021

Link to discussions justifying the request
Task description

Task 1: Replace archives with archive collections in stated in (P248)

Can you please replace the value of stated in (P248) in all references according to this mapping table:

(Each row: current stated in (P248) value, required title (P1476) prefix → new stated in (P248) value.)

  • Prague City Archives (Q19672898), title beginning "Archiv hl. m. Prahy, Matrika" → Collection of Registry Books at Prague City Archives (Q105319160)
  • Prague City Archives (Q19672898), title beginning "Archiv hl. m. Prahy, Soupis pražských" → List of residents in Prague 1830-1910 (1920) (Q105322358)
  • Moravian regional archive (Q12038677), title beginning "Moravský zemský archiv, Matrika" → Collection of Registry Books at Moravian Regional Archive (Q102116996)
  • Státní oblastní archiv v Litoměřicích (Q18920590), title beginning "SOA Litoměřice, Matrika" → Collection of Registry Books at Litoměřice State Archive (Q105319095)
  • Regional state archive in Pilsen (Q21953079), title beginning "SOA Plzeň, Matrika" → Collection of Registry Books at Pilsen State Archive (Q105319092)
  • Státní oblastní archiv v Praze (Q12056840), title beginning "SOA Praha, Matrika" → Collection of Registry Books at Prague State Archive (Q105319086)
  • Regional State Archives in Třeboň (Q12056841), title beginning "SOA Třeboň, Matrika" → Collection of Registry Books at Třeboň State Archive (Q105319089)
  • Státní oblastní archiv v Zámrsku (Q17156873), title beginning "SOA Zámrsk, Matrika" → Collection of Registry Books at Zámrsk State Archive (Q105319097)
  • Zemský archiv v Opavě (Q10860553), title beginning "Zemský archiv v Opavě, Matrika" → Collection of Registry Books at Opava Regional Archive (Q105319099)
  • Museum of Czech Literature (Q5979897), title beginning "Kartotéka Jaroslava Kunce" → Kunc Jaroslav (Q82329263)

Task 2a: Retrieve information from free text in title (P1476) and fill it into specific properties

For each collection (with the former stated in (P248) value in parentheses), the standard format of title (P1476) and the parts to derive are:

  • Collection of Registry Books at Prague City Archives (Q105319160) (formerly Prague City Archives (Q19672898)): title format "Archiv hl. m. Prahy, Matrika zemřelých u sv. Ludmily na Vinohradech, sign. VIN Z6, s. 201"; volume (P478): between "sign." and "," (here: VIN Z6); inventory number (P217): N/A (ignore); page(s) (P304): after ", s."
  • Collection of Registry Books at Moravian Regional Archive (Q102116996) (formerly Moravian regional archive (Q12038677)): title format "Moravský zemský archiv, Matrika zemřelých Brno - sv. Tomáš 17056, s. 59"; volume (P478): between the word "narozených", "oddaných" or "zemřelých" and "," (here: Brno - sv. Tomáš 17056); inventory number (P217): N/A (ignore); page(s) (P304): after ", s."
  • Collection of Registry Books at Litoměřice State Archive (Q105319095) (formerly Státní oblastní archiv v Litoměřicích (Q18920590)): title format "SOA Litoměřice, Matrika zemřelých Z • inv. č. 4505 • sig. 96/20 • 1784 - 1831 • Dubany, Evaň, Libochovice, Poplze, Radovesice, Slatina, s. 92"; volume (P478): between "sig." and "•"; inventory number (P217): between "inv. č." and "•"; page(s) (P304): after ", s."
  • Collection of Registry Books at Pilsen State Archive (Q105319092) (formerly Regional state archive in Pilsen (Q21953079)): title format "SOA Plzeň, Matrika narozených Rokycany 16, s. 54"; volume (P478): between the word "narozených", "oddaných" or "zemřelých" and "," (here: Rokycany 16); inventory number (P217): N/A (ignore); page(s) (P304): after ", s."
  • Collection of Registry Books at Prague State Archive (Q105319086) (formerly Státní oblastní archiv v Praze (Q12056840)): title format "SOA Praha, Matrika narozených Lošany 20, s. 82"; volume (P478): between the word "narozených", "oddaných" or "zemřelých" and "," (here: Lošany 20); inventory number (P217): N/A (ignore); page(s) (P304): after ", s."
  • Collection of Registry Books at Třeboň State Archive (Q105319089) (formerly Regional State Archives in Třeboň (Q12056841)): title format "SOA Třeboň, Matrika narozených Písek 18, s. 529"; volume (P478): between the word "narozených", "oddaných" or "zemřelých" and "," (here: Písek 18); inventory number (P217): N/A (ignore); page(s) (P304): after ", s."
  • Collection of Registry Books at Zámrsk State Archive (Q105319097) (formerly Státní oblastní archiv v Zámrsku (Q17156873)): title format "SOA Zámrsk, Matrika zemřelých v Hradci Králové, sign. 51-7657, ukn 3069, s. 364"; volume (P478): between "sign." and "," (here: 51-7657); inventory number (P217): from "ukn" to the next "," (here: ukn 3069); page(s) (P304): after ", s."
  • Collection of Registry Books at Opava Regional Archive (Q105319099) (formerly Zemský archiv v Opavě (Q10860553)): title format "Zemský archiv v Opavě, Matrika narozených N • inv. č. 3142 • sig. Je III 4 • 1780 - 1792 • Bobrovník, Bukovice,…, s. 266"; volume (P478): between "sig." and "•" (here: Je III 4); inventory number (P217): between "inv. č." and "•"; page(s) (P304): after ", s."

The properties should only be entered if they do not already exist (the most recently added references already follow the new format). Property volume (P478) gives an error if it contains non-Latin characters (such as Dobříš 38 in the occupation of František Danielovský (Q105947011)), but this is probably unavoidable: we cannot distort the naming standards defined at the source. A parsing sketch for one of the patterns follows.
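A Python sketch of the extraction for one of the patterns above (SOA Zámrsk); the other archives would need analogous expressions, and the group names simply mirror the target properties.

import re

ZAMRSK = re.compile(
    r"sign\.\s*(?P<volume>[^,]+),\s*(?P<inventory>ukn\s*\d+),\s*s\.\s*(?P<page>\S+)"
)

def parse_zamrsk(title):
    m = ZAMRSK.search(title)
    return m.groupdict() if m else None

parse_zamrsk("SOA Zámrsk, Matrika zemřelých v Hradci Králové, "
             "sign. 51-7657, ukn 3069, s. 364")
# -> {'volume': '51-7657', 'inventory': 'ukn 3069', 'page': '364'}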

Task 2b: Fill type of reference (P3865) from text in title (P1476)

Relates to all records processed in tasks 1 or 2a (i.e., selected archive collections according to stated in (P248)). Do not fill again if already entered (refers to recently added items).

If title (P1476) contains the phrase below, enter the indicated type of reference (P3865):

  • "Matrika narozených" (example: "SOA Praha, Matrika narozených Kostelec nad Černými lesy 19, s. 194") → birth registry (Q11971341)
  • "Matrika oddaných" (example: "Moravský zemský archiv, Matrika oddaných Brno - sv. Janů (u minoritů) 16983, s. 484") → marriage registry (Q14324227)
  • "Matrika zemřelých" (example: "SOA Zámrsk, Matrika zemřelých fary Chotěboř, sign. 8291, ukn 3541, s. 283") → death registry (Q12029619)
  • "Soupis pražských" (example: "Archiv hl. m. Prahy, Soupis pražských obyvatel, list 162 • 1864 • Čáslavský, Karel") → conscription sheet (Q105921971)

Task 3: Replace Wikimedia import URL (P4656) with reference URL (P854) with the same value

By mistake, I have created many links to pages of parish registers, which have been uploaded to Commons, as Wikimedia import URL (P4656) instead of the ordinary reference URL (P854), and now the links are not displayed in some Wikipedia templates. Example: František Fischer (Q97466111). It only relates to SOA Zámrsk (stated in (P248) = Collection of Registry Books at Zámrsk State Archive (Q105319097), formerly Státní oblastní archiv v Zámrsku (Q17156873)).

My mistake - but can someone please correct it with reasonable effort (i.e., by mass replacement)? Sorry about it!

Task 4: Replace rotten links

Finally, many links in references pointing to Collection of Registry Books at Moravian Regional Archive (Q102116996) (formerly Moravian regional archive (Q12038677)) stopped working after a website reorganization due to the Adobe Flash Player retirement. The more frequently used ones are listed below, followed by a sketch of the rewrite:

  • Old reference URL (P854) portion http://actapublica.eu/matriky/brno/prohlizec/10374/?strana= (example before: http://actapublica.eu/matriky/brno/prohlizec/10374/?strana=9) should be replaced with https://www.mza.cz/actapublica/matrika/detail/10266?image=216000010-000253-003381-000000-017056-000000-VR-B08429-nnnn0.jp2, where nnnn = the number after "strana=" padded with leading 0 (example after: https://www.mza.cz/actapublica/matrika/detail/10266?image=216000010-000253-003381-000000-017056-000000-VR-B08429-00090.jp2); belongs to volume (P478) Brno - sv. Tomáš 17056
  • Old reference URL (P854) portion http://actapublica.eu/matriky/brno/prohlizec/11303/?strana= (example before: http://actapublica.eu/matriky/brno/prohlizec/11303/?strana=37) should be replaced with https://www.mza.cz/actapublica/matrika/detail/11133?image=216000010-000253-003381-000000-017057-000000-VR-B08430-nnnn0.jp2, where nnnn = the number after "strana=" padded with leading 0 (example after: https://www.mza.cz/actapublica/matrika/detail/11133?image=216000010-000253-003381-000000-017057-000000-VR-B08430-00370.jp2); belongs to volume (P478) Brno - sv. Tomáš 17057
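A Python sketch of the rewrite for the first mapping; the second works the same way with its own constants.

OLD = "http://actapublica.eu/matriky/brno/prohlizec/10374/?strana="
NEW = ("https://www.mza.cz/actapublica/matrika/detail/10266"
       "?image=216000010-000253-003381-000000-017056-000000"
       "-VR-B08429-{:04d}0.jp2")

def rewrite(url):
    # Pad the page number from "strana=" to four digits plus a trailing 0.
    if url.startswith(OLD):
        return NEW.format(int(url[len(OLD):]))
    return url

assert rewrite(OLD + "9").endswith("-VR-B08429-00090.jp2")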

Thanks in advance! --Sapfan (talk) 21:30, 5 February 2021 (UTC), update: --Sapfan (talk) 10:50, 14 March 2021 (UTC)

Licence of data to import (if relevant)

(No licence - public domain data already present on WD)

Discussion
  • Wouldn't "publisher" be the more appropriate property for the organization?
"stated in" indicates the publication, not the archives or museum this comes from. This can be a (printed or online) catalogue or database. That the items described there are also part of a specific collection is another question.
Maybe Help:Sources#Databases explains it better. --- Jura 21:45, 5 February 2021 (UTC)
Actually, this is the reason for the request. Right now, "stated in" points to an organization. We now want to point to the collection (or database, as you write). Do you see any issue with it? What would be a more appropriate value for this property, if it is an archive collection? --Sapfan (talk) 22:04, 5 February 2021 (UTC)
"collection" is generally what is in an archive not its catalog. A catalog describes the collection and can include data about a person.
Maybe use Q101498693 (linked from Q105319160 mentioned above), given its P31 value.
Another way of looking at it would be to identify the publication in which the date of death "15 June 1920" is mentioned (this is from the sample mentioned above, Q97993619#P570), just as one would in a reference at Wikipedia.
It's not really something that can be guessed; one needs to determine the publication. --- Jura 06:29, 6 February 2021 (UTC)
Hi Jura, thanks for trying to find the best reference. But I have doubts about the direction you are heading in.
  • Whenever someone is born, gets married or dies, a record is made in an official book. This book is unique - not a publication. But yes, it can be identified by issuing authority, sequential number etc.
  • After 75, 100 or a similar number of years, the public registry hands the books over to a designated public archive. There they are recorded and presented to the public - offline or online.
  • To identify it as a source, we need to make an archive citation. Some recommend starting with the item name and ending with the archive name ([3]). Czech universities suggest going "from broad to narrow", i.e. starting with the archive name and ending with the specific item (e.g., [4]). I follow the second approach in title (P1476).
  • Now, what to enter as stated in (P248). The equivalent of a "publication" would be a book title, such as "Death register JIL Z12 of St. Gilles Parish in Prague" (to take our example Karl Mikolaschek (Q97993619)). However, there is a distinction between a published book (such as Encyclopædia Britannica) and an archived registry: copies of E. B. can be found in many libraries, therefore we do not need to write where to find it. But a parish register is unique. You first need to know in which single archive it is stored and then how to find it there.
  • Therefore, the theoretically best approach would be to create a Wikidata item for each parish book. But this would lead to an explosion of items, most of which would be used only once or twice (or not at all, if we were to replicate the whole archive catalog). That is why I used to give the highest level - the archive name - as stated in (P248). But if we need a "thing" rather than a "(legal) person" in this role, then the most practical option is a "collection of birth, marriage and death records at a given archive". This is what I am proposing.
  • Central Registry of Archive Heritage (Q101498693) is unfortunately not the right object. (You could not know it - the description was only in Czech.). It is a list of archive collections, which include the birth/death registers we want to cite. But you cannot find the birth/death date of a single person there. You need to go to a specific parish book and page, which is a part of certain archive collections such as Collection of Registry Books at Prague City Archives (Q105319160).
  • That is why I still believe that we should enter the specific collection as stated in (P248) - because this is the first step in finding the info, similar to a book or magazine with many volumes. Once you are in that archive, you need to know the specific resource ID such as accession number (Q1417099) (in Wikidata entered as volume (P478)), inventory number (P217) and page(s) (P304) if available, in structured and/or unstructured (title (P1476)) format.

Still not convinced? Then please tell me which practical option to use. I do not think that the list of key archive collections in Czechia is the appropriate one - it would give even less information to the reader than the archive name we have now. Thanks! --Sapfan (talk) 08:14, 6 February 2021 (UTC)

  • It seems I had the wrong type of reference in mind. Sorry about that. Actually, we lack an outline at Help:Sources on how to cite church registries and similar material held in archives.
We didn't get much past Wikidata:Property proposal/civil registration district.
If you want to write a short summary based on your explanation above, that could be most helpful. Depending on what you prefer, it can be fairly specific and left to others to expand/generalize. --- Jura 09:03, 6 February 2021 (UTC)
Thanks, Jura! I have just placed a citation proposal on Help_talk:Sources#Vital_records_and_other_archive_collections_as_sources. Let's see what the community says. Feel free to comment. I will update the mapping tables based on the outcome. --Sapfan (talk) 13:28, 6 February 2021 (UTC)

Update after a long break (14 March 2021): After a lengthy discussion, there is now an official recommendation regarding citations of archive collections. Can you please check the requests above and tell me whether it is feasible (and desirable) to perform these mass changes? Thanks in advance! --Sapfan (talk) 10:50, 14 March 2021 (UTC)

Request process

request to fix Property:P395 for Spain (2021-02-08)

Request date: 8 February 2021, by: Jura1

Link to discussions justifying the request
Task description
Discussion

I see it's only for Spanish communities. I would put an end date on the property, and maybe downgrade the rank of those statements. What is the end date, actually? Edoderoo (talk) 08:24, 9 February 2021 (UTC)

  • The problem is that the codes were applied not only to (autonomous) communities, but to thousands of items.
w:Vehicle_registration_plates_of_Spain#Current_system mentions 18 September 2000, but I'd just use the year 2000. --- Jura 08:40, 9 February 2021 (UTC)
Request process

request to import DOI and ISBN as items when present in any Wikipedia article (2021-02-11)

Request date: 11 February 2021, by: So9q

Link to discussions justifying the request
Task description

The bot follows the event stream from WMF and checks every changed page for DOI or ISBN numbers. It then checks whether we already have an item for each number found and imports it if not.

Addition: I forgot to mention that I would also like to add all missing authors (if they have an ORCID iD) and the articles cited from the DOI in Wikipedia (1 hop away). That amounts to millions of items, probably most of the 86 million articles in existence, but the import rate is going to be lower because we use Wikipedia changes as a prism.

2nd addition: So the idea of the workflow for DOI is this (a lookup sketch follows the list):

  1. the bot sees a DOI
  2. checks if it is in WD
  3. if not: prepares to import it by reading the Crossref API
  4. checks whether the authors have an ORCID iD and looks them up in WD
  5. if an author has an ORCID iD and is missing:
  6. adds a new item with the ORCID iD, name, short name, given name, family name
  7. if any references are present, looks up whether they have items via their DOIs
  8. if any references are missing, looks them up in Crossref and imports their authors if they have an ORCID iD and are missing
  9. imports items for the references*
  10. finishes by importing the DOI originally found and linking it to all the items created above*
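A rough Python sketch of steps 2-4 (the SPARQL endpoint and the Crossref REST API are real; everything else, including the function names, is illustrative):

 import requests

 WDQS = "https://query.wikidata.org/sparql"

 def doi_in_wikidata(doi):
     """Step 2: check whether an item with this DOI (P356) already exists."""
     # DOI (P356) values are stored in upper case by convention
     query = 'SELECT ?item WHERE { ?item wdt:P356 "%s" } LIMIT 1' % doi.upper()
     r = requests.get(WDQS, params={"query": query, "format": "json"},
                      headers={"User-Agent": "doi-import-sketch/0.1"})
     r.raise_for_status()
     return bool(r.json()["results"]["bindings"])

 def crossref_metadata(doi):
     """Steps 3-4: fetch work metadata from Crossref, collect author ORCID iDs."""
     r = requests.get("https://api.crossref.org/works/" + doi)
     r.raise_for_status()
     meta = r.json()["message"]
     # ORCID iDs, where Crossref has them, sit on the author records
     orcids = [a["ORCID"] for a in meta.get("author", []) if "ORCID" in a]
     return meta, orcids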

So the idea of the workflow for ISBN is this (a lookup sketch follows the list):

  1. look up the ISBN in WD
  2. if not found, get details from Worldcat by scraping (they have no open API as far as I know)
  3. look up the authors by name in WD
  4. if no author is found, add an item for the author
  5. add the item to WD*
  • *linking to the author if possible, falling back to adding a name string (as for scientific publications) for later manual reconciliation.
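A sketch of step 1, assuming stored ISBN-13 (P212) and ISBN-10 (P957) values may be hyphenated, so both sides are normalized (a full scan like this may be slow on WDQS; a haswbstatement search via the API could be an alternative):

 import requests

 WDQS = "https://query.wikidata.org/sparql"

 def isbn_in_wikidata(isbn):
     """Step 1: check whether an item with this ISBN (P212/P957) already exists."""
     digits = isbn.replace("-", "").replace(" ", "")
     query = """SELECT ?item WHERE {
       { ?item wdt:P212 ?isbn } UNION { ?item wdt:P957 ?isbn }
       FILTER(REPLACE(?isbn, "-", "") = "%s")
     } LIMIT 1""" % digits
     r = requests.get(WDQS, params={"query": query, "format": "json"})
     r.raise_for_status()
     return bool(r.json()["results"]["bindings"])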

Source code (currently missing the upload part)

License of data to import (if relevant)

Not copyrighted (facts) from Wikipedia and Crossref (DOI) and Worldcat (ISBN).

Discussion
  • I like this. I have a long standing wish to import articles cited in important scholarly databases (and I consider Wikipedia one of them). --Egon Willighagen (talk) 15:36, 14 February 2021 (UTC)
  • This is fine due to the constrained scope of use on other Wikimedia projects (which scope, as I asserted in other fora, was scrapped by other people for some reason). The moment 'when present in any Wikipedia article' is removed as a limiting factor in your job is the moment I cease to support it, primarily due to the continued strain this will cause on the query servers. (I wonder if we should be pinging @GLederrey (WMF), CParle (WMF), ZPapierski (WMF): more for any further proposed jobs which are contentious for query service performance reasons, to get their opinions on the matter as those tasked with maintaining the query service.) Mahir256 (talk) 19:27, 14 February 2021 (UTC)
  • @mahir256: Thanks for the support. BTW, I forgot to mention that I would also like to add all missing authors and the articles cited from the DOI in Wikipedia. That amounts to millions of items, I guess... Do you support that too? Not doing it means that users cannot follow the science when they read the DOI item in WD, which is one of the main advantages of adding it to WD in the first place, isn't it? (Namely, that science articles are put into our rich context of links, enabling very advanced queries compared to what you can do now in the proprietary databases.)
Regarding the WDQS infrastructure and queries, I'm not aware of any negative effects if we go from 30 million articles to 86 million articles. The P31 -> scientific article query already times out, and that is not affected. Someone in project chat suggested querying by year, and then it does not time out. That might be affected if we have 2 million articles for each year or something. Anyway, timeouts are not a problem; they are a feature, if I understood correctly. They tell the user: "Hey, you are trying to do something that WMF does not want to provide the infrastructure for. Go ahead, set up WDQS yourself, download the data and run whatever queries you want without time limits, at your own expense."
The problem from a usability POV is that the error messages of WDQS are pretty bad. Stack traces should NEVER be exposed to a user by default, IMO; they are for developers. That's an important UI bug to fix, IMO. See T275736. --So9q (talk) 06:40, 25 February 2021 (UTC)
  • Didn't this already happen (some separate WMF installation with all references), but then wasn't imported into Wikidata due to some size issue? --- Jura 22:10, 14 February 2021 (UTC)
@jura1: I never heard about that. I asked in the Wikidata group and no one seems to have done what I propose. We don't have a size issue in the infrastructure as far as I'm aware. With maxlag in effect it probably won't affect WDQS either. I promise to slow the bot down so that we don't get bottleneck problems because of too many write requests/new items created per minute. --So9q (talk) 22:56, 24 February 2021 (UTC)
Request process

request to add identifiers from FB (2021-02-11)

Thanks to a recent import, we currently have more than 1.2 million items where the only identifier is Freebase ID (P646). However, checking https://freebase.toolforge.org/ shows that some of them have additional identifiers available there.

Samples:

See Wikidata:Project_chat#Freebase_(bis) for discussion.

Task description

Import IDs where available. Map keys to properties where a mapping is not yet available at Wikidata:WikiProject_Freebase/Mapping.

Discussion


Request process

request to update all ckbwiki article labels (2021-02-12)

Request date: 12 February 2021, by: Aram

Link to discussions justifying the request
  • There is no discussion because I don't think updating the labels requires one.
Task description

Often, when articles are moved, the Wikidata labels are not updated to the new article names. So we need to update all ckbwiki article labels on Wikidata. For example, I moved this page using my bot, but its label on Wikidata hasn't been updated yet. ckbwiki has 28,768 articles so far. Thanks!

Licence of data to import (if relevant)
Discussion
  • @Aram: You mean
    • the sitelinks (to ckbwiki),
    • or the labels (in ckb),
    • or both?
When page moves on ckbwiki aren't mirrored here, it generally means that the user moving them hasn't created an account on Wikidata. You would need to log in to Wikidata with your bot account at least once. --- Jura 14:09, 15 February 2021 (UTC)

@Aram: --- Jura 14:10, 15 February 2021 (UTC)

@Jura1: Really? I didn't know that before. Thank you for the hint! However, it seems that my bot was logged in for this edit, yet the label has still not been updated. Regarding your question, we only want to update the ckbwiki labels. Thank you! Aram (talk) 15:06, 15 February 2021 (UTC)
  • It seems the account exists on wikidatawiki, so the sitelinks to ckbwiki have been updated (since Feb 9), and edits like the one you mentioned above are no longer needed.
    However, this won't have any effect on the labels of the items in ckb at Wikidata. These need to be updated separately if deemed correct (by bot, QuickStatements or manually). --- Jura 15:36, 15 February 2021 (UTC)
Thanks! Aram (talk) 20:14, 18 February 2021 (UTC)
Request process

@Aram: if something still needs to be done by bot, you might want to detail it. --- Jura 09:33, 30 March 2021 (UTC)

@Jura1: Thank you! Yes, we previously saw a bot add missing article labels or update them on Wikidata, but it hasn't added or updated any labels for a long time (I'm talking about ckbwiki). See here as an example. Here, we want to
  • update all article labels on Wikidata.
  • update article labels while moving the article to a new title if any bot can do it automatically and immediately.
    • If not, update them every 6 months or whenever the bot manager can run the bot.
  • update the category and template labels if the bot can.
But it is clear that the parenthetical disambiguator after the title should be ignored. For example, the labels of both en:Casablanca and en:Casablanca (film) are "Casablanca" (see the sketch below). That is all. Thank you again! Aram (talk) 11:38, 31 March 2021 (UTC)
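A pywikibot sketch of what the per-page update could look like, including stripping the parenthetical disambiguator (untested; batching, throttling and bot approval are left out, and the function name is illustrative):

 import re
 import pywikibot

 site = pywikibot.Site("ckb", "wikipedia")

 def update_ckb_label(page_title):
     page = pywikibot.Page(site, page_title)
     item = pywikibot.ItemPage.fromPage(page)
     item.get()
     # "Casablanca (film)" and "Casablanca" both get the label "Casablanca"
     label = re.sub(r"\s*\([^)]*\)$", "", page.title())
     if item.labels.get("ckb") != label:
         item.editLabels({"ckb": label},
                         summary="Update ckb label to match ckbwiki title")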

request to change the Belarusian language description from "спіс атыкулаў у адным з праектаў Вікімедыя" to "спіс артыкулаў у адным з праектаў Вікімедыя" in all the articles; a letter "р" is missing (2021-02-23)

Request date: 23 February 2021, by: Belarus2578

Link to discussions justifying the request

There is no discussion. There is only an obvious misprint. --Belarus2578 (talk) 05:01, 25 February 2021 (UTC)

Task description

Please change the Belarusian language description from "спіс атыкулаў у адным з праектаў Вікімедыя" to "спіс артыкулаў у адным з праектаў Вікімедыя" in all the articles. A letter "р" was missing. --Belarus2578 (talk) 06:47, 23 February 2021 (UTC)
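The per-item edit is simple; a pywikibot sketch (assuming the typo occurs only in the "be" description; given the item count mentioned below, the bot would have to throttle and respect maxlag):

 import pywikibot

 repo = pywikibot.Site("wikidata", "wikidata").data_repository()
 OLD = "спіс атыкулаў у адным з праектаў Вікімедыя"
 NEW = "спіс артыкулаў у адным з праектаў Вікімедыя"

 def fix_description(qid):
     item = pywikibot.ItemPage(repo, qid)
     item.get()
     # only touch items whose description matches the misspelt text exactly
     if item.descriptions.get("be") == OLD:
         item.editDescriptions({"be": NEW},
                               summary="typo: атыкулаў -> артыкулаў")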

Discussion
Comment There are over 250,000 items. --Matěj Suchánek (talk) 10:15, 13 March 2021 (UTC)
Request process

request to fix descriptions "other organization" (2021-02-23)

Request date: 23 February 2021, by: Jura1

Task description
  • There are some 8000 items which describe organizations as "other organization"
  • Remove "other " from these descriptions (a sketch follows below)
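The string fix itself is a one-liner; a sketch (assuming the descriptions are English and begin with exactly "other organization"; finding the roughly 8000 items is a separate search step):

 def fix_description(desc):
     """Drop the leading "other " from descriptions like "other organization"."""
     if desc.startswith("other organization"):
         return desc[len("other "):]
     return desc

 assert fix_description("other organization") == "organization"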
Discussion
  • This may have some logic in its source, but not in Wikidata or in whatever other context Wikidata descriptions are used. @ArthurPSmith: who may have created some or all of them [5]. --- Jura 19:43, 23 February 2021 (UTC)


Request process

request to import Museum of the Jewish People at Beit Hatfutsot IDs (2021-03-14)

Request date: 14 March 2021, by: Mikey641

Link to discussions justifying the request

There are many affected links in Museum of the Jewish People at Beit Hatfutsot ID (P9280); see also the next section.

Task description

Hey. I would love to get some help with importing the IDs from [6] into Museum of the Jewish People at Beit Hatfutsot ID (P9280).
The structure of the URL is https://dbs.anumuseum.org.il/skn/en/e256696
Basically it could describe anything: a country/person/choir/place.
If you use https://dbs.anumuseum.org.il/skn/en/e211457 (the URL of South Africa), when pasted it will turn into https://dbs.anumuseum.org.il/skn/en/c6/e211457/Place/South_Africa
Therefore we can differentiate between Place, Family_Name, Personalities.
I have not found a tabular source for it, therefore this task requires a bot, and my programming skills are not sufficient.
I would suggest going through all URLs and matching the label to a Wikidata item, differentiating between identical labels by using the entity type given in the URL and comparing it to instance of (P31).--Mikey641 (talk) 17:04, 14 March 2021 (UTC)
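A sketch of extracting the entity type from the expanded URL (this assumes the site answers the short URL with an HTTP redirect, as the "when pasted it will turn into" behaviour suggests; if the expansion happens client-side, the HTML would have to be parsed instead):

 import requests

 def entity_type(entity_id):
     """For e.g. "e211457", return the type segment ("Place", "Family_Name", ...)."""
     r = requests.get("https://dbs.anumuseum.org.il/skn/en/" + entity_id,
                      allow_redirects=True)
     # the final URL looks like .../c6/e211457/Place/South_Africa
     segments = r.url.rstrip("/").split("/")
     if entity_id in segments:
         i = segments.index(entity_id)
         if i + 1 < len(segments):
             return segments[i + 1]
     return None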

Discussion


Request process

request to unweave former Dutch municipalities from their eponymous capital (2021-03-31)

Request date: 2 April 2021, by: 1Veertje

Link to discussions justifying the request
Task description

As per this query there are currently 1115 human settlements that have their data mixed up with their eponymous former municipality. Unweaving this is quite tricky since it's hard to preserve references. I think these statements need to be moved to a new item:

It can be assumed that:

The original item should have the statement added that:

Further items that need adjusting:

I'm not sure about whether or not these are more appropriate for an item about the municipality:

Discussion

@Multichill, Antoni1626: Is there a way of moving this data around that preserves the references? The population statistics should also get the qualifier publisher (P123) moved to the references. --1Veertje (talk) 11:29, 2 April 2021 (UTC)

For your request above, maybe you could create the new items and then produce a query of what needs to be moved. --- Jura 12:33, 2 April 2021 (UTC)
  • Strong oppose doing this by hand, let alone by bot. What would be the point of splitting up for example Bennebroek (Q817840)? Multichill (talk) 16:55, 2 April 2021 (UTC)
    the role and area of a municipality are very different from those of a town. The statistics are recorded at the municipal level, but the area they relate to is very different. I grew up in the small municipality of Wateringen, which covered the area of the eponymous village and the village of Kwintsheul. Using one and the same item to describe Kwintsheul's jurisdictional history is very ugly. Documenting a municipality like Bergeyk/Bergeijk changing its size over time is hard enough without it also needing to serve the role of its eponymous village. It's incomprehensible to me that you modeled the data in this way. 1Veertje (talk) 18:00, 2 April 2021 (UTC)
    The world is not black and white. Trying to model it like that won't work either. Might be hard to understand. The municipal status is just something that got assigned at some point to cities, villages and heerlijkheiden. In some cases it might make sense to split, in some cases it doesn't make sense. Mass splitting is not the solution. Multichill (talk) 09:39, 3 April 2021 (UTC)
    The data relevant to the municipality gets snowed under by its eponymous village if it doesn't get its own item once it is superseded by another municipality. A simple query for municipalities in ZH doesn't show an item like Rhoon (Q687584), because the municipality of the Netherlands (Q2039348) in the P31 is the only one that doesn't have a preferred statement. You can't give the main item a dissolved, abolished or demolished date (P576) property because things are mixed up in each other, and you can't link from the new municipality to the old with replaces (P1365); its jurisdictional history can't very well reference itself. Assuming that, just because there is an eponymous village, the size would be about the same was a wrong assumption to make, as with my example of Wateringen. So far I've encountered no problems strictly following the set of instructions written out above. --1Veertje (talk) 09:59, 4 April 2021 (UTC)
  • Maybe you want to sort this out on Wikidata:Project_chat or Wikidata:De_kroeg and then make a request. --- Jura 20:25, 2 April 2021 (UTC)


Request process

request to change "instance of" on some Q-items (2021-04-09)

Request date: 9 April 2021, by: Taylor 49

Link to discussions justifying the request
[8] probably uncontroversial, nobody answered
Task description
change "instance of"
  • Help:Contents (Q914807) -> Wikimedia help page (Q56005592)
  • Appendix:TOC (Q35243371) -> Wikimedia appendix namespace page (Q101043034)

Nothing should be "instance of" TOC/Index/Contents. A bot should change all "instance of" Help:Contents (Q914807) to Wikimedia help page (Q56005592) and all "instance of" Appendix:TOC (Q35243371) to Wikimedia appendix namespace page (Q101043034).
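A pywikibot sketch of the swap (untested; once the affected items are listed, the same mapping could equally be fed to QuickStatements):

 import pywikibot

 repo = pywikibot.Site("wikidata", "wikidata").data_repository()
 REPLACEMENTS = {
     "Q914807": "Q56005592",      # Help:Contents -> Wikimedia help page
     "Q35243371": "Q101043034",   # Appendix:TOC -> Wikimedia appendix namespace page
 }

 def fix_instance_of(item):
     item.get()
     for claim in item.claims.get("P31", []):
         target = claim.getTarget()
         if target and target.id in REPLACEMENTS:
             claim.changeTarget(pywikibot.ItemPage(repo, REPLACEMENTS[target.id]))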

Proposer: Taylor 49 (talk) 13:29, 9 April 2021 (UTC)

Request process

request to uprank current existing countries (2021-04-10)

Request date: 10 April 2021, by: Bouzinac

Link to discussions justifying the request
Task description

Help clean P17 data by:

Example: Q2492784#P17 --> Ukraine (Q212) [which does not have any P576] + Soviet Union (Q15180) [which has a P576] ==> Ukraine (Q212) should be upranked
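A pywikibot sketch of the rank logic implied by the example (untested; it fetches each country item to check for P576 and only upranks when the situation is unambiguous):

 import pywikibot

 repo = pywikibot.Site("wikidata", "wikidata").data_repository()

 def uprank_current_country(qid):
     item = pywikibot.ItemPage(repo, qid)
     item.get()
     claims = item.claims.get("P17", [])
     if len(claims) < 2 or any(c.rank == "preferred" for c in claims):
         return  # nothing to disambiguate, or a rank has already been set
     current, former = [], []
     for c in claims:
         country = c.getTarget()
         country.get()
         (former if "P576" in country.claims else current).append(c)
     # uprank only when exactly one country still exists and at least one is dissolved
     if len(current) == 1 and former:
         current[0].changeRank("preferred")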

Discussion
Request process

import writers/screenwriters (one-time data import)

When adding values for screenwriter (P58), I notice that frequently these persons don't have Wikidata items yet.

It would be helpful to identify a few sources for these and create corresponding items. Ideally every tv episode would have its writers included. --- Jura 15:05, 18 November 2018 (UTC)

It would be beneficial if information such as whether the writer wrote just the teleplay or just the story were also stated.--CENNOXX (talk) 07:19, 12 April 2019 (UTC)
  • At this stage, the idea is to simply create items for writers, not adding them to works. --- Jura 12:26, 19 July 2019 (UTC)
  • Would be helpful for WP Movies. --- Jura 21:19, 26 March 2020 (UTC)
  • If these are created for TV series (for which we might not have items for every episode), the series could be mentioned with contributed to creative work (P3919). Creating them beforehand makes it easier to add them to episodes once they are created. --- Jura 09:31, 14 April 2021 (UTC)
I'm not sure if there is a good free source on screenwriters somewhere. Perhaps as an alternative we can collect red links of the corresponding infobox parameter on the English Wikipedia and create items for them? —putnik 09:40, 14 April 2021 (UTC)
Yes, episode lists and season articles could be used to create them. --- Jura 09:46, 14 April 2021 (UTC)

ValterVB LydiaPintscher Ermanon Cbrown1023 Discoveranjali Mushroom Queryzo Danrok Rogi Mbch331 Jura Jobu0101 Jklamo Jon Harald Søby putnik ohmyerica AmaryllisGardener FShbib Andreasmperu Li Song Tiot Harshrathod50 U+1F350 Bodhisattwa Shisma Wolverène Tris T7 Antoine2711 Hrk6626 TheFireBender V!v£ l@ Rosière WatchMeWiki! CptViraj ʂɤɲ Trivialist 2le2im-bdc Sotiale Wallacegromit1, mostly focus on media historiography and works from the Global South Floyd-out M2k~dewiki Rockpeterson Mathieu Kappler Sidohayder Spinster Gnoeee Notified participants of WikiProject Movies --- Jura 09:32, 14 April 2021 (UTC)

Maybe we can use IMDb Datasets with writer information (see title.crew.tsv.gz). Queryzo (talk) 14:25, 14 April 2021 (UTC)

Note that IMDB isn't counted as a reliable reference for enwp (since it is user-generated), so a different source would be better if possible (and references added during import!). Thanks. Mike Peel (talk) 06:56, 16 April 2021 (UTC)
  • Wikidata isn't enwiki. Generally in this field for this type of information, IMDb is highly regarded. As for any statement, several references can be useful. --- Jura 07:38, 16 April 2021 (UTC)
  • It could be interesting to do a one-time import and try to extract additional information from Wikipedia for current series/seasons on a regular basis. --- Jura 07:38, 16 April 2021 (UTC)

Add original title of scientific articles (data import/cleanup)

There are some articles that have their title (P1476) value enclosed in square brackets. This means that the title was translated into English and the article's original title wasn't in English.

Sample: https://www.wikidata.org/w/index.php?title=Q27687073&oldid=555470366

Generally, the following should be done (a detection sketch follows the list):

  1. deprecate existing P1476 statement
  2. add the original title with title (P1476)
  3. add the label in the original language
  4. remove [] from the English label
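A sketch of the detection for step 1 and the label cleanup of step 4 (the regex is an assumption about how the brackets appear; finding the original title for steps 2-3 needs an external source such as PubMed):

 import re

 # titles like "[A sample translated title]." mark translations into English
 BRACKETED = re.compile(r"^\[(?P<title>.+)\]\.?$")

 def translated_title(title):
     """Return the English title without brackets, or None if not bracketed."""
     m = BRACKETED.match(title)
     return m.group("title") if m else None

 assert translated_title("[A sample translated title].") == "A sample translated title"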

--- Jura 11:03, 11 December 2018 (UTC)

Research_Bot claims to do this under Maintenance Queries but I still see a lot of research papers with this issue. I might work on a script for this to try and figure out how to make a bot. Notme1560 (talk) 18:17, 21 March 2019 (UTC)
I have created a script for this task. source and permission request --Notme1560 (talk) 20:39, 23 March 2019 (UTC)
  • It seems there may be some 5 out of 220 needing this. --- Jura 17:22, 26 August 2019 (UTC)
  • Would still be worthwhile. --- Jura 21:19, 26 March 2020 (UTC)
  • --- Jura 09:34, 14 April 2021 (UTC)

Fix capitalization and grammar of Bosnian labels (2021-04-14)

Request date: 14 April 2021, by: Srđan

Link to discussions justifying the request
Task description

See: quarry:query/54093

Could you run the query once more? It should now show a lot fewer than the 418,824 items of April 14th. Edoderoo (talk) 15:05, 2 May 2021 (UTC)
@Edoderoo:: Sorry for the late reply. Just re-ran the query and it's sitting at 224,889 items. Definitely fewer than before, but still a lot to go. – Srđan (talk) 16:13, 8 May 2021 (UTC)

These are the descriptions that should be written in lowercase and slightly altered:

Srđan (talk) 08:37, 30 April 2021 (UTC)

Licence of data to import (if relevant)
Discussion
  • Here is a query: [9]. Maybe check if any bots still add more of them. --- Jura 09:55, 14 April 2021 (UTC)
Request process

Accepted by (Edoderoo (talk) 15:05, 2 May 2021 (UTC)) and under process

request to add Property:P9382 (2021-04-16)

Request date: 16 April 2021, by: 217.117.125.72

Task description

For all items with Unicode hex codepoint (P4213), add Unicode character name (P9382). A good source of data is the official Unicode site. I think that Unicode character (P487) can be used to add Unicode hex codepoint (P4213) and Unicode character name (P9382) if an item has it but lacks the other two properties. 217.117.125.72 08:28, 16 April 2021 (UTC)
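Python's unicodedata module already carries the official Unicode name table, so the names can be derived locally rather than scraped; note the table matches the Unicode version bundled with the Python build. A sketch (whether P4213 values carry the "U+" prefix is an assumption, so both forms are handled):

 import unicodedata

 def unicode_character_name(hex_codepoint):
     """Return the official Unicode name for a P4213 value like "U+0041"."""
     codepoint = int(hex_codepoint.removeprefix("U+"), 16)  # Python 3.9+
     try:
         return unicodedata.name(chr(codepoint))   # e.g. "LATIN CAPITAL LETTER A"
     except ValueError:
         return None   # unassigned, or a control character without a name

 assert unicode_character_name("U+0041") == "LATIN CAPITAL LETTER A"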

Discussion


Request process

request to import the Japanese-Canadian Artists Directory (2021-04-19)

Request date: 19 April 2021, by: Powell Street Festival Society

Link to discussions justifying the request

I have been tasked by the Powell Street Festival Society to upload to Wikidata a listing of Japanese-Canadian Artist information from the Japanese-Canadian Artists Directory.

Task description

I have worked through the various Wikidata steps to prepare the data to be imported. The data is in an Excel spreadsheet. It appears that I am on Step 6. I can provide a sample file with column headers to check that I have parsed the data properly.

Thank you for your attention with this request. I look forward to your response.

Regards, Michael

Licence of data to import (if relevant)
Discussion

@Powell Street Festival Society: What list of steps are you following? You don't necessarily need a bot to do this import. BrokenSegue (talk) 17:35, 19 April 2021 (UTC)

Hello BrokenSegue, I am new to this process (and to Wikidata) and have been following the steps in the "Data Import Guide". I have created the "Dataset Summary" and it appears I am on Step 7: Match the data to Wikidata (Option 2: self import). I could really use some help figuring this out. I am not even sure if I am replying properly :)

Request process

request to import data for "Cheung Chau Piu Sik Parade" (2021-05-06)

Request date: 6 May 2021, by: Hkbulibdmss

Link to discussions justifying the request
Task description

https://www.wikidata.org/wiki/Wikidata:Dataset_Imports/Cheung_Chau_Piu_Sik_Parade

Please help to import the dataset. The URL of a spreadsheet is : https://docs.google.com/spreadsheets/d/1iUVrHNsXVmn94IygtZYj0-foeUg9yvdOcwQ_V-CQbto/edit?usp=sharing

Licence of data to import (if relevant)
Discussion


Request process

request to fix parliamentary group = caucus, != party (2021-05-12)

Request date: 12 May 2021, by: Jura1

Link to discussions justifying the request
Task description
Discussion


Request process

request to .. (2021-05-12)

Request date: 12 May 2021, by: 41.113.246.57

Link to discussions justifying the request
Task description
Licence of data to import (if relevant)
Discussion


Request process