Wikidata talk:WikiProject Source MetaData


Author IDs

LensIDs currently offer a metarecord for publications, but they have author IDs and organization IDs as well under the hood. The author IDs will become publicly accessible in the next year.

If they end up both being called "LensID" but referring to different sorts of entities [works, people, organizations] would we want different records for the different facets of the identifier? Sj (talk) 18:55, 21 October 2021 (UTC)[reply]

  • @Sj: Lens ID (P7100) should probably be limited to works, and new properties created for other datatypes. However if the identifiers all fall within a single namespace and resolve the same way then perhaps it wouldn't hurt to just expand Lens ID (P7100) to cover all cases. But usually separate properties are the way to go in these cases. ArthurPSmith (talk) 21:16, 21 October 2021 (UTC)[reply]
    I believe that even internally they have separate namespace-facets, this is a fine Q to discuss with the ID maintainers. Thanks. Sj (talk) 22:20, 21 October 2021 (UTC)[reply]

They have different IDs for authors, e.g. see https://www.lens.org/lens/profile/280312543/scholar. The data is copied from ORCID and linked to Lens publications.

The problem is that in 3 years, only 20 IDs have been added, and even one of the samples is wrong. That is a shame, because Lens is one of the main SciKG sources.

Please move this section to the Lens ID page, thanks! Vladimir Alexiev (talk) 05:27, 26 January 2022 (UTC)[reply]

Letters to the editor

I saw you replaced scholarly article (Q13442814) with letter to the editor (Q651270) on Reply to Boslough et al.: Decades of comet research counter their claims (Q28661563).

That creates type constraint violations for articles with the properties PMCID (P932), PubMed ID (P698), and ResearchGate publication ID (P5875). Perhaps you could keep scholarly article (Q13442814) and add letter to the editor (Q651270) as genre (P136) instead? Trilotat (talk) 01:46, 14 January 2022 (UTC)[reply]

@Trilotat: This is a tricky one. I've actually done this for all of the YDIH related letters. My thinking was that they rarely go through much peer review so it's best to keep them separate. That, and it makes the Scholia profile graphs more useful (I know that's probably not a great way to look at it). Would a better option be to make a new entity for scholarly letters/replies and have them as a subclass of scientific publication (Q591041) or something similar? Really not too sure what to do here, any help would be much appreciated, cheers! Aluxosm (talk) 16:54, 15 January 2022 (UTC)[reply]
There are many - a very great many - scholarly articles that never went through much peer review. Please don't conflate "scholarly article" with "peer reviewed article". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:41, 24 January 2022 (UTC)[reply]
@Pigsonthewing: It's a good point; I've come across more than a few where it seemed as though the only person who read the article before publication was the original author! The en description for scholarly article (Q13442814) is, "article in an academic publication, usually peer reviewed", so all of the items I changed do still fit the description and are probably closer to that than they are to letter to the editor (Q651270). All in all, it wasn't my best reasoning, but I'm still not sure what to do; I think there should be some kind of distinction between an extensive scholarly article with dozens to hundreds of references and a single-page reply with only a handful. Do you have any thoughts on the idea of creating a new item (e.g. scholarly letter), or do you think that they should just have both scholarly article (Q13442814) and letter to the editor (Q651270) applied to them? Cheers! Aluxosm (talk) 15:00, 26 January 2022 (UTC)[reply]
The problem is that there is a continuum, it's not a binary issue, and has no clear point of delineation. Scholarly (or scientific) peer review is a relatively modern phenomenon; see en:Scholarly peer review which tells us, for example, that "Nature itself instituted formal peer review only in 1967.". The works of Linnaeus and Darwin were not peer reviewed. Many taxon names were first published in papers that we would not today consider peer reviewed. Clearly this needs more thought, and eventual consensus, and an improved model. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:48, 27 January 2022 (UTC)[reply]
@Pigsonthewing: Wow, I had no idea that the idea of peer review was so recent, thanks for the pointer! Sorry for not explaining myself particularly well again, I probably shouldn't have mentioned peer review. To clear up what I'm proposing, I created scholarly letter/reply (Q110716513) with the description: "article in an academic publication that focuses on another article, does not usually present any new evidence", and then changed all of the articles in question to that. Worst case, scholarly letter/reply (Q110716513) can just be merged back into scholarly article (Q13442814). Hope this makes a bit more sense! What do you reckon? Aluxosm (talk) 22:42, 27 January 2022 (UTC)[reply]

New wikibase for all scientific articles?

I recently found non-LOD metadata about 1.4M scientific papers from Swedish research institutions (Swepub). I talked to the universities, but they were not interested in leveraging their metadata to the LOD level.

I'm planning to disambiguate the authors and maybe the subjects in a proper Wikibase myself. When talking to @Harej: on Telegram, he asked if we should make a science.wikibase.cloud Wikibase for all articles. The current issues with the WDQS backend for Wikidata are probably not going away soon.

If we create a proper Wikibase, then we need to decide whether to use federated properties. Unfortunately, they are not working with Blazegraph at the moment, so I suggest we have our own properties to avoid issues with queries. That will make federated queries somewhat harder, e.g. for Scholia, but not impossible.

WDYT?
Notified participants of WikiProject Source MetaData and WikiProject Source MetaData/More. --So9q (talk) 20:19, 24 January 2022 (UTC)[reply]

Hm, could you elaborate on what the issues with the WDQS backend are? I am not sure why there is a need for another instance of Wikibase, and why we could not just add the data to Wikidata itself. Maybe I am missing something obvious? Mitar (talk) 22:38, 24 January 2022 (UTC)[reply]
@Mitar: Essentially, the backend is based on Blazegraph (Q20127748) which is no longer being actively maintained and the size of Wikidata is starting to become an issue. At the current rate the graph only has a couple of years left before becoming overwhelmed. One of the ideas to lessen the load was to split off all of the scholarly articles (see here). Hopefully this all works out for the best and doesn't cause too many headaches! Aluxosm (talk) 15:31, 26 January 2022 (UTC)[reply]
I would prefer if data is kept stored in Wikidata itself, and if necessary only the WDQS part is split out into a separate instance which allows querying the scholarly articles subgraph. So I would say: let's add all Swepub papers to Wikidata and if necessary we can move querying to another instance itself. Mitar (talk) 23:28, 26 January 2022 (UTC)[reply]
@Mitar: The data wouldn't be deleted from Wikidata, we'd just lose the ability to query some of it. The WDQS part is already separate, the problem is that it'd start crashing/just wouldn't work if the graph got too big. The tricky part is how to query all of that data, not how to store it. I somewhat agree though, it does all sound a bit worrying but I don't think it means that we should stop adding data. I just hope that the Wikimedia Foundation has this as a top priority! Aluxosm (talk) 17:51, 29 January 2022 (UTC)[reply]

This was a big question about a year ago, and many people disliked importing millions of articles before having the foundational data to ground them (e.g. subjects, institutions, journals). This was assuaged by demoting articles in WD search and autocompletion.

So right now I think people are happy to keep articles in WD, for the value of being able to link to all contextual items.

Your 1.2M won't add too much burden on top of the 40M existing :). And adding up to a million researchers is also OK, as long as you deduplicate them. Vladimir Alexiev (talk) 05:14, 26 January 2022 (UTC)[reply]

I think you might have missed the somewhat alarming status about the Wikidata Query Service at the end of 2021.
Even if we try to keep the existing corpus current, I don't think venturing into additional fields is something to do at the moment.
Maybe swepub could indeed be a good model to host them in a separate instance. --- Jura 15:43, 26 January 2022 (UTC)[reply]

Manually adding CiTO annotations

For the uninitiated, here's some links to useful information about Citation Typing Ontology (Q44955364) annotations:

A recent paper by James L. Powell (Q16104533) (Premature rejection in science: The case of the Younger Dryas Impact Hypothesis (Q110444998)) contains a list of articles that cite An independent evaluation of the Younger Dryas extraterrestrial impact hypothesis (Q24642105) as authority. I went ahead and manually added CiTO annotations to the ones he mentioned, and added a query that generates a similar list to WikiProject Younger Dryas impact hypothesis.

I think it's a great use case for CiTO but it got me wondering: is adding these annotations manually (after publication) an acceptable thing to do? The level of misuse would probably be very low but mistakes could happen, especially for unreferenced statements. I personally think that the benefits far outweigh the potential risks but I wanted to hear from others before going too far and adding any more. Thanks in advance for any thoughts on this! Aluxosm (talk) 19:38, 26 January 2022 (UTC)[reply]
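For readers who want to generate a similar list, here is a minimal Python sketch that only builds a SPARQL query string and a query service URL; it sends no request. It assumes that "cites work" is P2860 and that the "objective of project or action" qualifier is P3712. The function names are hypothetical, and this is not necessarily the query used on the WikiProject page.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
_QID = re.compile(r"Q[1-9]\d*")


def _check_qid(qid: str) -> str:
    """Return qid unchanged if it looks like an item ID, else raise."""
    if not _QID.fullmatch(qid):
        raise ValueError(f"not an item ID: {qid!r}")
    return qid


def build_cito_query(cited_qid: str, intent_qid: Optional[str] = None) -> str:
    """Build a query for articles citing cited_qid with a P3712 qualifier.

    Assumption: P2860 = cites work, P3712 = objective of project or action.
    If intent_qid is given, only that citation intent is returned.
    """
    lines = [
        "SELECT ?article ?articleLabel ?intent ?intentLabel WHERE {",
        "  ?article p:P2860 ?statement .",
        f"  ?statement ps:P2860 wd:{_check_qid(cited_qid)} ;",
        "             pq:P3712 ?intent .",
    ]
    if intent_qid is not None:
        lines.append(f"  VALUES ?intent {{ wd:{_check_qid(intent_qid)} }}")
    lines.append(
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }'
    )
    lines.append("}")
    return "\n".join(lines)


def build_wdqs_url(query: str) -> str:
    """Return a GET URL for the query; fetching it is left to the caller."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_cito_query("Q24642105")
print(query)
print(build_wdqs_url(query))
```

Passing a second item ID restricts the list to one citation intent, which is how a "cites as authority" list could be separated from other annotation types.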

@Aluxosm: thanks for doing this and asking this question. I have been thinking about this too, and have a personal collection of post-publication CiTO annotations as well. I think the key is just to allow people to track the authority of the source of the annotation. So, I would suggest uploading the annotation to Figshare (Q17013516), Zenodo (Q22661177), or similar, and citing that as the reference. The metadata in that data repository provides the info people need to decide on the history of the data and how to interpret it. This is the approach I want to take too, but I have not found time for it yet. --Egon Willighagen (talk) 06:51, 27 January 2022 (UTC)[reply]
@Egon Willighagen: No worries, glad to hear I'm on the right track! I totally agree that some kind of reference is needed when the annotations don't originate from the source. For the ones I mentioned above, I referenced Powell (2022).
For example, on No evidence of nanodiamonds in Younger-Dryas sediments to support an impact event (Q24606726), I added:

cites work: An independent evaluation of the Younger Dryas extraterrestrial impact hypothesis
  objective of project or action: cites as authority
  reference:
    stated in: Premature rejection in science: The case of the Younger Dryas Impact Hypothesis
    quotation: "Instead, as shown in Table 2, right up to the present day many scientists have embraced the results of Surovell et al. to cast doubt on the hypothesis."


I think I see what you're saying about uploading then citing Zenodo, but I'm not sure how efficient that would be if you only wanted to add a single annotation at a time. Could a simpler alternative (that would allow adding these one-by-one) look something like the following?
With these statements going on Premature rejection in science: The case of the Younger Dryas Impact Hypothesis (Q110444998):

cites work: An independent evaluation of the Younger Dryas extraterrestrial impact hypothesis
  objective of project or action: critiques
  reference:
    based on heuristic: inferred from prose
    quotation: "Since Firestone et al. showed dispositive photographic evidence that the microspherules exist at the YDB at Blackwater Draw and the other sites, we can only conclude that Surovell et al. failed to sample the YDB and/or erred in their procedures."
    (↑ try not to quote more than is necessary to verify the claim)


Apologies if I've misunderstood what you're saying, thanks for the help! Aluxosm (talk) 12:08, 27 January 2022 (UTC)[reply]
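To make the one-by-one option concrete, here is a minimal Python sketch that formats a single annotated "cites work" statement as one QuickStatements (V1, tab-separated) line. It rests on several assumptions: cites work is P2860, objective of project or action is P3712, stated in is P248, and quotation (P1683) takes monolingual text. The item ID for "cites as authority" was not looked up, so a clearly marked placeholder stands in for it, and the helper names are hypothetical.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]

# Placeholder only: replace with the real item ID for "cites as authority".
INTENT_QID_PLACEHOLDER = "Q_INTENT"


def monolingual(text: str, lang: str = "en") -> str:
    """Format a monolingual text value as lang:"text".

    Design choice for this sketch: reject text containing double quotes,
    tabs or newlines instead of guessing at an escaping rule.
    """
    if any(ch in text for ch in '"\t\n'):
        raise ValueError("text must not contain double quotes, tabs or newlines")
    return f'{lang}:"{text}"'


def quickstatements_line(
    subject: str,
    prop: str,
    value: str,
    qualifiers: Iterable[Pair] = (),
    references: Iterable[Pair] = (),
) -> str:
    """Join one statement, its qualifiers and its reference into a V1 line.

    Qualifier properties keep their P prefix; reference properties are
    written with an S prefix (P248 becomes S248).
    """
    fields = [subject, prop, value]
    for q_prop, q_value in qualifiers:
        fields += [q_prop, q_value]
    for r_prop, r_value in references:
        if not r_prop.startswith("P"):
            raise ValueError(f"not a property ID: {r_prop!r}")
        fields += ["S" + r_prop[1:], r_value]
    return "\t".join(fields)


quote = (
    "Instead, as shown in Table 2, right up to the present day many "
    "scientists have embraced the results of Surovell et al. to cast "
    "doubt on the hypothesis."
)
line = quickstatements_line(
    "Q24606726",
    "P2860",
    "Q24642105",
    qualifiers=[("P3712", INTENT_QID_PLACEHOLDER)],
    references=[("P248", "Q110444998"), ("P1683", monolingual(quote))],
)
print(line)
```

Each generated line carries its own reference, so annotations can be added individually without first publishing a dataset elsewhere.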
The above solution is very clear in how the statement is supported. It touches on the discussion of whether Wikidata is a primary source or a secondary source. The above looks great to me! The only advantage of putting the data on Zenodo is that there you can further detail how the statements were pulled together. --Egon Willighagen (talk) 06:56, 28 January 2022 (UTC)[reply]
@Egon Willighagen: Nice! Could you ping me if you find the time to do one of yours, I'm interested to see more about how you would implement the Zenodo option. Thanks! Aluxosm (talk) 17:05, 29 January 2022 (UTC)[reply]
This does not appear to be supported by Help:Sources. --- Jura 19:58, 28 January 2022 (UTC)[reply]
@Jura1: Hmmm, do you mean because of this: "In some cases sources are not required: When the item itself is a source for a statement." I've taken that as more of a guideline, and in the few cases where I've needed to reference the item itself I've just left out the stated in statement as it's implied. If you meant to talk about the first example statement, I guess it wouldn't be needed if we could reference the statement as I did in the second. Feel like I might have got lost here 😬. Aluxosm (talk) 17:05, 29 January 2022 (UTC)[reply]
  • Unsure about importing quotations: I love the project. There are lots of comments that I could make, but for now, I encourage you to go forward with modeling examples. One aspect which I question, though, is importing the quotations. Wikidata uses CC0 copyright licensing and this has to apply globally. There is no provision for fair use or exceptions about copyright. Since Wikidata is global, the copyright needs to be open according to the rules of every country. I am not sure what precedent already exists, but I think importing quotations like this is always problematic and often not permissible. I could be off. Please ask around for precedents elsewhere. As an alternative, you can still say that the determination is from a quotation, but actually copying the quotation into Wikidata may not work. If you do this anyway, and there is a determination that this is not allowed, then there may be mass deletion of the quotations. Please get other opinions as I may be incorrect about best practices. Bluerasberry (talk) 20:58, 15 February 2022 (UTC)[reply]
@Bluerasberry: Are you suggesting that quotes should only be taken from public domain sources? I did check before I started using quotation (P1683) elsewhere (due to the same concerns) but there were/are no restrictions listed on the talk page and the advice at Wikidata:Verifiability seems to suggest that fair use does apply. For reference, the longer quote in the second example above is 41 words (out of the suggested 200 max). I could see this potentially being an issue for single page articles with lots of references (you could end up replicating a significant proportion if not careful), but other than that, this should be okay. It's a fair concern though, keeping the quote to a minimum and only using them when needed is good advice. Aluxosm (talk) 06:42, 16 February 2022 (UTC)[reply]
@Aluxosm: I think everyone should be wary of any quotations. Wikidata is a CC0 project and while that usually is equivalent to public domain, there are circumstances when it is not. Some German Wikidata editors are fond of saying that in Germany there is no public domain, so they call for CC0. No one wants to slow down your project but at the same time, I would not want you to put time into something that may be deleted. If there is a need to import sentences or phrases then probably we should organize a general discussion about the Wikidata policy on quotations for this and other use cases. Or perhaps that discussion already exists somewhere, and there is already an answer. Bluerasberry (talk) 12:18, 16 February 2022 (UTC)[reply]
@Bluerasberry: Not sure how much more I can add to be honest, the current guidelines on quotations align with my thoughts. Requiring quotes to be of the same license as the Wiki they're added to would hobble every single Wikimedia project; this is definitely one for the lawyers! There are currently over 100,000 uses of quotation (P1683) so it might be worth opening a discussion on the property's talk page if you think there could be an issue here. Pinging Thepwnco, who wrote the guidelines, and Matěj Suchánek, who marked them as 'outdated'. Cheers! Aluxosm (talk) 13:21, 16 February 2022 (UTC)[reply]
I think I marked it as outdated because it referred to the deleted property P387 and mentioned some related configuration. --Matěj Suchánek (talk) 13:57, 18 February 2022 (UTC)[reply]

WikiProject redesign, cleanup, icon?

Please no one have high expectations or want quick reactions, but I am thinking about cleaning up this WikiProject with a redesign. Since there are lots of participants here I thought I would post to the talk page first.

Tabs

I am thinking of setting up this page with tabs. It will probably look like this

Icon

Lots of WikiProjects have some image which represents the WikiProject. I like images because I think they make people remember or recognize the project better, especially since on Wikidata we have long-term users who may only visit projects yearly or every few years. Seeing a familiar image helps them remember it. Also pictures are fun.

I like these "source" icons from nounproject.com. They all have Wikimedia compatible licenses. Check out any of the images there and propose a favorite.

My favorites are

Thoughts from anyone? Bluerasberry (talk) 20:43, 15 February 2022 (UTC)[reply]

 Support here. I like your second "favorite" icon suggestion. ArthurPSmith (talk) 21:10, 15 February 2022 (UTC)[reply]

Chaotic end to the activities of WikiCite and WikiProject Source MetaData

Conversations happen in lots of places and somehow some important ones missed this talk page. I do not know who has a brief and accurate explanation. Here is a brief oversimplified explanation which might be close enough: the Wikimedia Foundation says that Blazegraph (Q20127748) has reached its limit, Wikidata is full and people have to quit adding content and querying it, and the problem is contributors to WikiCite and WikiProject Source MetaData. By stopping WikiProject Source MetaData, the Wikimedia Foundation will gain 2 years to find and design an alternative solution. I am not sure what happens after that.

Somewhere there is the proposal of what the Wikimedia Foundation is going to do next. They presented at Wikimania and also have a page somewhere. I forgot where. Does anyone have the link?

Of course I care a lot, but for typical Wikidata editors, here is my own suggestion: if you are making fewer than 1 million edits, these changes may not affect you much, but be aware that some bigger projects are paused. Many big projects were paused for years before this anyway. Also feel encouraged to continue to do data modeling, because we still need examples and best practice recommendations for all sorts of source metadata.

Bluerasberry (talk) 22:19, 15 February 2022 (UTC)[reply]

@Bluerasberry: There's a section above where this was discussed (#New wikibase for all scientific articles?) which includes a link to the Blazegraph failure playbook. Could you point to an official statement from the WMF calling for this project to be shut down? Aluxosm (talk) 06:59, 16 February 2022 (UTC)[reply]
@Aluxosm: There is a WMF statement but I forgot where it is. There is a YouTube video explaining the problem and then the published statement either in Wikidata or on Meta. Let me look more, or maybe someone else here knows. Bluerasberry (talk) 12:13, 16 February 2022 (UTC)[reply]
@Bluerasberry: There was a panel discussion at WikidataCon 2021 and lots of talk of plans but I haven't seen anything to suggest that they needed to be implemented immediately. Shutting Wikicite down would be a pretty big deal for a large number of contributors/users; not really the sort of news that would be passed on via the grapevine (no offense and apologies if this is actually the case). Aluxosm (talk) 13:36, 16 February 2022 (UTC)[reply]
Ah, it is that Blazegraph failure playbook. That panel discussion video is not the one with the WMF presentation on the failure playbook, but yes, that is the right issue and that is all the statement we have. I do not think there is an implementation statement saying when and exactly how all this will happen. Bluerasberry (talk) 22:02, 16 February 2022 (UTC)[reply]
The Wikimedia Foundation has an open event on this tomorrow: Wikidata:SPARQL query service/Feb 2022 scaling community meetings. Bluerasberry (talk) 22:03, 16 February 2022 (UTC)[reply]
If you think about the numbers, 40 million articles versus a few thousand per week added now, a closure of SourceMD would be barking up the wrong tree IMHO. --SCIdude (talk) 10:33, 17 February 2022 (UTC)[reply]

Importing from OpenAlex

Hi. I just posted this in the Wikidata chat in Telegram:

Related to the issues concerning BlazeGraph there is a new thread here https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Source_MetaData#Chaotic_end_to_the_activities_of_WikiCite_and_WikiProject_Source_MetaData
I participated in the WDQS scaling community call yesterday and I invite anyone interested to join the next call.
Meanwhile I'm continuing my work on my new bot with the goal of importing 20M+ articles into Wikidata from OpenAlex now that we have a disaster plan and don't have to make fear-based decisions. If BG breaks, WMF simply cuts out the scientific articles from WDQS according to the plan.
Anyone can set up a Wikibase, import a part of Wikidata and make it possible to run SPARQL queries on the scientific items, and I predict someone will do it within a month of the disaster plan being executed.
I will post the request for botflag here once it is ready.

The code is here https://github.com/dpriskorn/OpenAlexBot --So9q (talk) 08:09, 18 February 2022 (UTC)[reply]

Here is the request https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/OpenAlexBot --So9q (talk) 15:00, 23 February 2022 (UTC)[reply]
@So9q: It looks like you are lower-casing DOIs instead of upper-casing them? All DOIs in Wikidata right now are upper-case, and you will not find matches with WDQS (or, I think, haswbstatement) if you have the wrong case. ArthurPSmith (talk) 17:30, 24 February 2022 (UTC)[reply]
In the Wikicite group @Harej suggested we lowercase them all (in Wikidata). I use CirrusSearch which is based on Elasticsearch which has case-handling built in. Compare [1] and [2] :) So9q (talk) 08:48, 25 February 2022 (UTC)[reply]
I want to clarify that although that is my personal opinion, as I understand there is currently consensus to capitalize DOIs in Wikidata, and drift away from this has been accidental (and largely a product of inconsistent enforcement). Harej (talk) 17:06, 25 February 2022 (UTC)[reply]
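As an illustration of the case issue discussed above, here is a minimal Python sketch that normalizes a DOI to upper case before building an exact-match search string. It assumes DOI is property P356 and that matching goes through the haswbstatement search keyword; the function names are hypothetical and this is not taken from OpenAlexBot.

```python
# Common DOI prefixes that are stripped before comparison.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Strip whitespace and a resolver prefix, then upper-case the DOI.

    DOIs are case-insensitive, so upper-casing changes nothing about
    which object is identified; it only matches the convention described
    in this thread for values stored in Wikidata.
    """
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.upper()


def haswbstatement_query(doi: str) -> str:
    """Build an exact-match search string (assumption: P356 is DOI)."""
    return f"haswbstatement:P356={normalize_doi(doi)}"


print(normalize_doi("https://doi.org/10.1000/abc.Def"))
print(haswbstatement_query("10.1000/abc"))
```

Normalizing on the importer's side keeps lookups consistent regardless of which case an upstream source such as OpenAlex delivers.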

University adding portraits

This project collects a lot of academic publications, and because of that makes structured data for authors. Until now I do not think we have an example of an organization which has tried to give us an image collection of their researchers, but here is one -

Bluerasberry (talk) 20:19, 7 March 2022 (UTC)[reply]

Wikidata software profiling hackathon, June 6&8

Those interested in software + Wikidata are invited to the Scholia Hackathon 6&8 June 2022.

WD:Scholia is a Wikidata front end which does scholarly profiling, and is best known as a tool for browsing the WikiCite collection of WD:WikiProject Source Metadata.

An example Scholia profile for the software Stata (Q1204300) is

Anyone interested in examining any part of Wikidata connecting to software is welcome. Bluerasberry (talk) 20:40, 19 May 2022 (UTC)[reply]