Wikidata:WikiProject LD4 Wikidata Affinity Group/Affinity Group Calls/Meeting Notes/2020-08-11

Call details

Scholia: scholarly profiling system
Uses Wikidata to find what Wikidata holds on a given topic or person
Can find papers that authors have written
Pre-populates scholarly profiles based on Wikidata queries
Complements Wikipedia and can be used independently to browse academic literature
Only presents data loaded into Wikidata
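
To illustrate the kind of Wikidata query that could sit behind a pre-populated author profile, here is a minimal sketch. It is not Scholia's actual code: the function names are hypothetical, Q42 in the usage line is only a placeholder item ID, and the Wikidata specifics assumed are property P50 (author), property P577 (publication date), and the public query endpoint at https://query.wikidata.org/sparql.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def works_by_author_query(author_qid: str, limit: int = 50) -> str:
    """Build a SPARQL query listing works whose author (P50) is the given item."""
    # Accept only plain item IDs such as "Q42" so nothing else is spliced in.
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return f"""SELECT ?work ?workLabel ?date WHERE {{
  ?work wdt:P50 wd:{author_qid} .
  OPTIONAL {{ ?work wdt:P577 ?date . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY DESC(?date)
LIMIT {limit}"""


def query_url(sparql: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


# Placeholder item ID; nothing is sent over the network here.
print(query_url(works_by_author_query("Q42", limit=5)))
```

Because the query only matches statements that exist, a profile built this way shows exactly what has been loaded into Wikidata and nothing more.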

Compare to Google Scholar
- Not open data; no data export
- Doesn’t encourage curation
Compare to Elsevier
- Can export content, but can’t put it on the open web
- Encourages curation, but the labor goes toward their own product
Wikipedia
- Designed for the end user, in people’s best interest
- Doesn’t need to worry about competitors or about preventing reuse of data
- Fundamental access to content should not be a marketplace
  - Curation is where the marketplace should come in

Much curation happens at the Wikidata level
Doesn’t have entirety of academic source metadata (yet)
Scholia can build author networks wherever there is a free and open identifier
Connecting authors of papers to where they got their degree to demonstrate impact of universities
- Awards that they have won
Curation workflows
- Topic tagging
- Author disambiguation
  - Anyone can disambiguate as they see fit
  - Asking for list of faculty from every university in the world
- Ontology development
  - WikiProject Lighthouses as example of how to do Wikidata
  - Documentation for developing Wikidata items in one area, e.g. sports, can apply to other domains, e.g. clinical trials
- Subject affiliation
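
As a rough illustration of the university-impact idea above, the sketch below counts distinct authors per alma mater from query results. It assumes the rows arrive in the standard SPARQL JSON results shape (a list of bindings, each variable mapped to an object with a "value" key) and that the query behind them joined papers to authors (P50) and authors to where they were educated (P69). The function name, the variable names "author" and "university", and the example.org URIs are all hypothetical.

```python
from collections import defaultdict


def authors_per_university(bindings):
    """Count distinct authors per university from SPARQL JSON bindings.

    Each binding is expected to carry "author" and "university" variables;
    rows missing either one are skipped rather than guessed at.
    """
    seen = defaultdict(set)
    for row in bindings:
        if "author" not in row or "university" not in row:
            continue
        seen[row["university"]["value"]].add(row["author"]["value"])
    return {university: len(authors) for university, authors in seen.items()}


# Hypothetical sample rows; one row per (paper, author, university) match,
# so the same author can appear several times.
sample = [
    {"author": {"value": "http://example.org/author/A"},
     "university": {"value": "http://example.org/uni/X"}},
    {"author": {"value": "http://example.org/author/A"},
     "university": {"value": "http://example.org/uni/X"}},
    {"author": {"value": "http://example.org/author/B"},
     "university": {"value": "http://example.org/uni/X"}},
    {"author": {"value": "http://example.org/author/B"},
     "university": {"value": "http://example.org/uni/Y"}},
    {"author": {"value": "http://example.org/author/C"}},
]
print(authors_per_university(sample))
```

Authors are collected in a set so that a prolific author is counted once per university. Awards (P166) could be tallied the same way by swapping the grouping variable.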

No workflows entirely automatic
Can try Scholia on different document types, e.g. Swedish parliamentarians--fine tuning author profile for this group
Scholia not aware of full-text--just metadata; can have annotations and links to supplementary data
Scholia hides complexity of creating SPARQL queries
- Front end to Wikidata
- Could use this concept for other areas
Documentation?
- Kind of neglected
Documented workflows for repositories, DSpace for instance?
- Would welcome these
Could Scholia query info about books cited in Wikipedia?
- Book info would need to be in Wikidata, which currently does not have much
- Patent corpus citing Wikipedia articles
  - Could be modeled in Wikidata
Is there any objection to the Internet Archive adding info to Wikidata about all the books we have added links to from Wikipedia articles?
- Yes, interested
- Would this be loading info about books or about volume scans?
  - Would try to load info for specific edition referenced
  - Have added links to 200,000 books so far--will load up Wikibase with 200,000 books
  - Would like to load up to 11 million
  - Will link to corresponding item in Wikidata
  - Daniel: Start with a few and see how it goes. Scale up and then see if anyone complains
Harvesting from individual repositories?
- Harvesting from aggregators difficult
- Depends on content license
- Do harvest from PubMed Central
Is there a way to represent theses and dissertations and their authors in Wikidata?
- Yes
- Can use for seeding for future articles
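
One possible way to represent a thesis and its author, sketched as a small generator of QuickStatements (version 1, tab-separated) commands. This is an illustration under stated assumptions, not a recommended data model: the function name is hypothetical, Q42 is only a placeholder author ID, and the identifiers P31 (instance of), P50 (author), and Q187685 (doctoral thesis) should be checked against Wikidata before any real import.

```python
def thesis_quickstatements(title: str, author_qid: str) -> str:
    """Return QuickStatements V1 commands that create one thesis item."""
    # V1 string values are wrapped in double quotes, so reject titles
    # containing them instead of guessing at an escaping rule.
    if '"' in title:
        raise ValueError("title must not contain double quotes")
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    lines = [
        "CREATE",                      # start a new item
        f'LAST\tLen\t"{title}"',       # English label
        "LAST\tP31\tQ187685",          # instance of: doctoral thesis (assumed QID)
        f"LAST\tP50\t{author_qid}",    # author
    ]
    return "\n".join(lines)


# Placeholder values; the output is text to review, not an automatic upload.
print(thesis_quickstatements("An Example Thesis", "Q42"))
```

Keeping the output as reviewable text fits the point made earlier in the notes that no workflow is entirely automatic.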

Showcase

Other insights come out when data is loaded
Example: VanderBot by Vanderbilt University (see article link in announcements)
Privacy legislation may complicate matters, especially in Europe, for living scholars
Can also include awards
- Go to institutions that grant the awards for lists
- Can show who went to university and became office holder, actor, etc.
Gender distribution can be demonstrated if those tags are used
- Social implications not fully determined; Scholia is being cautious on the whole
- Some users request removal of gender tag from their article
Imported ClinicalTrials.gov to Wikidata