Wikidata:WikiProject LD4 Wikidata Affinity Group/Affinity Group Calls/Meeting Notes/2020-08-11
Call details
- Date: 2020-08-11
- Topic: Scholia
- Presenters: Daniel Mietchen and Lane Rasberry
Presentation material
Notes
Overview
- Scholarly profiling system
- Uses Wikidata to find what Wikidata holds on a given topic or person
- Can find papers that authors have written
- Pre-populates scholarly profiles based on Wikidata queries
- Complements Wikipedia and can be used independently to browse academic literature
- Only presents data loaded into Wikidata
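As an illustration of the kind of Wikidata query that can sit behind an author profile, here is a minimal sketch in Python. It is not Scholia's actual code. It assumes the public Wikidata Query Service endpoint and the Wikidata properties P50 (author) and P577 (publication date); the function names are made up for this example.

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (assumed here as the query target).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def works_by_author_query(author_qid: str, limit: int = 50) -> str:
    """Return a SPARQL query listing works whose author (P50) is author_qid."""
    if not re.fullmatch(r"Q[1-9][0-9]*", author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return (
        "SELECT ?work ?workLabel ?date WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        # Publication date (P577) is optional: not every work has one.
        "  OPTIONAL { ?work wdt:P577 ?date . }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        "ORDER BY DESC(?date)\n"
        f"LIMIT {int(limit)}"
    )


def query_url(sparql: str) -> str:
    """Build a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
```

Calling query_url(works_by_author_query("Q42")) gives a URL that can be opened in a browser or fetched by a script. Only works already linked to the author item in Wikidata will show up, which matches the point that Scholia only presents data loaded into Wikidata.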
Wikicite
- Subset of Wikidata that deals with source data
- Zika corpus
- Challenging to profile papers when there is a huge volume of them
- Curation workflow for papers
- Topic tagging
- Some communities are interested in different subsets of tagging, e.g. maternal health, business, or education
- Visualize topic tags
- Publications per year: find when a topic became popular
- Locations mentioned in paper
- Locations of institutions associated with authors
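The publications-per-year view mentioned above comes down to counting works by publication year. A minimal sketch, assuming the dates arrive as ISO-style strings such as "2016-02-10T00:00:00Z" (the form the Wikidata Query Service typically returns for dates); the function name is hypothetical and the sample dates in the usage note are placeholders, not real results.

```python
from collections import Counter


def publications_per_year(dates):
    """Count works per publication year from ISO-style date strings.

    Entries that are empty, None, or do not start with a four-digit
    year are skipped rather than guessed at.
    """
    years = Counter()
    for d in dates:
        if d and len(d) >= 4 and d[:4].isdigit():
            years[int(d[:4])] += 1
    # Return years in ascending order so the result plots directly.
    return dict(sorted(years.items()))
```

For example, publications_per_year(["2015-11-01T00:00:00Z", "2016-02-10T00:00:00Z", "2016-07-04T00:00:00Z"]) returns {2015: 1, 2016: 2}. A jump in the counts from one year to the next is the signal for when a topic became popular.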
Why Scholia matters
- Uses and frees open data
- Free and accessible on the web
- Develops with Wiki platform
- Anyone editing Wiki can contribute
- Strong ethical foundation
- Radically inclusionary and diverse
- Scholia provides a useful service
- Orientation to academic literature
- For researchers and lay audiences
- Largest free and open FAIR option
- Native multilingual support
- Integration with Wikipedia?
- Compare to Google Scholar
- Not open data, no data export
- Doesn’t encourage curation
- Compare to Elsevier
- Can export content, but can’t put on open web
- Encourages curation, but labor goes toward their own product
- Wikipedia
- Designed for the end user, in people’s best interest
- Don’t need to worry about competitors and preventing reuse of data
- Fundamental access to content should not be a marketplace
- Curation is where the marketplace should come in
Data and curation process
- Much curation happens at the Wikidata level
- Wikidata doesn’t have the entirety of academic source metadata (yet)
- Scholia can build author networks wherever there is a free and open identifier
- Connecting authors of papers to where they got their degree to demonstrate impact of universities
- Awards that they have won
- Curation workflows
- Topic tagging
- Author disambiguation
- Anyone can disambiguate as they see fit
- Asking for list of faculty from every university in the world
- Ontology development
- WikiProject Lighthouses as example of how to do Wikidata
- Documentation for developing Wikidata items around certain areas, e.g. sports, can apply to other domains, e.g. clinical trials
- Subject affiliation
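Connecting authors to where they studied can be expressed as one more Wikidata query. The sketch below is an illustration only, not Scholia's implementation. It assumes the Wikidata properties P69 (educated at) and P50 (author); the function name is made up, and the institution ID is supplied by the caller. Awards could be added in the same way through P166 (award received).

```python
import re


def alumni_output_query(institution_qid: str, limit: int = 20) -> str:
    """Return a SPARQL query ranking an institution's alumni by number of works."""
    if not re.fullmatch(r"Q[1-9][0-9]*", institution_qid):
        raise ValueError(f"not a Wikidata item ID: {institution_qid!r}")
    return (
        "SELECT ?author ?authorLabel (COUNT(DISTINCT ?work) AS ?works) WHERE {\n"
        # People educated at (P69) the given institution ...
        f"  ?author wdt:P69 wd:{institution_qid} .\n"
        # ... who are linked as author (P50) on at least one work.
        "  ?work wdt:P50 ?author .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        "GROUP BY ?author ?authorLabel\n"
        "ORDER BY DESC(?works)\n"
        f"LIMIT {int(limit)}"
    )
```

The result depends entirely on how well authors have been disambiguated: works that only carry an author name as plain text, without a link to the author's item, are not counted.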
Questions/Discussion
- No workflows are entirely automatic
- Can try Scholia on different document types, e.g. Swedish parliamentarians, by fine-tuning the author profile for this group
- Scholia is not aware of full text, only metadata; can have annotations and links to supplementary data
- Scholia hides complexity of creating SPARQL queries
- Front end to Wikidata
- Could use this concept for other areas
- Documentation?
- Kind of neglected
- Documented workflows for repositories, DSpace for instance?
- Would welcome these
- Could Scholia query info about books cited in Wikipedia?
- Book info would need to be in Wikidata, which currently does not have much
- Patent corpus citing Wikipedia articles
- Could be modeled in Wikidata
- Is there any objection to the Internet Archive adding to Wikidata info about all the books we have added links to from Wikipedia articles?
- Yes, interested
- Would this load info about books or about volume scans?
- Would try to load info for specific edition referenced
- Have added links to 200,000 books so far--will load up Wikibase with 200,000 books
- Would like to load up to 11 million
- Will link to corresponding item in Wikidata
- Daniel: Start with a few and see how it goes. Scale up and then see if anyone complains
- Harvesting from individual repositories?
- Harvesting from aggregators difficult
- Depends on content license
- Do harvest from PubMed Central
- Is there a way to represent theses and dissertations and their authors in Wikidata?
- Yes
- Can use for seeding for future articles
Showcase
- Other insights come out when data is loaded
- Example: VanderBot by Vanderbilt University (see article link in announcements)
- Privacy legislation may complicate matters, especially in Europe, for living scholars
- Can also include awards
- Go to institutions that grant the awards for lists
- Can show who went to university and became office holder, actor, etc.
- Gender distribution can be demonstrated if those tags are used
- Social implications are not fully determined; Scholia is being cautious on the whole
- Some users request removal of gender tag from their article
- Imported ClinicalTrials.gov to Wikidata
- Questions
- Is data clean up at ORCID necessary?
- Make use of ORCID if useful, but that is not often the case, and it has lots of errors and duplicates
- Need bot with human oversight--previously bot was halted due to data quality issues
- https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/Orcbot
- Can use LC Authority Names and VIAF for disambiguation
- How to do disambiguation--Author Disambiguator Tool
- Go to profile and add /missing to profile web address, you’ll be given link to places to contribute
- Opening your data = contributing
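The /missing tip from the disambiguation notes can be sketched as a small URL helper. This is an illustration under assumptions: the host scholia.toolforge.org and the aspect names listed in the code are assumed here rather than taken from the call, and the notes only mention /missing for profile pages, so whether every aspect offers one is not confirmed.

```python
import re

# Assumed Scholia host; adjust if the service is reached under another address.
SCHOLIA_BASE = "https://scholia.toolforge.org"

# Assumed page types ("aspects"); only author profiles are mentioned in the notes.
ASPECTS = {"author", "work", "topic", "organization", "venue"}


def scholia_url(aspect: str, qid: str, missing: bool = False) -> str:
    """Build a Scholia page URL, optionally pointing at its /missing page."""
    if aspect not in ASPECTS:
        raise ValueError(f"unknown aspect: {aspect!r}")
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    url = f"{SCHOLIA_BASE}/{aspect}/{qid}"
    # Appending /missing leads to suggestions on where to contribute.
    return url + "/missing" if missing else url
```

For example, scholia_url("author", "Q42", missing=True) returns "https://scholia.toolforge.org/author/Q42/missing". Q42 is used only as a well-known placeholder item ID.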