Wikidata:WikiProject LD4 Wikidata Affinity Group/Affinity Group Calls/Meeting Notes/2023-04-04


Call details

Presentation material

Presentation

Notes

  • Mix’n’match https://mix-n-match.toolforge.org
    • background
      • rivals Wikidata in number of entries (~100 million)
      • one catalog per external source; ID and name are required, with additional optional metadata
      • entries can be fully automatically matched to Wikidata, preliminarily matched, or unmatched
      • there is a game function for matching
    • getting data into Mix’n’Match
      • can import csv
      • can define a website scraper to extract key info from a website (this can be re-run at regular intervals to automatically update the catalog)
      • bespoke code (for data sources that are available on the web but cannot be scraped by defining regular expressions)
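The CSV import path above can be sketched as a small script that writes a tab-separated catalog file; the exact column layout Mix'n'match expects is an assumption here (per the notes, only the external ID and name are required, with further metadata optional):

```python
import csv
import io

# Sketch of preparing a Mix'n'match catalog import from source records.
# The tab-separated id/name/description layout is an assumption; only
# the external ID and name are required fields per the notes.

rows = [
    {"id": "abc123", "name": "Ada Lovelace", "desc": "mathematician, 1815-1852"},
    {"id": "xyz789", "name": "Alan Turing", "desc": "computer scientist, 1912-1954"},
]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t")
for row in rows:
    writer.writerow([row["id"], row["name"], row["desc"]])

print(buf.getvalue())
```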
    • matching
      • automatch for people if name and birth/death dates are the same
      • can also match if an ID (for example, a VIAF ID) is already on Wikidata
      • option to edit Wikidata directly from Mix'n'match; catalogs can also be downloaded from Mix’n’match
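The ID-based matching described above can be approximated as a SPARQL lookup against the Wikidata Query Service; P214 is Wikidata's property for VIAF IDs, while the helper name and query shape are illustrative, not Mix'n'match's actual implementation:

```python
# Sketch of ID-based matching: find Wikidata items already carrying a
# given external ID. P214 (VIAF ID) is a real Wikidata property; the
# helper name is illustrative.

def build_id_lookup_query(prop: str, external_id: str) -> str:
    """Return a SPARQL query for items whose `prop` equals `external_id`."""
    return (
        "SELECT ?item WHERE { "
        f'?item wdt:{prop} "{external_id}" . '
        "} LIMIT 5"
    )

# The resulting query could be POSTed to https://query.wikidata.org/sparql
# with Accept: application/sparql-results+json.
query = build_id_lookup_query("P214", "113230702")
print(query)
```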
  • Authority Control data to Wikidata (AC2WD) https://ac2wd.toolforge.org/
    • API that can read various sources in MARC format to enrich Wikidata w/ authority control data
    • can create a new item or enhance an existing item from source
    • available as a sidebar tool on Wikidata (single-click)
  • Reasonator https://reasonator.toolforge.org/
    • a “pretty” view on Wikidata
    • includes a linked data sidebar
    • currently has a known bug
  • Resolver https://wikidata-todo.toolforge.org/resolver.php
    • takes AC data as input
    • redirects to the matching Wikidata item or Wikipedia page (if any)
    • designed as an entry point for linking to Wikidata without having to look up the item first
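A Resolver-style lookup can also be sketched with the MediaWiki search API and the CirrusSearch `haswbstatement` keyword, which finds the item holding a given external-ID statement; the function name and flow are illustrative, not Resolver's actual code:

```python
from urllib.parse import urlencode

# Sketch: build a Wikidata search-API URL that locates the item holding
# a given external-ID statement via the CirrusSearch `haswbstatement`
# keyword. Function name is illustrative.

def resolver_search_url(prop: str, external_id: str) -> str:
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{prop}={external_id}",
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)

# Example: find the item carrying VIAF ID 113230702 (P214).
url = resolver_search_url("P214", "113230702")
print(url)
```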
  • Referee http://magnusmanske.de/wordpress/?p=572
  • SourceMD (MetaData)/ORCIDator (currently offline)
    • import publications into Wikidata from PubMed, etc.
    • create and/or link authors via ORCID
    • Currently mostly offline, needs work
  • GeneDB (now defunct)
    • genome annotations of ~50 species
    • amalgamated various linked data sources (PubMed, UniProt, GO terms, etc.) into one website
    • Automatically updated using Wikidata on back-end
    • tool merged into a different application at a larger institute, but the data is still in Wikidata
  • Additional tools to use w/ Wikidata/linked data
  • Tech
    • Old: PHP (quick to write)
    • New: Rust (strongly typed, resource efficient, easy multi-threading, asynchronous, many modules available)
    • Tools already in Rust: PetScan, Listeria, AC2WD, QuickStatements (server-side batches), Mix’n’match (background tasks)

Questions

  • Q&A (Qs) / Comments (Cs)
    • Q1: Are the AC2WD sources expandable?
      • Yes, contact Magnus with requests; it will take a little time to write the code, but will be quick if the source is available in MARC
    • Q2: Sorry if I didn't understand correctly, SourceMD would import scientific publications as wikidata items with all core statements?
      • notetaker did not catch the answer -- group assistance requested!
    • Q3: Will QuickStatements using CSV be available to bulk create properties for a Wikibase instance?
      • Magnus did not write the CSV importer and would need to dig into it; it is on the “to-do” list
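For context on the batch format Q3 touches on: QuickStatements' plain-text V1 commands are tab-separated triples, and converting simple CSV rows into them can be sketched as follows (this shows the V1 command format, not the CSV importer discussed in the answer):

```python
import csv
import io

# Sketch: turn simple CSV rows (item, property, value) into
# QuickStatements V1 commands (tab-separated, one per line).
# The example rows use Q4115189, the Wikidata sandbox item.
csv_text = "Q4115189,P31,Q5\nQ4115189,P21,Q6581097\n"

commands = []
for item, prop, value in csv.reader(io.StringIO(csv_text)):
    commands.append(f"{item}\t{prop}\t{value}")

batch = "\n".join(commands)
print(batch)
```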
    • C1: putting an authorized access point into the description field doesn't seem terribly helpful or appropriate
    • Q4: what is your preferred way to be contacted when support is needed, i.e. if a Mix'n'Match catalog has errors and the catalog needs to be refreshed? I've found a few different methods but am never sure what is best to ensure it doesn't slip through.
    • Q5: What would you suggest to the Wikimedia Foundation to support you and other programmers?
      • use the tools, pull requests always welcome, if you want to be a co-maintainer on one of my tools, please come forward (even if just to re-start when it breaks down)
      • support from the Wikimedia Foundation has been good recently
    • Q6: Question about motivation that spurred Magnus to create so many great tools for the community.
      • He would start creating tools in response to doing tasks manually and wanting a more efficient way to do them.
      • Mix’n’match (for example) was initially a request from a user to match the Oxford Dictionary of National Biography to Wikipedia; Magnus opted to match to Wikidata instead, and to not limit it to just ODNB data!
    • Q7: Recently we had a speaker who is working on a new API for Wikidata. It seemed nice because it is simpler than the current MediaWiki API. However, it sounded like they were not necessarily supporting all of the features of the current API and that they might deprecate the current API. I found this alarming because it would break so many things if it were true. I assume it would also break many of your tools. Are you concerned about this?
      • Magnus is not currently using the new API; once it is in a state as good as the current one, he will look into migrating (he likes to stay on top of tech developments)
      • wbeditentity will not be going away anytime soon; both APIs will likely stay active until the new API becomes more robust and the old API falls out of usage
      • Comments/discussion from chat
        • Yes. There is quite a long deprecation period planned. And so far it's not at feature parity with the Action API
        • Please let Lydia Pintscher (User:Lydia_Pintscher_(WMDE)) know about things you're missing in the REST API or issues you run into
        • Comment from chat: I have absolutely no plans to port my own tools to the REST API, fwiw
    • Q8: What are you planning for the future?