Wikidata:WikiProject Wikidata for research/Meetups/2018-04-23-25-Antwerpen

Contributions to the "Wikidata for research" project (including Wikidata:WikiProject Wikidata for research and all its pages) are dual licensed under CC BY-SA 3.0 (the Wikimedia default) and the Creative Commons Attribution 4.0 license.
Contributions by the project to the item and project namespaces of Wikidata shall be under CC0.

An impression from the meeting
Wikibase instances indexed in the Wikibase registry that was set up during the workshop

On 23-25 April 2018, a "Workshop on harnessing open data for Monitoring and Evaluation" took place in Antwerp (Q12892), focused on using Wikibase (Q16354758) instances federated with Wikidata (Q2013) in the context of research assessment (Q51844619).




Draft schedule:

Day 1, April 23

The focus of the day is on learning about each other's activities around Wikibase.

13:30 – 15:00 Introductions & lightning talks

Formal introductions
Lightning talks by everyone part 1

15:00 - 15:30 coffee break (Q52145382)

15:30 - 18:00 Identifying problems and opportunities

Lightning talks part 2
Introduction to the afternoon session
  • Every participant picks at least two topics
  • Ideally, every topic should be worked on by two people
  • Additional topics may be added, but these will not count towards everyone’s two
  • Sub-question: How can we discover that there is more information about an item (on Wikidata) on external Wikibases?
  • (How) Can we use different triple stores for the same or separate Wikibase instances? — Rajaram
  • Can Wikibase be configured to work with any existing ontology/knowledge model?
    • Yes, within the constraints of the Wikibase data model of items and properties
    • Not beyond
  • Sub-question Can wikibase repository and wikibase client be used independently? — Nuno
    • Yes
  • Should Wikidata properties be known to new Wikibase instances by default?
  • Syncing local properties with properties from another Wikibase (Wikidata by default)
    • Same as …..
    • Equivalent property: URIs
    • E.g.: subclass of (P279): equivalent property: ""
    • My subclass of: equiv prop:
    • It is possible with backup and restore.
    • Creating separate Wikidata property namespace that is automatically synced from Wikidata
    • Can we use Wikidata properties directly?
    • How do we link PIDs and QIDs that are the same between Wikidata and local wikibase?
  • Wikidata Integrator
    • WDI needs to support non-Wikidata Wikibases
    • Distribute WDI run between Wikidata and any Wikibase
    • Discovery of skos:exactMatch links between Wikidata and Wikibase properties and items to allow generalization of bots
    • Todo: Greg will give a presentation about it to those who want to learn how to
    • Phabricator ticket
  • Supporting a local file/image commons repository instead of Wikimedia Commons
  • Use Wikibase as a backend (e.g. to convert to and from RDF) — Nuno, Mike
  • Shall we set up a mailing list or something for the Wikibase community?
  • Can I run a Wikidata-like infrastructure with slightly different content?
    • In terms of hardware
    • In terms of configuration
  • Some sort of query interface wrapper/helper that can simplify the property -> skos:exactMatch lookups across wikibases to facilitate writing more legible federated queries
    • Autocomplete on WDQS working for Wikidata items and local items
  • How can the Wikibase docker include all the infrastructure for the SPARQL endpoint, including limiting long queries, parallelizing over many nodes, etc?
  • Scalability
    • Can Wikibase handle billions of items?
    • E.g.: star catalogues, SNPs
    • Can I put a million statements on an item?
    • Multiple synchronous queries?
  • Different options for deploying / containerization
    • Perhaps defaulting to wmflabs
    • OpenStack
    • Amazon EC2
    • LXD :)
    • Parallelization, scaling, multiple query services?
    • Phabricator ticket
  • Would it make sense to prohibit non-Wikidata Wikibase instances from using P, Q, L etc. as used in Wikidata identifiers?
    • Olaf: It’s not necessarily about “prohibiting” Q and P, we want to avoid confusion
    • no
  • Mirroring of a given set of Wikibases
    • Paralympics use case
  • Having WDQS working on other SPARQL endpoints (e.g. Virtuoso, neptune, etc) — Rajaram
  • Working with dumps vs live data vs LDF
    • Potentially use ShEx to create dumps to work on
  • Performance
    • Working on the API/SPARQL vs. working directly on the Wikibase key-value store vs. dumps
    • Phabricator ticket
  • Common features expected for single-Wikibase frontends, e.g.:
  • Common features for cross-Wikibase frontends
    • Pulling data from multiple Wikibases
    • Pushing data to multiple Wikibases
    • Coordination of authentication mechanisms
    • Graph validation across Wikibases
    • Phabricator ticket
  • Adding meaningful MediaWiki extensions via a config file or a web interface
  • Migrate from non-Docker Wikibase install to Docker Wikibase, while maintaining content integrity.
  • How does uncertainty around Blazegraph affect the future development of WDQS for Wikibases?
  • Get Histropedia working on Wikibases
  • “Un-hide all references” functionality
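
The query helper raised in the list above (simplifying property to skos:exactMatch lookups across Wikibases) could start from something like the following minimal sketch. It only builds the lookup query and reads a standard SPARQL JSON result. It assumes, purely for illustration, that a local Wikibase exposes its mappings as skos:exactMatch triples pointing at Wikidata entity URIs; the function names are hypothetical and not part of any existing tool.

```python
import re

# Assumption: mappings are stored as skos:exactMatch triples whose object
# is the Wikidata entity URI. A given Wikibase may model this differently.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def exact_match_lookup_query(wikidata_id: str) -> str:
    """Build a SPARQL query that finds local entities mapped to a Wikidata ID."""
    # Accept only well-formed property or item IDs such as P279 or Q2013.
    if not re.fullmatch(r"[PQ][1-9]\d*", wikidata_id):
        raise ValueError(f"not a Wikidata P/Q identifier: {wikidata_id!r}")
    return (
        "SELECT ?local WHERE {\n"
        f"  ?local <{SKOS_EXACT_MATCH}> <{WIKIDATA_ENTITY_PREFIX}{wikidata_id}> .\n"
        "}"
    )


def local_uris_from_results(results: dict) -> list:
    """Extract the ?local URIs from a SPARQL 1.1 JSON results document."""
    return [row["local"]["value"] for row in results["results"]["bindings"]]
```

The query string would be sent to the SPARQL endpoint of the local Wikibase; keeping the query construction separate from the HTTP call makes the helper easy to reuse from bots or from a query interface wrapper.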
Input for New York Workshop
  • Representing historic places — Susanna
  • Representing digital & performative art — Lozana
  • Thinking through what a user interface on top of a Wikibase backend can look like – what are user needs / goals / expectations – how can we bring new experiences to users that were not possible before with conventional databases.
End of the group discussions on 23.04.2018
  • Adams describes his vision for a future where Wikibase instances can be easily spun off – Phabricator ticket

Day 2, April 24

The focus of the day is on getting things done.

Morning session: 9am–1pm

Demos of various tools and workflows by different workshop participants

  • Fatameh – a tool to add items to Wikidata about papers with content generated from places like EuropePMC, PubMed and Crossref.

Afternoon sessions: 13:00–18:00

Participants split into a few groups to work on the different issues raised on Day 1

Session 1: 13:00-14:00
Working group I: Docker image configuration
  • Raz
Working group II: Different options for deployment / containerization of Wikibase
  • Andra / Lyndsey
Working group III: UX / UI issues related to single Wikibase instances and WDQS / SPARQL query builders
  • Lozana and Susanna

Session 2: 14:00-15:00
Working group I: Docker image configuration
  • Raz
Working group II: Different options for deployment / containerization of Wikibase
Working group III: UX / UI issues related to single Wikibase instances and WDQS / SPARQL query builders (cont.)

Session 4: 15:00-16:00
Working group I:
Working group II:
Working group III: UX / UI issues – discussion moved towards issues specific to cross-Wikibase federation (cont.)

Session 5: 16:00-17:00
Working group I:
Working group II:
Working group III: UX / UI issues – discussion moved towards issues specific to cross-Wikibase federation (cont.)
  • Daniel, Lozana and Susanna

Session 6: 17:00-18:00

Presentations from all working groups

Day 3, April 25th

This is a Google Doc dump that still needs editing

  • Finish up, write documentation, and prepare for (or embed the results in) future events

Start at 9:00

Open issues

  • Discuss the unsolved comments here in this doc from Intro and Days 1 & 2
  • Load some datasets into some Wikibase instances

  • Wikibase registry
  • Start a discussion on deprecating rather than deleting Wikidata properties

Transfer tickets from paper

  • Signed statements for properties (Daniel)
  • Diffs at snak level (Daniel)
  • How to avoid missing QIDs? (Daniel)
  • Davy has a wrapper around the Crossref API (via Mike)
  • How to handle passwords in PAWS notebooks? (Daniel)
  • Adapt WDI to split large SPARQL queries into smaller ones (probably by way of OFFSET) (Daniel)
  • Adapt WDI to allow to set up missing properties in a Wikibase, perhaps based on a SPARQL query federated with Wikidata, the Wikibase registry and other Wikibases (Daniel)
  • Should we have a WDI issue tracker on Phabricator?
  • Adapt WDI to use the last updated triple to check for lag (Daniel)
  • Customize search for a given Wikibase (Daniel)
  • Create a home for Wikibase Community User Group on Wikidata, modeled after (Daniel)
  • Create a home for Wikibase Community User Group on the Wikibase registry, modeled after (Daniel)
  • How can Wikibase become a FAIR data platform? (Daniel)
  • How can the WDQS of a given Wikibase be decoupled from MediaWiki for things like autocompletion? (Nuno)
  • In pre-configuration, add option to include the property suggester into the Wikibase Docker image. (Daniel)
  • Define best practices for Wikibase communities, e.g. in terms of deleting items or properties. (Daniel)
  • Define communication plan
  • Create properties in the Wikibase registry to reflect the parameters used in the pre-configuration of the Wikibase Docker
  • How does the GDPR affect Wikimedia sites and Wikibase instances?
  • Logo for federated Wikibase ecosystem
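
The ticket above about splitting large SPARQL queries into smaller ones by way of OFFSET can be illustrated with a minimal sketch. This is not WDI code: `paged_rows` and its `run_query` parameter are hypothetical names, and the caller is assumed to supply a function that sends a query string to an endpoint and returns a list of result rows. The base query is assumed to contain an ORDER BY clause (so that pages are stable) and no LIMIT or OFFSET of its own.

```python
from typing import Callable, Dict, Iterator, List


def paged_rows(
    base_query: str,
    page_size: int,
    run_query: Callable[[str], List[Dict]],
) -> Iterator[Dict]:
    """Yield all rows of a query by fetching it in LIMIT/OFFSET pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        # Append the paging clause to the otherwise unchanged query.
        rows = run_query(f"{base_query}\nLIMIT {page_size} OFFSET {offset}")
        yield from rows
        # A short (or empty) page means the result set is exhausted.
        if len(rows) < page_size:
            break
        offset += page_size
```

Passing the query runner in as a parameter keeps the paging logic independent of any particular endpoint or HTTP library, so the same loop could be pointed at Wikidata or at any other Wikibase query service.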

Vote on deleting properties

  • The Wikibase community as represented at the Antwerp workshop on 23-25 April 2018 has voted on the following
    • Text to be voted on
      • Properties on Wikibase instances registered in the Wikibase registry should not be deleted. When the data model on a given Wikibase instance changes, the respective properties should be deprecated instead, and contain pointers to the properties replacing them.
    • Voting options
      • Agree
        • 8
      • Not sure
        • 4
      • Disagree
        • 0
      • Abstain
        • 2
    • Comments:
      • A property may change so much that it is no longer what it was previously.
        • Yes, but then it might still be used on other Wikibases in its original meaning
      • Adam: I’m against doing / trying to enforce / make wikibases do this currently. Although with the right technical solution backing this I am 100% for it. The technical solution could be something like a soft redirect and official deprecation (which removes it from property lookup & suggestion on a wikibase) etc.

End around 16:00

New place to aggregate and discuss small tasks / questions going forward

Post-workshop documentation

See also