Wikidata:WikiProject Wikidata for research/Meetups/2018-04-23-25-Antwerpen
Contributions to the "Wikidata for research" project (including Wikidata:WikiProject Wikidata for research and all its pages) are dual licensed under CC BY-SA 3.0 (the Wikimedia default) and the Creative Commons Attribution 4.0 license.
Contributions by the project to the item and project namespaces of Wikidata shall be under CC0.


On 23-25 April 2018, a "Workshop on harnessing open data for Monitoring and Evaluation" is taking place in Antwerp (Q12892), focused on using Wikibase (Q16354758) instances federated with Wikidata (Q2013) in the context of research assessment (Q51844619).
About
- In collaboration with the European Research Council (Q1377836), Gene Wiki (Q5531528), Rhizome (Q7320757), WikiCite (Q30035267), Wikibase Community User Group (Q51033881) and others, we are meeting in Antwerp (Q12892) to explore ways to create a landscape of Wikibase instances federated with Wikidata.
- The workshop aims to:
- Create a staging server for Wikidata
- Migrate existing Wikibase instances to a full Wikibase ecosystem (including WDQS)
- Use Wikibase for non-CC0 data
- hashtags: #wikibase and #WD4R (Wikidata for research)
- issue tracker on Phabricator: https://phabricator.wikimedia.org/tag/federated-wikibase-workshops/
Venue
Program
Draft schedule:
Day 1 April 23
The focus of the day is on learning about each other's activities around Wikibase.
13:30 – 15:00 Introductions & lightning talks
Formal introductions
- Mike Mugabushaka
- Andra and Daniel
- First part of a series of workshops
- Focus on Wikibase as infrastructure
- Focus for the next ones is on WikiCite
- Wikimedia Deutschland
- Rhizome (Lyndsey and Lozana)
Lightning talks by everyone part 1
- Guidelines
- Focus on things that have been done in the past
- Scoping the current landscape of open data infrastructure
- Suggest some concrete outcomes we could strive to achieve by Wednesday afternoon
- 3 min
- 3 slides max
- prepared ahead of time
- Put link in here
- Adam Shorland, Raz Shuty & Sandra Müllrick, Wikimedia Deutschland: A federated landscape of Wikibase
- Gregory Stupp & Andra Waagmeester, Andrew Su Lab, The Scripps Research Institute, San Diego, CA (USA): Gene Wiki Project
- Lyndsey Moulds, Rhizome (USA) & Lozana Rossenova, London South Bank University (UK)/Rhizome (USA): The Wikibase usecase for Rhizome’s ArtBase
- Susanna Ånäs, Open Knowledge Finland: Open Knowledge Finland
- Olaf Simons, Gotha Research Centre of the University of Erfurt: Wikidata:FactGrid, slides: FactGrid Project
- Rajaram Kaliyaperumal, Biosemantics group, LUMC, Leiden The Netherlands: Introduction
- Adam Shorland, Wikimedia Deutschland: Past present & future, Wikibase & Wikidata
- Nuno Nunes, Wikibase Workshop: Wikibase & Wikidata
15:00 - 15:30 coffee break (Q52145382)
15:30 - 18:00 Identifying problems and opportunities
Lightning talks part 2
Introduction to the afternoon session
- Daniel Mietchen
- Past & Present:
- Future:
- Registry of Wikibase instances
- Basic process for further standardization across Wikibase instances
- Timeline of Wikibase federation
Procedure
- Every participant picks at least two topics
- Ideally, every topic should be worked on by two people
- Additional topics may be added, but these will not count towards everyone’s two
Ideas
- Can we set up a GitHub repo or Phabricator group for coordination of this workshop?
- What about setting up a Wikimedia-supported Wikibase instance for registering Wikibase instances? — Daniel
- Adam to give a demo on how to install Wikibase, using the registry Wikibase as an example
- Phabricator ticket
- Sub-question How can we discover that there is more information about an item (on Wikidata), on external wikibases?
- Beacon?
- 'Same As' Property
- Phabricator ticket
- (How) Can we use different triple stores for the same or separate Wikibase instances? — Rajaram
- Can Wikibase be configured to work with any existing ontology/knowledge model?
- Yes, within the constraints of the Wikibase data model of items and properties
- Not beyond
- Sub-question Can wikibase repository and wikibase client be used independently? — Nuno
- Yes
- Should Wikidata properties be known to new Wikibase instances by default?
- Yes, similar to InstantCommons
- Phabricator ticket
- Syncing local properties with properties from another Wikibase (Wikidata by default)
- Same as …..
- Equivalent property: URIs
- E.g.: subclass of (P279): equivalent property: "http://www.w3.org/2000/01/rdf-schema#subClassOf"
- My subclass of: equiv prop:
- It is possible with backup and restore.
- Creating separate Wikidata property namespace that is automatically synced from Wikidata
- Can we use Wikidata properties directly?
- How do we link PIDs and QIDs that are the same between Wikidata and local wikibase?
- NEEDS TICKET NUMBER
- Wikidata Integrator
- WDI needs to support non-Wikidata Wikibases
- Distribute WDI run between Wikidata and any Wikibase
- Discovery of skos:exactMatch links between Wikidata and Wikibase properties and items to allow generalization of bots
- Todo: Greg will give a presentation about it to those who want to learn how
- Phabricator ticket
- Setting up staging server to run tests using ShEx slurpers — Andra
- Supporting a local file/image commons repository instead of Wikimedia commons
- Use Wikibase as a backend (e.g. to convert to and from RDF) — Nuno, Mike
- Importing RDF files to the blazegraph — Rajaram
- Shall we set up a mailing list or something for the Wikibase community?
- Discovery of skos:exactMatch on a given instance
- WDI
- WDQS autocompletion (Ticket)
- Equivalent property
- Equivalent item
- Can I run a Wikidata-like infrastructure with slightly different content?
- In terms of hardware
- In terms of configuration
- Some sort of query interface wrapper/helper that can simplify the property -> skos:exactMatch lookups across Wikibases to facilitate writing more legible federated queries
- Autocomplete on WDQS working for Wikidata items and local items
- How can the Wikibase docker include all the infrastructure for the SPARQL endpoint, including limiting long queries, parallelizing over many nodes, etc?
- Scalability
- Can Wikibase handle billions of items?
- E.g.: star catalogues, SNPs
- Can I put a million statements on an item?
- Multiple synchronous queries?
- Different options for deploying / containerization
- Perhaps defaulting to wmflabs
- Open Stack
- Amazon EC2
- LXD :)
- Parallelization, scaling, multiple query services?
- Phabricator ticket
- Would it make sense to prohibit non-Wikidata Wikibase instances from using P, Q, L etc. as used in Wikidata identifiers?
- Olaf: It’s not necessarily about “prohibiting” Q and P, we want to avoid confusion
- no
- Mirroring of a given set of Wikibases
- Paralympics use case
- Ideas for a pre-configuration of the elements of the Docker container
- Having WDQS working on other SPARQL endpoints (e.g. Virtuoso, neptune, etc) — Rajaram
- Working with dumps vs live data vs LDF
- Potentially use ShEx to create dumps to work on
- Performance
- Working on the API/SPARQL vs. working directly on the Wikibase key-value store vs. dumps
- Phabricator ticket
- Common features expected for single-Wikibase frontends, e.g.:
- OAuth
- Visualizing results from WDQS
- Understanding ShEx
- Formatter URLs
- Phabricator ticket
- Common features for cross-Wikibase frontends
- Pulling data from multiple Wikibases
- Pushing data to multiple Wikibases
- Coordination of authentication mechanisms
- Graph validation across Wikibases
- Phabricator ticket
- Limits to indexing full-text documents — Mike
- Adding meaningful MediaWiki extensions via a config file or a web interface
- We should have (interactive) forms that help in direct editing — Olaf
- The structured CV for Wikipedia users all around the globe (the person’s names, genealogy, work positions, places lived, places visited, contacts such as correspondences and personal meetings we know of, etc.); see https://database.factgrid.de/wiki/Erfassung_biographischer_Information The forms have to be modular: you can get the specific form if this is interesting and fill in what you know.
- a form for people who sit in an archive and evaluate a document - it offers fields for metadata and transcripts…
- Possible existing example / reference: www.Wikigenomes.org, http://wikigenomes.org/organism/85962/gene/HP0001/authorized/
- Phabricator ticket
- Migrate from non-Docker Wikibase install to Docker Wikibase, while maintaining content integrity.
- How does uncertainty around Blazegraph affect the future development of WDQS for Wikibases?
- Should OSM tags be stored in Wikibase?
- Additional problems:
- Slide 29
- Several tasks added here: Phabricator ticket
- Get Histropedia working on Wikibases
- “Un-hide all references” functionality
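Several of the ideas above (the 'Same As' property, equivalent properties, and a helper for skos:exactMatch lookups across Wikibases) come down to writing a federated query from a local Wikibase to Wikidata. The following is only a minimal sketch of what such a helper might look like. It assumes that the local Wikibase exposes its mappings as skos:exactMatch triples pointing at Wikidata entity URIs, which the notes discuss but do not confirm for any instance. The function name `build_exact_match_query` is hypothetical; the Wikidata SPARQL endpoint URL and the skos/rdfs namespaces are the real ones.

```python
import re

# Real Wikidata query service endpoint and entity URI prefix.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def build_exact_match_query(language: str = "en", limit: int = 100) -> str:
    """Build a SPARQL query, to be run on a LOCAL Wikibase query service.

    Assumption (not confirmed by the workshop notes): local entities are
    linked to Wikidata entities with skos:exactMatch.
    The query fetches the Wikidata label of each matched entity through a
    federated SERVICE call.
    """
    # Reject anything that is not a plain language code, so that the value
    # can be safely embedded in the query text.
    if not re.fullmatch(r"[A-Za-z]{2,3}(-[A-Za-z0-9]+)*", language):
        raise ValueError(f"not a valid language code: {language!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?local ?wikidata ?label WHERE {\n"
        "  ?local skos:exactMatch ?wikidata .\n"
        f'  FILTER(STRSTARTS(STR(?wikidata), "{WIKIDATA_ENTITY_PREFIX}"))\n'
        f"  SERVICE <{WIKIDATA_SPARQL}> {{\n"
        "    ?wikidata rdfs:label ?label .\n"
        f'    FILTER(LANG(?label) = "{language}")\n'
        "  }\n"
        "}\n"
        f"LIMIT {limit}\n"
    )


if __name__ == "__main__":
    print(build_exact_match_query("en", 10))
```

The helper only builds the query text; sending it to a query service is left to whatever client a given Wikibase uses. Whether a local query service is allowed to call out to Wikidata via SERVICE depends on its configuration.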
Input for New York Workshop
- Representing historic places — Susanna
- Representing digital & performative art — Lozana
- Thinking through what a user interface on top of a Wikibase backend can look like – what are user needs / goals / expectations – how can we bring new experiences to users that were not possible before with conventional databases.
End of the group discussions on 23.04.2018
- Adam describes his vision for a future where Wikibase instances can be easily spun off – Phabricator ticket
Day 2 April 24
The focus of the day is on getting things done.
Morning session: 9am–1pm
Demos of various tools and workflows by different workshop participants
- Demo by Adam: Wikibase Docker Container
- Demo by Greg on WikidataIntegrator
- Greg's WikidataIntegrator minimal bot setup and demo
- Fatameh – a tool to add items to Wikidata about papers with content generated from places like EuropePMC, PubMed and Crossref.
- Demo of GLAMpipe by Susanna: http://demo.glampipe.org/
- Sandra and Adam shared: Structured Data for Commons presentation
Afternoon sessions: 13:00–18:00
Participants split into a few working groups to work on different issues raised on Day 1
Session 1: 13:00-14:00
Working group I: Docker image configuration
- Raz
Working group II: Different options for deployment / containerization of Wikibase
- Andra / Lyndsey
Working group III: UX / UI issues related to single Wikibase instances and WDQS / SPARQL query builders
- Lozana and Susanna
Session 2: 14:00-15:00
Working group I: Docker image configuration
- Raz
Working group II: Different options for deployment / containerization of Wikibase
- Andra / Lyndsey
- https://github.com/RazShutyWMDE/blubber
Working group III: UX / UI issues related to single Wikibase instances and WDQS / SPARQL query builders (cont.)
- Lozana and Susanna
- https://phabricator.wikimedia.org/T192878
Session 4: 15:00-16:00
Working group I:
Working group II:
Working group III: UX / UI issues – discussion moved towards issues specific to cross-Wikibase federation (cont.)
- Daniel, Lozana and Susanna (with Greg and Nuno joining in)
- Phabricator tickets
Session 5: 16:00-17:00
Working group I:
Working group II:
Working group III: UX / UI issues – discussion moved towards issues specific to cross-Wikibase federation (cont.)
- Daniel, Lozana and Susanna
Session 6: 17:00-18:00
Presentations from all working groups
Day 3, April 25th
This is a Google Doc dump – needs editing
- Finishing up, writing documentation, and preparing for and embedding the results in future events
Start at 9:00
Open issues
- Discuss the unresolved comments in this doc from the Intro and Days 1 & 2
- Load some datasets into some Wikibase instances
- Wikibase registry
- Start a discussion on deprecating rather than deleting Wikidata properties
https://phabricator.wikimedia.org/T193009
- Create tickets for upcoming events?
- Wikimedia Hackathon
- Berlin workshop
- Ca. 4 places left
- New York workshop
- Wikimania Hackathon
- Open Citations workshop
- SWAT4HCLS
- Think about publishing a formal workshop report
- Group photo at lunch
- Review Phabricator tickets
- Transfer content from here to the wiki page at https://www.wikidata.org/wiki/Wikidata:WikiProject_Wikidata_for_research/Meetups/2018-04-23-25-Antwerpen
Transfer tickets from paper
- Signed statements for properties (Daniel)
- Diffs at snak level (Daniel)
- How to avoid missing QIDs? (Daniel)
- Davy has a wrapper around the Crossref API (via Mike)
- Not mine but also worth looking at https://pypi.org/project/crossrefapi/1.0.3/
- The code will be put in the following repo https://github.com/DavyCielen/crossrefclient
- How to handle passwords in PAWS notebooks? (Daniel)
- Adapt WDI to split large SPARQL queries into smaller ones (probably by way of OFFSET) (Daniel)
- Adapt WDI to allow to set up missing properties in a Wikibase, perhaps based on a SPARQL query federated with Wikidata, the Wikibase registry and other Wikibases (Daniel)
- Should we have a WDI issue tracker on Phabricator?
- Adapt WDI to use the last updated triple to check for lag (Daniel)
- Customize search for a given Wikibase (Daniel)
- Create a home for Wikibase Community User Group on Wikidata, modeled after https://meta.wikimedia.org/wiki/Wikibase_Community_User_Group (Daniel)
- Create a home for Wikibase Community User Group on the Wikibase registry, modeled after https://meta.wikimedia.org/wiki/Wikibase_Community_User_Group (Daniel)
- How can Wikibase become a FAIR data platform? (Daniel)
- How can the WDQS of a given Wikibase be decoupled from MediaWiki for things like autocompletion? (Nuno)
- In pre-configuration, add option to include the property suggester into the Wikibase Docker image. (Daniel)
- Define best practices for Wikibase communities, e.g. in terms of deleting items or properties. (Daniel)
- Define communication plan
- Clean up the Google doc
- Transfer stuff from Google doc to Wiki page at https://www.wikidata.org/wiki/Wikidata:WikiProject_Wikidata_for_research/Meetups/2018-04-23-25-Antwerpen
- Complete the workshop’s entry in https://www.wikidata.org/wiki/Wikidata:Status_updates/Next#Events
- Write a blog post
- Write a formal workshop report
- Create properties in the Wikibase registry to reflect the parameters used in the pre-configuration of the Wikibase Docker
- How does the GDPR affect Wikimedia sites and Wikibase instances?
- Logo for federated Wikibase ecosystem
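One of the tickets above proposes adapting WDI to split large SPARQL queries into smaller ones, probably by way of OFFSET. The sketch below illustrates that general paging idea only; it is not WDI code and does not use the WDI API. The names `paged_query` and `run_query` are hypothetical: `run_query` stands for any function that sends a query string to a SPARQL endpoint and returns the result rows. The sketch assumes the base query has no LIMIT or OFFSET of its own and ends with an ORDER BY clause, since paging without a stable order can skip or repeat rows.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, str]


def paged_query(
    base_query: str,
    run_query: Callable[[str], List[Row]],
    page_size: int = 1000,
) -> Iterator[Row]:
    """Yield all rows of base_query, fetched page by page.

    base_query: a SELECT query without LIMIT/OFFSET, ideally with ORDER BY.
    run_query:  hypothetical callable that executes a query string and
                returns a list of rows (one dict per result row).
    """
    if page_size < 1:
        raise ValueError("page_size must be a positive integer")

    offset = 0
    while True:
        # Append the paging clause for the current window.
        page = run_query(f"{base_query}\nLIMIT {page_size} OFFSET {offset}")
        yield from page
        # A short (or empty) page means the result set is exhausted.
        if len(page) < page_size:
            break
        offset += page_size
```

Because the function is a generator, a bot could start processing the first page before later pages are requested. Each page is still evaluated as a separate query by the endpoint, so this keeps individual responses small but does not necessarily reduce the total load.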
Vote on deleting properties
- The Wikibase community as represented at the Antwerp workshop on 23-25 April 2018 has voted on the following
- Text to be voted on
- Properties on Wikibase instances registered in the Wikibase registry should not be deleted. When the data model on a given Wikibase instance changes, the respective properties should be deprecated instead, and contain pointers to the properties replacing them.
- Voting options
- Agree: 8
- Not sure: 4
- Disagree: 0
- Abstain: 2
- Comments:
- A property may change so much that it is no longer what it was previously.
- Yes, but then it might be used on another Wikibase in its original meaning
- Adam: I’m against doing / trying to enforce / make wikibases do this currently. Although with the right technical solution backing this I am 100% for it. The technical solution could be something like a soft redirect and official deprecation (which removes it from property lookup & suggestion on a wikibase) etc.
End around 16:00
New place to aggregate and discuss small tasks / questions going forward
- A place where you can ask questions about MediaWiki, Wikimedia, Wikibase, Wikidata etc.: https://discourse-mediawiki.wmflabs.org/