Wikidata:WikiProject LD4 Wikidata Affinity Group/Affinity Group Calls/Meeting Notes/2021-04-06
Call Details
- Date: 2021-04-06
- Topic: GaNCH: Using Wikidata for Georgia's Natural, Cultural and Historic Organizations' Disaster Response
- Speaker: Cliff Landis, Atlanta University Center Robert W. Woodruff Library
Presentation materials
Meeting Notes
GaNCH: Using Wikidata for Georgia's Natural, Cultural, and Historic Organizations' Disaster Response, with Cliff Landis
- Project Members:
- Cliff Landis (Digital Initiatives Librarian, Atlanta University Center Robert W. Woodruff Library)
- Christine Wiseman (Assistant Director of Digital Services at the Atlanta University Center Robert W. Woodruff Library)
- Allyson F. Smith (Graduate Assistant, Atlanta University Center Robert W. Woodruff Library)
- Matthew Stephens (Web Developer, in collaboration with Atlanta University Center Robert W. Woodruff Library)
- Article: https://journal.code4lib.org/articles/15576
- Website: https://ganch.auctr.edu/
- GaNCH Data GitHub Repo: https://github.com/AUC-Woodruff-Library/GaNCH-data
- GaNCH Website GitHub Repo: https://github.com/AUC-Woodruff-Library/GaNCH-website
- Today’s Slides: https://docs.google.com/presentation/d/1tBA-9UIYicPEHFdrpYbJJfEV_45xwfsv3Cxpk8id4l4/edit?usp=sharing
- Notes:
- Goal: create a publicly editable directory of Georgia’s natural, cultural, and historic (NCH) organizations.
- Funded by a Lyrasis Catalyst grant
- Background:
- New statewide planning initiatives: Georgia Heritage Responders Training & GaNCH.
- Georgia is experiencing natural disasters. 2020 was the most active Atlantic hurricane season on record.
- How would cultural heritage disaster responders find places impacted by natural disasters in real time?
- Several directories available, 1500+ organizations
- Wikidata provided flexibility for working with data about these organizations that could be easily accessed and updated.
- From design to workflow
- Initial test to scrape data from web directory, add GIS coordinates, upload to Wikidata.
- Data modeling focused on contact info, addresses, social media links.
- Some desired data elements were deemed out of scope.
- 2018: an initial query found 40 GLAMs in Wikidata recorded as located in Georgia; different queries surfaced more (data inconsistency).
- Identified hidden GLAM orgs to add to Wikidata.
- Ultimately, representing 1900 institutions in Wikidata
- Used OpenRefine, Visual Studio Code, GitHub
- VS Code to OpenRefine (Wikidata reconciliation)
- Three values recorded for every statement: the statement value, a reference URL, and a retrieved date.
- Captured reference links using IA Wayback Machine.
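The three-part pattern above (statement value, reference URL, retrieved date) can be sketched as a batch-upload row. This is a minimal illustration only: the notes say the project used OpenRefine, so the QuickStatements V1 format here is an assumed alternative, and the sandbox item, example website, and archived URL are placeholders rather than GaNCH data.

```python
from datetime import date


def quickstatements_row(qid, prop, value, reference_url, retrieved):
    """Build one tab-separated QuickStatements V1 line for a string or URL value.

    S854 is the source form of P854 (reference URL) and S813 the source form
    of P813 (retrieved). The "/11" suffix marks day precision.
    """
    retrieved_value = f"+{retrieved.isoformat()}T00:00:00Z/11"
    return "\t".join(
        [qid, prop, f'"{value}"', "S854", f'"{reference_url}"', "S813", retrieved_value]
    )


# Placeholder values: Q4115189 is the Wikidata Sandbox item, P856 is "official website".
row = quickstatements_row(
    "Q4115189",
    "P856",
    "https://example.org/",
    "https://web.archive.org/web/20200501000000/https://example.org/",
    date(2020, 5, 1),
)
print(row)
```

Pointing the reference URL at an archived snapshot means the citation still resolves if the organization's own site later changes or disappears.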
- Website mockup using MockFlow to create website wireframes.
- Saved copy of dataset to website. Nightly queries to update data.
- Map and table to represent data on the website. Mobile friendly, end users can export data as well.
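A nightly query that refreshes a saved copy of the dataset might look roughly like the sketch below. It is not the GaNCH website's actual code (see the GitHub repos listed above for that): the query, the file name, and the function names are all assumptions made for illustration, and the query is restricted to museums only to keep the example small.

```python
import json
import os
import tempfile
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical query: museums (Q33506) located in Georgia, the U.S. state (Q1428).
QUERY = """
SELECT DISTINCT ?org ?orgLabel WHERE {
  ?org wdt:P31/wdt:P279* wd:Q33506 ;
       wdt:P131* wd:Q1428 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def fetch_bindings(query, endpoint=ENDPOINT):
    """Run a SPARQL query and return the raw JSON result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "ganch-snapshot-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]


def flatten(bindings):
    """Reduce each binding to a plain {variable: value} dict."""
    return [{name: cell["value"] for name, cell in row.items()} for row in bindings]


def overwrite_snapshot(rows, path):
    """Replace the saved copy in one step so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(rows, handle, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)


def nightly_update(path="organizations.json"):
    """Intended to be called once per night by a scheduler such as cron."""
    overwrite_snapshot(flatten(fetch_bindings(QUERY)), path)
```

Writing to a temporary file and then renaming it keeps the previous copy in place if the query fails partway, which matters when the site serves the snapshot directly.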
- Sustainability plan developed for ongoing maintenance of the directory.
- Partner organizations providing support
- Reminder email built into the website
- Option for orgs to provide updates to their information via Google form.
- Also, Cliff has Google alerts set up to help maintain the data set.
- Procedures documented on GitHub.
- Considerations
- Duplicates and name variations documented as encountered
- Identifying dissolved organizations, expired domains, national organizations, orgs that have moved out of Georgia
- Be aware of historical triggers. Made an effort to capture all organizations regardless of perspectives that they represent.
- Challenge with data model for municipalities and counties where Georgia’s municipality to county relationships don’t fit the hierarchical Wikidata data model.
- Decided to go against best practices for P131 (located in the administrative territorial entity) while consensus is outstanding in the Wikidata community.
- Cliff Landis and the Watchlist of Horror! An administrator bot removed P131 data, which broke some of the queries. Still in progress.
- Challenges working with a volunteer community in terms of developing consensus for property changes.
- Still believes that the benefits outweigh the challenges.
- Possible to split county and municipality between P131 and P276 (location)?
- Query Maintenance
- Sometimes, things will break
- SPARQL queries don’t automatically redirect when items are merged.
- Opportunities to work with additional publicly available data. For example: added visit counts to Georgia Public Libraries
- Cliff is checking his Wikidata watch list every day for undesirable updates
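One way to cope with merged items in saved queries is to look up redirects explicitly and rewrite stored item IDs before querying. The sketch below assumes that the Wikidata Query Service exposes a merged item as an `owl:sameAs` link from the old ID to the new one; the function names and the example IDs are hypothetical.

```python
def redirect_query(qids):
    """Build a SPARQL query that lists redirect targets for the given item IDs."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?old ?new WHERE { VALUES ?old { "
        + values
        + " } ?old owl:sameAs ?new . }"
    )


def apply_redirects(qids, redirects):
    """Follow old -> new mappings and drop duplicates, preserving order."""
    resolved = []
    for qid in qids:
        seen = set()
        # Follow chains of merges, guarding against accidental cycles.
        while qid in redirects and qid not in seen:
            seen.add(qid)
            qid = redirects[qid]
        if qid not in resolved:
            resolved.append(qid)
    return resolved


# Hypothetical example: Q1 was merged into Q3, which is also in the stored list.
stored = ["Q1", "Q2", "Q3"]
print(redirect_query(stored))
print(apply_redirects(stored, {"Q1": "Q3"}))
```

Because a merge can leave both the old and the new ID in a stored list, the helper also removes the resulting duplicate, which is the same effect the `DISTINCT` keyword has inside a query.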
- GaNCH in action:
- HERA email blast ahead of Hurricane Sally
- Learned from email bounce rate
- Able to follow up with orgs to update contact information
- Recent severe storms:
- Pre-emptive HERA email blast
- Appreciative response from organizations
- Questions:
- Can you add “distinct” to your select query in the QS to address the multiples?
- Selling Wikidata as part of the grant process or as the data source for the project in general.
- Lyrasis Catalyst grant great for this, inherently experimental
- Concern in the community re: having a publicly-editable database (vandalism, etc.)
- Are the nightly downloads saved and versioned internally?
- Doing a download and overwrite each night.
- Is institutional failure one of the “disasters” that GaNCH wants to be aware of and responsive to?
- Are there other datasets that define county boundaries that could be used?
- Are other states developing similar datasets?