Wikidata:WikiProject PCC Wikidata Pilot/Smithsonian Libraries/Projects/Artists Files


Aim and Scope

On Wikidata project list (P5008) Q105757729 Art and artists' Files

  • Pilot project to expand local database of artists to a linked environment.

Background

The Smithsonian Libraries' Art and Artist Files (AAF) are an exceptionally valuable resource for art historical research on emerging regional and local artists, and are often the only obtainable source of information on those artists. Spread over seven branches, the vertical files contain information on artists, art collectives, galleries, and museums from around the world, but primarily from North America and Africa. The names of more than 60,000 artists and arts organizations (galleries, museums, etc.) can be found in our in-house database: https://library.si.edu/art-and-artist-files

Smithsonian Libraries and Archives branches with Artist Files:

  • Smithsonian American Art and Portrait Gallery Library Q98665879
  • Anacostia Community Museum Library Q106205264
  • Cooper Hewitt, Smithsonian Design Museum Library Q106205286
  • Freer Gallery of Art and Arthur M. Sackler Gallery Library Q106205309
  • Hirshhorn Museum and Sculpture Garden Library Q106205479
  • Vine Deloria, Jr. Library, National Museum of the American Indian Q106205410
  • Warren M. Robbins Library, National Museum of African Art Q106205419

Timeline

  • Spring/Summer 2020 - Form an internal committee to investigate and draft Wikidata workflows and possibilities.
  • Fall 2020 - Create and test workflows with a pilot set of names derived from the Art & Artist File (AAF) database that matched SAAM's Artist Names list (pilot list = 3,797 names). Workflows include using OpenRefine to match existing Wikidata QIDs and pull in other standard identifiers (VIAF, LCNAF, ARTIC, ULAN, etc.). Draft Base, Core, and Extended Properties for creating new QIDs. The final product of the pilot will be a matched list, a list of new QIDs to be created, and a tested workflow; staff will also build skills with new software.
  • Spring 2021 - Draft a workflow to create new Wikidata QIDs for artists found during the pilot to lack them, using OpenRefine and/or QuickStatements to create the new items with Core+ properties. Refine the pilot workflow to fit the next phase: ~60,000 names to be reconciled from the existing AAF database.
  • Early Summer 2021 - Implement new workflows to reconcile a smaller subset of names from the existing AAF database. The revised final product will be a fully matched set from the pilot list, including all new QIDs created for this group. All will have our project and "has artist file at" properties added to Wikidata, and the data will be prepped and modeled for input into the Smithsonian's nascent Wikibase.
  • Changes made post-pilot:
    • Changes made during pilot experimentation include no longer creating new QIDs in a remote environment, and focusing on the artists held in the American Art and Portrait Gallery branch.
    • Moving from an OpenRefine-based reconciliation to crowd-sourcing using a Mix'n'Match catalog, with later additions using OpenRefine and/or QuickStatements.
    • New QIDs will be created when concrete biographical information can be verified from physical sources, once the Library is open post-Covid. Non-AAPG artists, particularly those that are part of the African Art and American Indian collections, will be moved to their own separate project, using verified physical materials and subject specialists.

Contributors

Workflow

Tasks

  1. Identify and/or create Properties to pull/provide data for.
  2. Select subset of names to create new internal workflows.
  3. Reconcile artist names using OpenRefine.
  4. Create new QIDs for artists. (During Pilot phase only)
  5. Outreach dashboard for the project
  6. Demo on Listeria pages
  7. Data Modeling via User Stories and WDQS
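
As a rough illustration of the WDQS and Listeria tasks above, the following sketch builds a SPARQL query that lists items on the project's focus list (P5008 = Q105757729) together with any "artist files at" (P9493) values. The function names and the row limit are hypothetical and not part of the project's documented workflow; the endpoint is the public Wikidata Query Service. The sketch only builds the query text and a request URL, and leaves fetching to the reader.

```python
from urllib.parse import urlencode

PROJECT_QID = "Q105757729"  # Art and Artists' Files focus-list item (from this page)
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata Query Service


def focus_list_query(limit: int = 100) -> str:
    """Return SPARQL listing project items and, where present, the file holder.

    Hypothetical helper for illustration only.
    """
    return f"""
SELECT ?item ?itemLabel ?holder ?holderLabel WHERE {{
  ?item wdt:P5008 wd:{PROJECT_QID} .
  OPTIONAL {{ ?item wdt:P9493 ?holder . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
""".strip()


def query_url(sparql: str) -> str:
    """Build a GET URL for the query; no network request is made here."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    print(focus_list_query(limit=10))
```

The same query text, without the Python wrapper, could be pasted into the query service's web interface or adapted for a Listeria list.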

Properties

Core Properties

Property Usage note
instance of (P31) Class of which this subject is a particular example and member; for SI, basically always "human"
family name (P734) Family name, last name, repeatable if changed with marriage, etc.
given name (P735) Given name, first name, nicknames.
sex or gender (P21) Sex or gender identity of human; Possibilities include female, male, non-binary, intersex, transgender, agender.
date of birth (P569) Date on which the subject was born; Preference for full date if available.
date of death (P570) Date on which the subject died
country of citizenship (P27) the object is a country that recognizes the subject as its citizen; can have multiple entries
occupation (P106) Occupation of a person: artist, sculptor, painter, etcher, ceramicist. Can have multiple entries; for the AAF project, always include artist (Q483501) as the broadest term, plus the most specific specialty (so etcher, not printmaker).
pseudonym (P742) Alias, AKA Note
on focus list of Wikimedia project (P5008) On Wikidata project list: AAF QID Q105757729
artist files at (P9493) [new property] institution that holds artist files about the subject. At Smithsonian, each branch library with files would enter its own QID.
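
To show how the core properties might translate into a batch for a new item, here is a minimal sketch that emits QuickStatements (version 1, tab-separated) commands. It assumes year-precision dates and an English label only; the function names, the example artist, and the default holding library are illustrative and not the project's actual batch. Labels containing double quotes would need escaping, which this sketch does not handle.

```python
from typing import List, Optional

HUMAN = "Q5"                    # instance of (P31): human
ARTIST = "Q483501"              # occupation (P106): artist, as given on this page
PROJECT = "Q105757729"          # focus-list item for the AAF project
SAAM_NPG_LIBRARY = "Q98665879"  # American Art and Portrait Gallery Library


def year_value(year: int) -> str:
    """Format a year as a QuickStatements time value with year precision (/9)."""
    return f"+{year:04d}-01-01T00:00:00Z/9"


def new_artist_commands(
    label: str,
    holder_qid: str = SAAM_NPG_LIBRARY,
    birth_year: Optional[int] = None,
    death_year: Optional[int] = None,
) -> List[str]:
    """Return tab-separated commands that create one item with core properties."""
    rows = [
        ["CREATE"],
        ["LAST", "Len", f'"{label}"'],     # English label
        ["LAST", "P31", HUMAN],
        ["LAST", "P106", ARTIST],
        ["LAST", "P5008", PROJECT],
        ["LAST", "P9493", holder_qid],     # artist files at
    ]
    if birth_year is not None:
        rows.append(["LAST", "P569", year_value(birth_year)])
    if death_year is not None:
        rows.append(["LAST", "P570", year_value(death_year)])
    return ["\t".join(row) for row in rows]


if __name__ == "__main__":
    # "Jane Example" is a made-up name used only to show the output shape.
    print("\n".join(new_artist_commands("Jane Example", birth_year=1950)))
```

Gender, citizenship, and name properties are left out on purpose, since the page asks for verified sources before those are asserted.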

Extended Properties

Property Usage note
floruit (P1317) date when the person was known to be active or alive, when birth or death not documented; active, circa, flourished. Use P2031 and P2032 for the start date and end date if known.
place of birth (P19) country of birth
place of death (P20) country of death
notable work (P800) notable scientific, artistic or literary work, or other work of significance among subject's works; particularly artworks for this Project.
movement (P135) artistic movement or scene, i.e. Surrealism, naturalism, Pop Art
family name (P734) part of full name of person
ethnic group (P172) subject's ethnicity (consensus is that a VERY high standard of proof is needed for this field to be used. In general this means 1) the subject claims it themself, or 2) it is widely agreed on by scholars, or 3) is fictional and portrayed as such)
work location (P937) location where artist was active, i.e. city, region
ethnic group (P172) ethnicity, culture, people, cultural nationality (not citizenship), race; Must have high-quality reference, map to NMAI Tribes from AAF data (Nationality); do we include "White Americans (Q49078)" where confirmed?
employer (P108) person or organization for which the subject works or worked
educated at (P69) educational institution attended by subject/artist
has works in the collection (P6379) museums/galleries that hold artworks by the artist, repeatable
Smithsonian American Art Museum person/institution ID (P1795) SAAM's constituent ID

Authority Identifiers

Property Usage note
Library of Congress authority ID (P244) LC NAF ID
WorldCat Identities ID (superseded) (P7859) WorldCat Identities ID
Union List of Artist Names ID (P245) Getty Union List of Artist Names
VIAF ID (P214) Virtual International Authority File
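
Once names have been matched to QIDs, the identifiers above can also be read back from Wikidata. The sketch below, under the assumption that a list of matched QIDs is already available, builds a SPARQL query with a VALUES clause and one OPTIONAL per identifier property. The function name and variable names are hypothetical, and the QIDs in the usage example are placeholders rather than artists from the pilot list.

```python
import re
from typing import Iterable

# SPARQL variable name -> property, taken from the identifier table above.
IDENTIFIER_PROPS = {"lcnaf": "P244", "ulan": "P245", "viaf": "P214"}
QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")


def identifier_query(qids: Iterable[str]) -> str:
    """Return SPARQL that fetches authority identifiers for the given QIDs."""
    qid_list = list(qids)
    bad = [q for q in qid_list if not QID_PATTERN.match(q)]
    if bad:
        raise ValueError(f"not valid QIDs: {bad}")

    variables = " ".join("?" + name for name in IDENTIFIER_PROPS)
    values = " ".join("wd:" + q for q in qid_list)
    optionals = "\n".join(
        f"  OPTIONAL {{ ?item wdt:{pid} ?{name} . }}"
        for name, pid in IDENTIFIER_PROPS.items()
    )
    return (
        f"SELECT ?item {variables} WHERE {{\n"
        f"  VALUES ?item {{ {values} }}\n"
        f"{optionals}\n"
        "}"
    )


if __name__ == "__main__":
    # Placeholder QIDs, used only to show the shape of the generated query.
    print(identifier_query(["Q1", "Q2"]))
```

Using OPTIONAL keeps items in the result even when one or more identifiers are missing, which makes gaps easy to spot.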


Project Year-end report

Goals & Final Product: The Artists Files PCC Wikidata Pilot Project wrapped up its work in late summer 2021. Our primary goals were to understand Wikidata as a tool, to investigate its potential as a platform for managing our existing data related to the Art & Artist Files (AAF) collection, and to learn new skills and software related to its functions. Our tasks to achieve these goals are noted above.

Our final pilot product was a complete subset of artists in our AAF collection that have been matched to, or newly created in, Wikidata, with our identified core properties added wherever we could include them, ready to add to the Smithsonian’s nascent Wikibase. In addition, we created and tested new workflows, experimented with different means of reconciling and importing changes, and determined a path forward for the much larger group of records from the AAF collection.

Challenges: Outside of the learning curves, we faced challenges that caused us to pivot mid-pilot, or in our post-pilot project continuation. The biggest challenge was our own incomplete or generic data, and the inability to confirm biographical details from a home/telework environment. For our post-pilot work, this made it especially difficult to reconcile non-Western names, for example those of artists from African countries, both because of differences in spelling between languages and because of the dearth of representation from many non-Western countries.

Our data modeling was a great exercise in determining which core and extended properties were most important to our users in describing artists, and therefore needed to be included in whatever we created. However, even this was challenging, in that we’d ideally like to know and represent all potential data points about a person! As a group, we also faced challenges in not wanting to falsely ascribe an artist’s gender or race/ethnicity/cultural affiliation, which are of key interest to researchers.

In addition, we changed workflows multiple times, first attempting to use a Google Sheets Wikidata plug-in that didn’t work as hoped, then a workflow using OpenRefine and a claim/upload/reconcile/re-enter/highlight process that worked for quality and process, but was complicated for a project that could have many hands. Our final post-pilot process uses a combination: we created a Mix’n’Match catalog as a far easier, crowd-sourceable tool for the post-pilot set of artist names, and names that are not reconciled there will be put through the OpenRefine workflow, with the potential addition of time to research print materials onsite.

Suggestions for others in future projects: We hope that the Smithsonian's Art Libraries, and others in the practice, can plan projects that add non-Western artists to Wikidata. We also feel that our final post-pilot workflow is much simpler than some of our original workflows and tools, but the challenges we faced along the way were all valuable learning opportunities.