Wikidata:WikiProject Zika Corpus

Welcome to WikiProject Zika Corpus
This is a WikiProject dedicated to creating a rich corpus of scholarly knowledge about the Zika virus.

About

In February 2016, the World Health Organization declared a public health emergency over the Zika virus outbreak and its links (then suspected, by now confirmed) to microcephaly and Guillain-Barré syndrome. By that time, around 150 scholarly articles had been published about the virus since its discovery in 1947, and the majority of these articles had already been assigned Wikidata items.

Since then, the literature on the topic has grown about tenfold (see timeline), and the Wikidata coverage has mostly kept pace, with a typical time lag of less than a week. While not complete, this corpus covers most PubMed-indexed English-language articles reporting or reviewing original research about the Zika virus and the infections it can cause in mosquitoes, humans and animal models, as well as about approaches to prevention, diagnostics, therapy, or surveillance.

The Zika corpus served as a nucleus for creating a citation graph on Wikidata and for exploring co-author networks and similar information there. It is now slowly expanding to encompass literature about related subjects, e.g., Flaviviridae and mosquito-borne diseases more broadly, epidemiological modeling, or data sharing in public health emergencies.

Objectives

  • Curate the corpus of Zika data
  • Create a demonstrator for WikiCite: a consistent/interesting/visualizable dataset
  • Prototype data visualization/storytelling ideas for exploring the corpus

Target Audiences

  • Sociologists of science (including STS, information science, bibliometrics and the social sciences; open question: should these be described as separate groups?)
    • democratizing access to datasets that have traditionally been controlled by a small group of academic players
    • which topics were the current authors of Zika research previously studying?
  • The general public
    • public understanding of research on Zika and how this research evolved, e.g. timelines of when the news media first reported on the virus and when it became public knowledge, compared with when the papers were published, with social media coverage, and with the geographic spread of the virus and cases over time
  • Journalists
    • how much is Zika research costing? Where is the funding coming from? Is it coming from tax dollars, and is the research coming from government organizations? This matters because the ability of our representatives and institutions to respond to global health crises depends on budget
    • what treatments are currently available? Are there advances that may provide treatment in the near future?
    • how public opinion understands or potentially distorts trustworthy information on the topic
    • personal stories

To Do

  • Extract and store author affiliations
  • Extend coverage of statements supported by specific sources
  • Add funder organizations from CrossRef Funder Registry to Wikidata. The registry is CC0, is described as "a unique taxonomy of grant-giving organizations", and is downloadable as RDF
  • Add funder information for papers
    • PMC API
    • NLP may help:
      • Councill, Isaac G., C. Lee Giles, Hui Han, and Eren Manavoglu. "Automatic acknowledgement indexing: expanding the semantics of contribution in the CiteSeer digital library." In Proceedings of the 3rd international conference on Knowledge capture, pp. 19-26. ACM, 2005. Q30046394
      • Giles, C. Lee, and Isaac G. Councill. "Who gets acknowledged: Measuring scientific contributions through automatic acknowledgment indexing." Proceedings of the National Academy of Sciences of the United States of America 101, no. 51 (2004): 17599-17604. Q30046493
      • Khabsa, Madian, Pucktada Treeratpituk, and C. Lee Giles. "Ackseer: a repository and search engine for automatically extracted acknowledgments from digital libraries." In Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries, pp. 185-194. ACM, 2012. Q30050797
      • Khabsa, M., S. Koppman, and C. L. Giles. "Towards Building and Analyzing a Social Network of Acknowledgments in Scientific and Academic Documents." In Yang, S. J., A. M. Greenberg, and M. Endsley (eds), Social Computing, Behavioral-Cultural Modeling and Prediction. SBP 2012. Lecture Notes in Computer Science, vol. 7227. Springer, Berlin, Heidelberg, 2012.
  • Add paper topics
    • Extract and add MeSH (Q199897) terms
  • Define a property to help set the boundaries of the bibliographic corpus
    • We could just define an item "Zika corpus (v1)" and set relevant items as "part of" that corpus
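
The "part of" idea in the last item could be explored with a query against the Wikidata Query Service. The following is only a minimal sketch, not an agreed design: it assumes the corpus item exists and that members are linked to it with the existing "part of" property (P361). The QID "Q0" is a placeholder, since no such corpus item has been defined yet, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Placeholder only: the "Zika corpus (v1)" item has not been created.
HYPOTHETICAL_CORPUS_QID = "Q0"


def corpus_members_query(corpus_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items that are "part of" (P361) the corpus item."""
    # Accept only well-formed item IDs such as "Q42".
    if not (corpus_qid.startswith("Q") and corpus_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {corpus_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P361 wd:{corpus_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(corpus_qid: str, limit: int = 100) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    params = {"query": corpus_members_query(corpus_qid, limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Prints the query text; no network request is made here.
    print(corpus_members_query(HYPOTHETICAL_CORPUS_QID, limit=10))
```

If a dedicated property were defined instead, as the parent item suggests, only the `wdt:P361` part of the query would need to change.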

See also

Participants

The participants listed below can be notified using the following template in discussions:

{{Ping project|Zika Corpus}}