Data sharing as a new component of addressing and preparing for disease outbreaks

About | Abstract | Timeline | Reusing data | Future? | Epilog | See also

Map of humanitarian support to the Great Eastern Japan Earthquake (by NuclearVacuum and Kahusi, CC BY-SA 3.0)

Abstract

Public health emergencies require profound and swift action at scale with limited resources, often on the basis of incomplete information and frequently under rapidly evolving circumstances. Emergency-triggered sharing goes back millennia. Data sharing is a relatively new flavour of this broader theme, but one that has received growing attention over the last few years, especially in the context of public health emergencies like the Ebola and Zika outbreaks.

In response, researchers, research institutions, journals, funders and others have taken steps towards increasing the sharing of data around ongoing public health emergencies and in preparation for future ones. These measures range from the adoption of open lab notebooks to modifications of policies and funding lines, and they include conversations around infrastructure and cultural change.

In this contribution, I will provide an overview of different ways in which the sharing of data has played a role in public health emergencies, highlighting steps that have already been taken over the last decade as well as challenges still lying ahead.

While focusing on disease outbreaks, I will draw on examples from other public health emergencies as well (e.g. earthquakes or tropical storms) and discuss their applicability in the context of infections. The examples will span the entire data life cycle of public health emergencies, from preventive measures and routine public health surveillance data to the tracking of pathogens, investigating pathogen transmission and other host-pathogen interactions, as well as diagnostics, vaccination, epidemiological modelling, data ethics and other related topics, concluding with considerations around the potential impact of preserving and sharing data, or failing at that.

Historic timeline

  • 1854: Cholera in London
  • 2002: SARS
  • 2004: Indian Ocean earthquake
  • 2005: H5N1
  • 2005: Hurricane Katrina
  • 2009: H1N1
  • 2010: Haiti earthquake
  • 2011: Tōhoku earthquake
  • 2011: Escherichia coli O104:H4
  • 2013: Typhoon Haiyan
  • 2015: Ebola
  • 2016: Zika

Map of 1854 Broad Street cholera outbreak

John Snow's original map of the 1854 Broad Street cholera outbreak in London. Cholera cases are highlighted in black, as are water pumps (data available here). The pump on Broad Street was identified as the one through which the contaminated water was distributed. Removing its handle then essentially stopped the outbreak, and by the time the next cholera outbreak hit London in 1866, sanitary measures had been improved.

Map of the Severe Acute Respiratory Syndrome epidemic 2002-2003

  • Anatomy of the Epidemiological Literature on the 2003 SARS Outbreaks in Hong Kong and Toronto: A Time-Stratified Review
    • Only 22% of the studies were submitted, 8% accepted, and 7% published during the epidemic.

    • Just as theoretical modelers have shifted to real-time approaches, for example, to estimate the basic/effective reproduction number of an epidemic [29], “field epidemiologists” should benefit from real-time tools for protocol writing and questionnaire design, and have them readily available on the Web so as to be prepared for the next emerging disease outbreak.

    • editorial commentary
      • Based on their findings, it would be hard to conclude that journal publication was a successful mechanism for rapidly sharing information.

      • Whatever the answers may be, it seems clear that before the next public health emergency strikes, the scientific publishing establishment needs to ask itself how it can respond in the way the world needs.

Flyers with information about missing persons line the walls outside Tentera Nasional Indonesia Military Hospital in Banda Aceh (by Rebecca J. Moat, public domain)

  • paper on mass fatality management
    • from the editor's summary:
      • None of the countries had sufficient refrigerated storage available to store bodies until they could be identified.

      • Methods and efficiency of identification varied between and within countries. One hospital in Sri Lanka excelled by systematically photographing all bodies brought in and recording sex, height, and personal effects: 87% of the bodies brought here were identified. But in most areas rates of identification were much lower. It seemed that simple methods of identification were the most useful: photographs taken quickly before the bodies started to decompose, dental records, and personal effects found on the bodies. DNA analysis was only useful for a small number of bodies.

Map of the global spread of H5N1 in 2005 (by Zuanzuanfuwa, public domain)

  • Global Initiative on Sharing Avian Influenza Data (GISAID)
    • paper
      • to understand better the spread and evolution of the virus, and the determinants of its transmissibility and pathogenicity in humans [..] demands that scientists with different fields of expertise have full access to comprehensive genetic-sequence, clinical and epidemiological data from both animal and human virus isolates.

      • The current level of collection and sharing of data is inadequate, however, given the magnitude of the threat.

      • We propose to expand and complement existing efforts with the creation of a global consortium — the Global Initiative on Sharing Avian Influenza Data (GISAID; http://gisaid.org) — that would foster international sharing of avian influenza isolates and data.

New Orleans flooded by Hurricane Katrina (by Jocelyn Augustino, public domain)

Map of H1N1 cases (by HotWikiBR, public domain)

OpenStreetMap of Port-Au-Prince and Carrefour (by OpenStreetMap, CC BY-SA 2.0)

  • OpenStreetMap response

  • Crisis camp

  • Random Hacks of Kindness

  • Cholera Epidemics of the Past Offer New Insights Into an Old Enemy

    • Mathematical modeling of the spread and health impact of cholera is a key effort to guide policy makers and intervention planners about the projected impact of interventions, such as vaccinations, in contemporary outbreaks [5–7].

    • Despite this public health importance, key aspects of cholera disease dynamics [..] contain a large amount of uncertainty or remain unresolved [8]. The high-quality epidemiological data needed to address these uncertainties are often lacking, especially in outbreak situations [6].

    • To fill this data void, we investigated data from an underutilized source: 19th-century cholera epidemics in Europe. Denmark provides an excellent source as its population was not exposed to cholera, likely due to a quarantine at the Danish coast [9]. Finally, in 1853, a year after the quarantine was lifted, a single and catastrophic outbreak hit the nation, including Copenhagen.

    • Time series of daily cholera morbidity and mortality counts by age and sex were obtained from datasets compiled by contemporary physicians in 3 towns and cities in Denmark: Copenhagen and Aalborg in 1853 and Korsør in 1857.

    • To provide epidemiological context to the outbreaks, we acquired cause-specific mortality data for the surrounding years for Copenhagen. We also obtained age-specific population data for Copenhagen, Aalborg, and Korsør.

  • Also in 2010: Sharing health data: good intentions are not enough

    • As they prepare for careers in science, today’s students doubtless hear the same clichés as we did a generation ago: science advances collaboratively; we reproduce and extend the work of others; we stand on the shoulders of giants. In some fields, such as genomics, these axioms are becoming true. In epidemiology and public health, however, data sharing and collaboration remain more aspirational than real.

    • Students embark on a career in health research in the spirit of sharing; they want to help improve the well-being of others. For all the talk of collaboration, they will enter a world in which another axiom dominates: “publish or perish”. That system puts the interests of public health researchers in direct conflict with the interests of public health.

    • Genomics has taught us that sharing data with other scientists is a way to add value without costing a lot. It allows the same data to be used to answer new questions that may be relevant far beyond the original study. And it allows for meta-analyses that are free from the distortions introduced when only summary results are available.3,4 We could get far more out of public health research if we followed a similar path, if we squeezed more scientific and policy insights out of data that have already been collected.

    • Sailors keeping log books on whaling boats in the 1600s could not have predicted that, centuries later, the data would be an important source of information for climate change scientists.25

    • Here we propose several goals to which funders and researchers can jointly aspire and towards which progress can be measured:

      • (i) all data of potential public health importance funded by taxpayers or foundations will be appropriately documented and archived in formats accessible to the wider scientific community;
      • (ii) all data provided by governments to databases developed by publicly-funded organizations will be freely available to any user, at the level of detail at which it was provided;
      • (iii) the publication of a research article in a biomedical journal will be accompanied by the publication of the data set upon which the analysis is based;
      • (iv) funders and employers of researchers will consider publication of well managed data sets as an important indicator of success in research, and will reward researchers professionally for sharing data; and
      • (v) all planned research will budget and be funded to manage data professionally to a quality adequate for archiving and sharing.

Deaths and missing persons by prefecture from 2011 Tohoku Earthquake (by mti, InoueKeisuke, CES1596, public domain)

EHEC outbreak stats

Typhoon Haiyan as seen from the International Space Station (by NASA/Karen Nyberg, public domain)

Ebola virus particles (NIAID, CC BY 2.0)

Ebola: safe burial (CDC, CC BY 2.0)

Zika virus cryo-EM structure (Starless, public domain, based on open data)

Aedes aegypti female feeding on human blood (CDC, public domain). These mosquitos can transmit multiple viruses, including the Zika virus.

Pool on an abandoned property (Idamantium, CC BY-SA 3.0)

Abandoned tyres, another breeding ground for mosquitos (Gerd Danigel, CC BY-SA 4.0)

Reusing data

Data modeling

Future?

Endangered Data Week

Epilog

A fire truck being unloaded as part of humanitarian assistance after the 2009 Samoa earthquake and tsunami

Humanity should continue the great tradition of sharing physical goods and logistics in the case of public health emergencies ...

The sharing of data and metadata relating to public health emergencies should be routine, not the exception.

... and the routine sharing of data and metadata relating to public health emergencies should become part of this tradition.

(Fire truck photo by Australian Department of Foreign Affairs and Trade, CC BY 2.0; data sharing illustration by Ainsley Seago, CC BY 4.0)

About

This file hosts a contribution to SciDataCon 2018. It continues a series of talks on the subject that started with a SciDataCon 2016 talk. For SciDataCon 2018, it was originally submitted as a separate session, "Data sharing in public health emergencies". That session was merged into a related session, "Open Data from Cell Biology of Infectious Pathogens, how far are we?", which was in turn merged into the session "Health Databases across the African Continent: What do we have and what do we need for Sustainable Development?". The merged session takes place on 6 November at 14:00–15:30 in rooms Serondela 1 and 2, where this contribution will be one of four oral presentations leading into a panel discussion.

For a previous version of this talk, given on February 26, 2018 as part of Endangered Data Week, see here.

Title of the original session submission

Data sharing in public health emergencies

Session description for the original session submission

Humanity has a long history of sharing, especially in the wake of disasters. Data sharing is a relatively new flavour under this broader theme, but one that has been receiving attention over the last few years, especially in the context of public health emergencies like the Ebola or Zika outbreaks.

In response, researchers, research institutions, journals, funders and others have taken steps towards increasing the sharing of data around public health emergencies. These measures range from the adoption of open lab notebooks to modifications of policies and funding lines, and they include conversations around infrastructure and cultural change.

In this session, representatives from various stakeholder groups are invited to contribute their perspectives in a series of lightning talks, highlighting steps that have already been taken as well as challenges still lying ahead.

Together, these talks and the ensuing discussion are meant to span the entire data life cycle of public health emergencies, from preventive measures and routine public health surveillance data to the tracking of pathogens, investigating pathogen transmission and other host-pathogen interactions, as well as diagnostics, vaccination, epidemiological modelling and other related efforts.

This submission addresses all four of the high-level themes of SciDataCon 2018:

  • the digital frontiers of global science;
  • a global and inclusive data revolution;
  • applications, progress and challenges of data intensive research;
  • data infrastructure and enabling practices for international and collaborative research.

A version of this abstract can be found at https://github.com/Daniel-Mietchen/events/blob/master/SciDataCon-2018-data-sharing.md .

See also