Anyone can download the source code from GitHub (which is kind of like Twitter for sharing code), run genetic sequencing data for whatever outbreak they are following through the pipeline and build a web page showing a phylogenetic tree, or genetic history, in a few minutes, he said.
Real-time tracking of genetic mutations during disease outbreaks helps scientists discern what makes viruses so severe and informs public health efforts to contain them, whether that means setting up treatment and isolation units for Ebola or instituting mosquito control for Zika. The tool can be used to model something as small as a single hospital outbreak or as large as a global pandemic.
Bedford and Neher were among six teams of finalists chosen from 96 entries representing 450 innovators and 45 countries. The sponsors of the competition (the U.S. National Institutes of Health, the British-based charitable foundation Wellcome Trust and the U.S.-based Howard Hughes Medical Institute) announced the first-phase winners Monday at the 7th Health Datapalooza Conference in Washington, D.C., which brings together companies, startups, academics and government agencies that believe in “liberating” health data to improve patient care.
‘Moonshot’ goals: Data sharing, collaboration
The competition’s goal is to stimulate collaboration, also a key directive of Vice President Joe Biden’s National Cancer Moonshot Initiative, by encouraging the development of new tools and platforms that make data open and findable for use by other scientists.
Bedford and Neher, who heads a group at the Max Planck Institute for Developmental Biology in Tuebingen, Germany, met in early 2012 at a flu symposium and started working together on the predecessor to nextstrain in fall 2014. Nextstrain is based on Bedford and Neher’s earlier code for analyzing and illustrating patterns of epidemic growth, geographic spread and adaptive evolution of the influenza virus.
Called nextflu, it is now used by the World Health Organization and the U.S. Centers for Disease Control and Prevention to help determine which flu virus strain to include in each year’s flu vaccine. Influenza mutates rapidly, quickly outfoxing our immune response, which is why we need a flu shot every year instead of one, lifelong vaccination. The goal of the nextflu project is to make sure the vaccine closely matches circulating strains.
While the two scientists will continue working to improve and update nextflu, they wanted to expand the concept to work for other viruses.
"I'm imagining at some point, nextstrain will have all the viruses on it, basically," Bedford said.
Open science
The operative words to describe both nextflu and nextstrain are “real time.” Scientists have long analyzed and illustrated patterns of epidemic growth, spread and evolution retrospectively. Bedford and Neher are doing it in real time and predictively, a feat made possible by the increasing availability of genomic data — and the willingness of researchers to share that data.
Bedford is quick to point out that the basics of building a phylogenetic tree from sequence data are not that difficult (at least for a computational biologist). But because the code is modular and openly shared, no one has to start from scratch.
“There are huge efficiencies to be gained by not having to do the same thing over again,” Bedford said. “And generally, you’ll get more out of it if you have more people looking at it.”
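The basic tree-building step Bedford describes can be sketched in a few lines of Python. What follows is only a toy illustration, not the nextflu or nextstrain code: it assumes the sequences are already aligned to the same length, measures how many positions differ between each pair, and groups the closest ones first using a textbook clustering method called UPGMA. The function names and the four short sequences are made up for the example.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def upgma(seqs: dict) -> str:
    """Cluster aligned sequences with UPGMA and return the topology as Newick.

    Branch lengths are left out to keep the sketch short.
    """
    # Every cluster is keyed by its Newick label and stores its leaf count.
    sizes = {name: 1 for name in seqs}
    # Pairwise distances between clusters, keyed by an unordered pair of labels.
    dist = {
        frozenset((a, b)): float(hamming(seqs[a], seqs[b]))
        for a, b in combinations(seqs, 2)
    }
    while len(sizes) > 1:
        # Pick the closest pair; ties are broken by label so runs are repeatable.
        pair = min(dist, key=lambda p: (dist[p], sorted(p)))
        a, b = sorted(pair)
        merged = f"({a},{b})"
        del dist[pair]
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances (the UPGMA rule).
        for c in sizes:
            dac = dist.pop(frozenset((a, c)))
            dbc = dist.pop(frozenset((b, c)))
            dist[frozenset((merged, c))] = (na * dac + nb * dbc) / (na + nb)
        sizes[merged] = na + nb
    (root,) = sizes
    return root + ";"


if __name__ == "__main__":
    # Hypothetical, hand-written sequences; real data would come from an alignment.
    toy = {
        "A": "ACGTACGT",
        "B": "ACGTACGA",
        "C": "ACGTTCAA",
        "D": "TCGATTAA",
    }
    print(upgma(toy))
```

The output is a tree in Newick notation, a plain-text format that most tree viewers can read. Real pipelines layer far more on top of this, including alignment, statistical models of mutation, dating and geography, which is the work that shared, modular code saves others from repeating.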
In a sense, the two scientists’ recognition in the Open Science competition is a salute to two types of openness: their willingness to share their code and the tool’s dependence on other scientists sharing genomic sequence data.
In a world of competition to publish in prestigious journals and stake claims to discoveries, the open science movement, built on the belief that data and methods should be open and shared, is not embraced by every scientist. But evolutionary and computational biologists like Bedford and Neher are in the movement’s vanguard. It’s a matter of culture, Bedford believes. One spur is that their fields are the ones most concerned with outbreaks, where waiting to publish can have deadly consequences.
“There is a movement away from only sharing at publication to releasing data and manuscripts early, before publication, so it can be commented upon and evolve before it’s actually published,” he said. “It is moving away from publication as the only means of scholarly communication.”
Bedford believes that efforts such as the vice president’s to promote data-sharing and collaboration help the cause.
“Culture is not evenly distributed, so different fields are going to behave differently,” he said. “But there is definitely a movement toward being more open.”
What’s next
For Phase II, the six winning Phase I teams will submit their prototypes, a brief progress report and a case for the award. As befits an “open science” competition, the proposals will be put to a public vote. The three that rank highest will be reviewed by expert advisers, and a panel of judges will pick the winner based on actual benefit and future impact, degree of innovation and level of demand and utility. The winner of the top prize of $230,000 will be announced in late February or early March 2017.
In the meantime, each team of finalists receives $80,000, funding that Bedford and Neher — who now do all of the coding for nextflu and nextstrain themselves — intend to put toward hiring a programmer.
But most of all, this prestigious new prize honoring open science will help further a cause that Bedford and Neher believe in deeply.
“It’s nice to have the recognition for what I’ve been trying to push for a while — this open science,” Bedford said. “The best part is the outreach and having promotion for what we’re doing.”