NIH-led Effort Examines Use of Big Data for Infectious Disease Surveillance
Big data derived from electronic health records, social media, the internet and other digital sources have the potential to provide more timely and detailed information on infectious disease threats or outbreaks than traditional surveillance methods. A team of scientists led by the National Institutes of Health reviewed the growing body of research on the subject and has published its analyses in a special issue of The Journal of Infectious Diseases.
Traditional infectious disease surveillance, typically based on laboratory tests and other data collected by public health institutions, is the gold standard. But the authors note that it can have time lags, is expensive to produce, and typically lacks the local resolution needed for accurate monitoring. Further, it can be cost-prohibitive in low-income countries. In contrast, big data streams, such as those from internet queries, are available in real time and can track disease activity locally, but have their own biases. Hybrid tools that combine traditional surveillance and big data sets may provide a way forward, the scientists suggest, serving to complement, rather than replace, existing methods.
“The ultimate goal is to be able to forecast the size, peak or trajectory of an outbreak weeks or months in advance in order to better respond to infectious disease threats. Integrating big data in surveillance is a first step toward this long-term goal,” says Cecile Viboud, Ph.D., co-editor of the supplement and a senior scientist at the NIH’s Fogarty International Center. “Now that we have demonstrated proof of concept by comparing data sets in high-income countries, we can examine these models in low-resource settings where traditional surveillance is sparse.”
Experts in epidemiology, computer science and modeling collaborated on the supplement’s 10 articles. They report on the opportunities and challenges associated with three types of data: medical encounter files, such as records from healthcare facilities and insurance claim forms; crowdsourced data collected from volunteers who self-report symptoms in near real time; and data generated by the use of social media, the internet and mobile phones, which may include self-reporting of health, behavior and travel information to help elucidate disease transmission.
Read the full release at nih.gov.