Overview
This narrative review examines how big data ecosystems and machine learning algorithms are being integrated into infectious disease surveillance and control systems in the United States. It addresses longstanding deficiencies in U.S. surveillance that recent outbreaks have exposed: delayed feedback mechanisms, inefficient data infrastructure, and limited predictive capacity. The review synthesizes current literature on how emerging computational methods and expanded data sources can improve the timeliness, accuracy, and robustness of infectious disease monitoring and response at the national level.
Methods and approach
The review systematically categorizes both conventional and emerging data sources applicable to infectious disease surveillance. Conventional public health data repositories are examined alongside digital, genomic, and non-conventional sources including electronic health records, syndromic surveillance systems, mobility datasets, social media platforms, wearable biosensing devices, and genomic pathogen sequencing data. The methodological framework encompasses multiple machine learning paradigms: supervised learning, unsupervised learning, and deep learning architectures. These computational approaches are evaluated specifically for their applicability to three core functions: disease detection, epidemic forecasting, and risk assessment. The synthesis focuses on practical implementations within the U.S. public health infrastructure context.
Key findings
The review identifies multiple operational domains where machine learning applications have demonstrated utility in infectious disease control. Documented applications include early warning systems for outbreak detection, disease control interventions, resource allocation optimization, and precision medicine approaches tailored to public health contexts. The analysis demonstrates how machine learning-enabled systems can address specific operational challenges in U.S. surveillance infrastructure by leveraging diverse data streams from both traditional epidemiological sources and novel digital health platforms. The integration of genomic sequencing data with computational analytics represents a particularly significant advancement for pathogen tracking and characterization.
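To make the idea of an early warning system for outbreak detection concrete, the sketch below flags days on which a syndromic count rises well above a short moving baseline. It is an illustrative assumption, not a method taken from the review: the function name `flag_aberrations`, the 7-day baseline, the threshold of 3 standard deviations, the floor on the standard deviation, and the sample counts are all hypothetical choices made for this example.

```python
from statistics import mean, stdev


def flag_aberrations(counts, baseline_days=7, threshold=3.0):
    """Return indices of days whose count is unusually high.

    A day is flagged when its count exceeds the mean of the preceding
    `baseline_days` days by more than `threshold` standard deviations.
    All parameter defaults are illustrative assumptions.
    """
    flagged = []
    for day in range(baseline_days, len(counts)):
        baseline = counts[day - baseline_days:day]
        mu = mean(baseline)
        # Floor the standard deviation so a perfectly flat baseline
        # does not cause a division by zero (hypothetical choice).
        sigma = max(stdev(baseline), 0.5)
        z_score = (counts[day] - mu) / sigma
        if z_score > threshold:
            flagged.append(day)
    return flagged


# Made-up daily emergency department visit counts with one spike at index 10.
daily_visits = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 31, 14]
print(flag_aberrations(daily_visits))  # prints [10]
```

A rule like this is only a baseline. The machine learning approaches the review surveys would typically draw on several data streams at once rather than a single count series.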
Implications
The findings indicate substantial potential for machine learning methodologies to transform infectious disease surveillance capabilities in the United States through improvements in accuracy, speed, and system robustness. The review identifies future research directions necessary for advancing machine learning applications in disease control, suggesting that continued development of these computational approaches could address critical gaps in current surveillance infrastructure. The integration of big data ecosystems with machine learning algorithms represents a foundational shift in epidemiological practice, moving beyond reactive surveillance toward predictive and proactive public health systems. Implementation of these technologies at scale requires continued attention to data integration challenges, algorithmic validation, and operational deployment within existing public health frameworks.
Key points
- Research title: Big Data and Machine Learning Applications for Enhanced U.S. Infectious Disease Surveillance and Control: A Narrative Review
- Authors: Merrera S. Kebeba, Emmanuel Amoako Agyei
- Institutions: Santa Clara County Behavioral Health Services, Washington University in St. Louis
- Publication date: 2026-02-24
- DOI: https://doi.org/10.5281/zenodo.18761941
- Image credit: Photo by ThisisEngineering on Unsplash
- Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.

