What the study found
The study found that machine learning and deep learning models could be used to forecast monthly tuberculosis incidence across cities and counties in Taiwan, China. The top-performing models were CatBoost, random forest, and gradient boosting. The authors also identified population size, sulfur dioxide levels, physician count, normalized difference vegetation index, wind velocity, and precipitation level as the main influences on tuberculosis incidence.
Why the authors say this matters
The authors conclude that the framework and findings provide supporting data and a basis for decisions about tuberculosis mitigation initiatives worldwide. The study suggests that identifying the most important drivers may help guide tuberculosis control efforts.
What the researchers tested
The researchers analyzed data from 19 cities and counties in Taiwan, China from 2014 to 2022. They used four machine learning models and four deep learning models to predict monthly tuberculosis incidence based on 12 drivers, and combined the best models with post-hoc explainable machine learning techniques. They also used stepwise regression and statistical assessments to find a model with fewer drivers while keeping high predictive accuracy.
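As a rough illustration of what comparing several models on a set of drivers can look like, here is a minimal sketch using scikit-learn. It is not the authors' pipeline: the data are synthetic, the 12 drivers are unnamed placeholders, the target formula is invented, and only two of the model families named in the study are included. The random train/test split is also an assumption made for brevity; a real forecasting evaluation would more likely hold out the most recent months.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study's data: 19 regions x 108 months (2014-2022)
# and 12 placeholder drivers. None of these values come from the paper.
rng = np.random.default_rng(0)
n_rows, n_drivers = 19 * 108, 12
X = rng.normal(size=(n_rows, n_drivers))

# Invented target: a linear effect, a threshold effect, and a nonlinear effect
# on the first three drivers, plus noise. The other nine drivers are irrelevant.
y = (
    2.0 * X[:, 0]
    + np.where(X[:, 1] > 0.5, 1.5, 0.0)
    + 0.5 * X[:, 2] ** 2
    + rng.normal(scale=0.3, size=n_rows)
)

# Random hold-out split (assumption; ignores time ordering).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model and record its R^2 on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Print models from best to worst held-out score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```

The same loop extends to other regressors that follow the scikit-learn `fit`/`predict` interface, which is how a larger comparison across model families could be organized.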
What worked and what didn't
CatBoost, random forest, and gradient boosting performed best among the tested models. The explainable methods consistently highlighted the same main influences: population size, sulfur dioxide levels, physician count, normalized difference vegetation index, wind velocity, and precipitation level. The study also reported nonlinear interactions and threshold effects between these factors and tuberculosis incidence.
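The summary does not say which post-hoc explanation techniques the authors used. As one common example of the general idea, the sketch below ranks drivers by permutation importance: each driver is shuffled in turn, and the drop in the model's held-out score is measured. The data, the six `driver_*` names, and the target formula are all invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data with hypothetical driver names (not the study's variables).
rng = np.random.default_rng(1)
names = [f"driver_{i}" for i in range(6)]
X = rng.normal(size=(1000, 6))

# Invented target: driver_0 has a strong linear effect, driver_1 has a
# threshold effect, and the remaining drivers have no effect at all.
y = (
    3.0 * X[:, 0]
    + np.where(X[:, 1] > 0.0, 2.0, 0.0)
    + rng.normal(scale=0.3, size=1000)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
model = RandomForestRegressor(n_estimators=100, random_state=1)
model.fit(X_train, y_train)

# Shuffle each driver 10 times on the held-out rows and average the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=1
)

# Rank drivers from most to least important.
ranking = sorted(
    zip(names, result.importances_mean), key=lambda kv: kv[1], reverse=True
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

A ranking like this says which inputs the fitted model relies on, not which factors cause the outcome, so it is best read alongside the model's predictive accuracy.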
What to keep in mind
The abstract does not describe detailed limitations or uncertainty measures. The findings rest on monthly incidence data from Taiwan, China covering 2014 to 2022, so, going by the available text, they may not generalize beyond that setting.
Key points
- CatBoost, random forest, and gradient boosting were the top-performing prediction models.
- Population size and sulfur dioxide levels were among the main influences on tuberculosis incidence.
- Physician count, vegetation index, wind velocity, and precipitation level were also identified as important drivers.
- The study reported nonlinear interactions and threshold effects between the determinants and tuberculosis incidence.
- Stepwise regression was used to find a smaller model with high predictive accuracy.
Disclosure
- Research title: Machine-learning models identified key drivers of tuberculosis incidence in Taiwan
- Authors: Yiwen Tao, Jiaxin Zhao, Hao Cui, Zhanlue Liang, Jian Li, Jingli Ren, Huaiping Zhu
- Institutions: Sichuan University, The University of Queensland, West China Hospital of Sichuan University, York University, Zhengzhou University, Zhengzhou University of Science and Technology
- Publication date: 2026-02-26

