Machine Learning Applications in Population and Public Health: Guidelines for Development, Testing, and Implementation

Upstream Lab director Dr. Andrew Pinto and a diverse team of experts have developed new guidelines for using machine learning in population and public health, with the aim of mitigating bias and promoting health equity.

Machine learning is being used in public health to provide early warning of infectious disease outbreaks, predict the future burden of noncommunicable diseases, and assess public health interventions. However, machine learning can inadvertently produce biased outputs, depending on the quality and quantity of the data, who is engaged in directing the analysis, and how findings are interpreted.

These guidelines were developed in consultation with experts from multiple fields – including computer science, statistical modeling, clinical and population health epidemiology, health economics, ethics, sociology, and public health – and drew upon literature reviews.

The five key recommendations are:

1) Prioritize partnerships and interventions to support communities considered structurally disadvantaged

2) Use machine learning for dynamic situations, such as public health emergencies, while adhering to ethical standards

3) Conduct risk assessments and apply bias mitigation strategies aligned with identified risks

4) Ensure technical transparency and reproducibility by publicly sharing data sources and methodologies

5) Foster multidisciplinary dialogue to discuss the potential harms of machine learning-related bias and raise awareness among the public and public health community

Published in JMIR Public Health and Surveillance
