AUTHOR=Finnegan Amy, Potenziani David D., Karutu Caroline, Wanyana Irene, Matsiko Nicholas, Elahi Cyrus, Mijumbi Nobert, Stanley Richard, Vota Wayan
TITLE=Deploying machine learning with messy, real world data in low- and middle-income countries: Developing a global health use case
JOURNAL=Frontiers in Big Data
VOLUME=5
YEAR=2022
URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2022.553673
DOI=10.3389/fdata.2022.553673
ISSN=2624-909X
ABSTRACT=

The rapid emergence of machine learning, in the form of large-scale computational statistics, together with the accumulation of data offers global health implementing partners an opportunity to adopt, adapt, and apply these techniques and technologies in the low- and middle-income country (LMIC) contexts where we work. The benefits of machine learning remain just out of reach for many implementing partners because they lack the experience and specific skills to use it. Yet the growth of available analytical systems and the exponential growth of data require the global digital health community to become conversant in this technology in order to keep contributing to our missions. In this community case study, we describe the approach we took at IntraHealth International to inform the use case for machine learning in global health and development. We found that the data needed to take advantage of machine learning were plentiful and that an international, interdisciplinary team can be formed to collect, clean, and analyze the data at hand using cloud-based tools (e.g., Dropbox, Google Drive) and open-source tools (e.g., R). We organized our work as a “sprint” lasting roughly 10 weeks so that we could rapidly prototype these approaches and achieve institutional buy-in. Our initial sprint led to two requests in subsequent workplans for analytics using the data we compiled, and it directly impacted program implementation.