AUTHOR=Estes Lyndon D. , Ye Su , Song Lei , Luo Boka , Eastman J. Ronald , Meng Zhenhua , Zhang Qi , McRitchie Dennis , Debats Stephanie R. , Muhando Justus , Amukoa Angeline H. , Kaloo Brian W. , Makuru Jackson , Mbatia Ben K. , Muasa Isaac M. , Mucha Julius , Mugami Adelide M. , Mugami Judith M. , Muinde Francis W. , Mwawaza Fredrick M. , Ochieng Jeff , Oduol Charles J. , Oduor Purent , Wanjiku Thuo , Wanyoike Joseph G. , Avery Ryan B. , Caylor Kelly K. TITLE=High Resolution, Annual Maps of Field Boundaries for Smallholder-Dominated Croplands at National Scales JOURNAL=Frontiers in Artificial Intelligence VOLUME=4 YEAR=2022 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2021.744863 DOI=10.3389/frai.2021.744863 ISSN=2624-8212 ABSTRACT=
Mapping the characteristics of Africa’s smallholder-dominated croplands, including the sizes and numbers of fields, can provide critical insights into food security and a range of other socioeconomic and environmental concerns. However, accurately mapping these systems is difficult because there is 1) a spatial and temporal mismatch between satellite sensors and smallholder fields, and 2) a lack of high-quality labels needed to train and assess machine learning classifiers. We developed an approach designed to address these two problems, and used it to map Ghana’s croplands. To overcome the spatio-temporal mismatch, we converted daily, high-resolution imagery into two cloud-free composites (the primary growing season and subsequent dry season) covering the 2018 agricultural year, providing a seasonal contrast that helps to improve classification accuracy. To address the problem of label availability, we created a platform that rigorously assesses and minimizes label error, and used it to iteratively train a Random Forests classifier with active learning, which identifies the most informative training sample based on prediction uncertainty. Minimizing label errors improved model F1 scores by up to 25%. Active learning increased F1 scores by an average of 9.1% between first and last training iterations, and 2.3% more than models trained with randomly selected labels. We used the resulting 3.7 m map of cropland probabilities within a segmentation algorithm to delineate crop field boundaries. Using an independent map reference sample (