AUTHOR=Karale Yogita , Yuan May TITLE=Spatially lagged predictors from a wider area improve PM2.5 estimation at a finer temporal interval—A case study of Dallas-Fort Worth, United States JOURNAL=Frontiers in Remote Sensing VOLUME=4 YEAR=2023 URL=https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2023.1041466 DOI=10.3389/frsen.2023.1041466 ISSN=2673-6187 ABSTRACT=

Fine particulate matter, also known as PM2.5, has many adverse impacts on human health. However, there are few ground monitoring stations measuring PM2.5. Satellite data help fill the gaps in ground measurements, but most studies focus on estimating daily PM2.5 levels. Studies examining the effects of the environmental exposome need accurate PM2.5 estimates at fine temporal intervals. This work developed a Convolutional Neural Network (CNN) to estimate the hourly average PM2.5 concentration using high-resolution Aerosol Optical Depth (AOD) from the MODIS MAIAC algorithm and meteorological data. Satellite-acquired AOD data are instantaneous measurements, whereas stations on the ground provide an hourly average of PM2.5 concentration. The current work aimed to refine the temporal interval of PM2.5 estimates from 24-h to 1-h averages. Our premise was that spatial convolution enables this temporal refinement of PM2.5 estimates. We trained a CNN to estimate PM2.5 corresponding to the hour of AOD acquisition in the Dallas-Fort Worth and surrounding area using 10 years of data from 2006–2015. The CNN accepts images as input. For each PM2.5 station, we subset the temporal MODIS images into image-patches centered on the station. Hence, the image-patch size represented the size of the area around the PM2.5 station and was thus analogous to spatial lag in spatial statistics. We systematically increased the image-patch size from 3 × 3, 5 × 5, … , to 19 × 19 km2 and observed how increasing the spatial lag impacted PM2.5 estimation. Model performance improved with a larger spatial lag; the model with a 19 × 19 km2 image-patch as input performed best, with a correlation coefficient of 0.87 and an RMSE of 2.57 µg/m3 when estimating PM2.5 at in situ stations for the hour of satellite acquisition.
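The image-patch idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2-D AOD grid at 1 km per pixel, a station already mapped to a grid row and column, and a rule that drops any patch containing missing AOD. The names `extract_patch`, `aod`, and `station` are hypothetical.

```python
import numpy as np

def extract_patch(grid, row, col, size):
    """Return the size x size window centred on (row, col), or None.

    Assumption: a patch is unusable if it runs off the grid or
    contains any missing (NaN) AOD value.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the station sits at the centre")
    half = size // 2
    r0, r1 = row - half, row + half + 1
    c0, c1 = col - half, col + half + 1
    if r0 < 0 or c0 < 0 or r1 > grid.shape[0] or c1 > grid.shape[1]:
        return None  # window extends beyond the image
    patch = grid[r0:r1, c0:c1]
    if np.isnan(patch).any():
        return None  # missing AOD inside the window
    return patch

# Synthetic 25 x 25 "AOD" grid with one missing pixel in the corner.
aod = np.arange(25 * 25, dtype=float).reshape(25, 25)
aod[0, 0] = np.nan

# Hypothetical station location in grid coordinates.
station = (12, 12)

# Patch sizes 3, 5, ..., 19, matching the range in the abstract.
patches = {k: extract_patch(aod, station[0], station[1], k)
           for k in range(3, 20, 2)}
```

Each entry in `patches` covers a progressively wider neighbourhood around the same station, which is the sense in which patch size plays the role of spatial lag.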
Missing AOD reduced the number of image-patches available for training, so the study employed a data augmentation technique to increase the number of training samples. In addition to helping avoid overfitting, data augmentation also improved model performance.
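The abstract does not say which augmentation technique was used. As a hedged illustration only, the sketch below assumes flips and 90-degree rotations, a common choice for square image-patches because the centre pixel (the station location) stays in place. The function name `augment_patch` is hypothetical.

```python
import numpy as np

def augment_patch(patch):
    """Return the 8 flip/rotation variants of a square patch.

    Assumption: flips and 90-degree rotations are used here purely as an
    example; the source does not specify the augmentation technique.
    The unmodified patch is the first element of the returned list.
    """
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)          # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirrored copy of each rotation
    return variants

# Small synthetic patch; the centre value stands in for the station pixel.
patch = np.arange(9, dtype=float).reshape(3, 3)
augmented = augment_patch(patch)
```

Under this assumption every variant would keep the PM2.5 label of the original patch, so one usable patch yields up to eight training samples.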