The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Appl. Math. Stat.
Sec. Statistics and Probability
Volume 10 - 2024
doi: 10.3389/fams.2024.1368147
Quantifying impact of correlated predictors on low-cost sensor PM 2.5 data using KZ-filter
Provisionally accepted
- 1 Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York City, New York, United States
- 2 Department of Mathematics, Clarkson University, Potsdam, United States
- 3 Department of Biology, Clarkson University, Potsdam, United States
- 4 The State University of New York (SUNY), Albany, United States
- 5 Department of Computer Science, State University of New York (SUNY), Potsdam, NY, United States
- 6 Department of Mechanical and Aeronautical Engineering, Wallace H. Coulter School of Engineering, Clarkson University, Potsdam, New York, United States
PM 2.5, fine particulate matter with a diameter smaller than 2.5 µm, is associated with a range of health problems. Monitoring PM 2.5 levels at the community scale is crucial for understanding personal exposure and implementing preventive measures. Although monitoring agencies around the world, such as the U.S. Environmental Protection Agency (EPA), provide accurate data, their spatial coverage is limited by sparse monitoring networks. Recently, low-cost air quality sensor networks have made air quality data available at higher spatiotemporal resolution, which is more representative of personal exposure. However, concerns persist regarding the sensitivity, noise, and reliability of data from these low-cost sensors. In this work, we analyzed PM 2.5 data from both EPA and Purple Air (PA) sensors in Cook County, Illinois, with two primary goals: (1) understanding the differential impact of meteorological factors on the PA and EPA sensor networks, and (2) providing a mathematical approach to quantify the individual impact of correlated predictors on both short-term and baseline variations in noisy time series data. We used the Kolmogorov-Zurbenko (KZ) filter to separate each time series into short-term and baseline components, and then fitted linear models to quantify the impact of meteorological predictors, including temperature, relative humidity (RH), wind speed (WS), and wind direction (WD). Furthermore, we applied the Lindeman, Merenda, and Gold (LMG) method to these linear models to quantify the individual contribution of each predictor in the presence of multicollinearity. Our results show that PM 2.5 data from PA sensors are more sensitive to meteorological factors than EPA data, particularly to wind speed in the short-term component and to RH in the baseline component. This method provides a structured approach for analyzing noisy sensor data under diverse environmental conditions.
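The two-stage procedure described above (KZ decomposition, then LMG attribution on a linear model) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the synthetic data, and the KZ window length and iteration count (15 and 3) are placeholders chosen for illustration, since the abstract does not state the parameters used. The sketch assumes the standard definitions: the KZ filter as repeated passes of a centered moving average, and LMG as the average, over all predictor orderings, of each predictor's sequential increase in R².

```python
import itertools
import math

import numpy as np


def kz_filter(x, window, iterations):
    """KZ filter: `iterations` passes of a centered moving average of odd
    length `window`. Edge points that lack a full window are set to NaN."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    x = np.asarray(x, dtype=float)
    y = x.copy()
    kernel = np.ones(window) / window
    for _ in range(iterations):
        y = np.convolve(y, kernel, mode="valid")
    half = iterations * (window - 1) // 2  # points lost at each end
    out = np.full(len(x), np.nan)
    out[half:len(x) - half] = y
    return out


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X plus an intercept."""
    if X.shape[1] == 0:
        return 0.0
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def lmg(X, y):
    """LMG shares: each predictor's R^2 increment averaged over all orderings,
    computed with the equivalent subset-weighted formula."""
    p = X.shape[1]
    shares = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = (math.factorial(size) * math.factorial(p - 1 - size)
                      / math.factorial(p))
            for subset in itertools.combinations(others, size):
                cols = list(subset)
                gain = r_squared(X[:, cols + [j]], y) - r_squared(X[:, cols], y)
                shares[j] += weight * gain
    return shares


# Synthetic, purely illustrative data: two correlated predictors and a response.
rng = np.random.default_rng(0)
n = 500
t = np.arange(n)
temp = 10 * np.sin(2 * np.pi * t / 200) + rng.normal(0, 1, n)
rh = 60 - 0.8 * temp + rng.normal(0, 2, n)  # correlated with temp
pm = 12 + 0.3 * temp + 0.1 * rh + rng.normal(0, 1, n)


def split(x, window=15, iterations=3):
    """Return (baseline, short_term); window and iterations are placeholders."""
    baseline = kz_filter(x, window, iterations)
    return baseline, x - baseline


pm_base, pm_short = split(pm)
temp_base, temp_short = split(temp)
rh_base, rh_short = split(rh)

mask = ~np.isnan(pm_base)  # drop edge points the filter could not cover
X_short = np.column_stack([temp_short[mask], rh_short[mask]])
y_short = pm_short[mask]
shares = lmg(X_short, y_short)
print("LMG shares (temp, RH):", shares, "total R^2:", shares.sum())
```

The LMG shares sum to the R² of the full model, which is what makes them usable as a decomposition when predictors are correlated. The same call applied to the baseline arrays gives the corresponding attribution for the baseline component. Because the subset enumeration grows exponentially with the number of predictors, this direct form is only practical for a handful of predictors, such as the four meteorological variables considered here.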
Keywords: Low-cost sensors, Air Quality, PM 2.5, KZ filter, LMG
Received: 18 Jan 2024; Accepted: 06 Nov 2024.
Copyright: © 2024 Kumar, Sur, Senarathna, Gurajala, Dhaniyala and Mondal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Vijay Kumar, Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York City, NY 10032, United States
Sumona Mondal, Department of Mathematics, Clarkson University, Potsdam, United States
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.