ORIGINAL RESEARCH article

Front. Water
Sec. Water and Artificial Intelligence
Volume 6 - 2024 | doi: 10.3389/frwa.2024.1509945

Integrating Groundwater Pumping Data with Regression-Enhanced Random Forest Models to Improve Groundwater Monitoring and Management in a Coastal Region

Provisionally accepted
  • 1 Civil and Environmental Engineering, Princeton University, Princeton, United States
  • 2 International Ground Water Modeling Center, High Meadows Environmental Institute, Princeton University, Princeton, New Jersey, United States

The final, formatted version of the article will be published soon.

    Groundwater is essential for sustaining human life and ecosystems as a freshwater resource. However, intensive groundwater pumping (GWP) can deplete groundwater levels and exacerbate issues such as sea-level rise and saltwater intrusion in coastal areas, further affecting the availability and accessibility of groundwater. To address these challenges, accurate monitoring and modeling of water table depth (WTD), a key indicator of groundwater storage, is useful for sustainable groundwater management. This work studies the implementation of a regression-enhanced random forest (RERF) model to predict WTD anomalies with pumping as a major input for New Jersey, a coastal state in the United States. The predicted WTD anomalies align well with observations, with a test Nash-Sutcliffe Efficiency (NSE) of 0.49, a test Pearson correlation coefficient (r) of 0.72, and a test root mean square error (RMSE) of 1.61 m. Based on permutation feature importance, the most important input variables in the model for predicting WTD anomalies were long-term mean WTD, precipitation minus evapotranspiration (PME), and GWP. Using the trained RERF model, we generated 90 m spatial resolution WTD anomaly maps for New Jersey for January and July 2015, showing areas of increasing and decreasing WTD. We then inverted the RERF model to predict GWP using WTD anomalies, land cover, and a cross metric as additional inputs. This approach was less effective, yielding a test NSE of 0.40, a test r of 0.65, and a test RMSE of 15.44 million liters/month. Permutation feature importance revealed the most important input variables to be PME, long-term mean WTD, and topographic slope. Again, we generated 90 m GWP maps for New Jersey for January and July 2015, offering finer resolution than the previous maps at the subwatershed level. Focusing on New Jersey, the study provides insights into the relationship between WTD anomalies and their critical input variables, including GWP, in coastal areas. Moreover, significant gaps in WTD observations persist in New Jersey, highlighting the need for comprehensive monitoring efforts. Thus, by employing machine learning (ML) techniques and leveraging available data, this study contributes to improving groundwater management practices and informing future decision-making.
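
    For readers unfamiliar with the three skill scores quoted in the abstract (NSE, Pearson r, and RMSE), the following is a minimal sketch of their standard textbook definitions. It is not the authors' code. The function names and the toy values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus the ratio of the squared error
    to the variance of the observations about their mean.
    1 is a perfect fit; 0 means no better than predicting the mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


def rmse(obs, sim):
    """Root mean square error, in the same units as the inputs."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


# Toy example (hypothetical values): a simulation with a constant 1 m bias.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
scores = (nse(observed, simulated),
          pearson_r(observed, simulated),
          rmse(observed, simulated))
```

    In the toy example the constant bias leaves the correlation perfect while lowering NSE, which illustrates why a correlation such as r = 0.72 can sit alongside a more modest NSE such as 0.49: r measures only linear association, whereas NSE also penalizes bias and amplitude errors.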

    Keywords: groundwater level, water table depth, groundwater pumping, machine learning, regression-enhanced random forest model, groundwater monitoring and management, New Jersey

    Received: 11 Oct 2024; Accepted: 04 Dec 2024.

    Copyright: © 2024 Kim, Ma and Maxwell. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Jamie Kim, Civil and Environmental Engineering, Princeton University, Princeton, United States
    Yueling Ma, International Ground Water Modeling Center, High Meadows Environmental Institute, Princeton University, Princeton, 08544, New Jersey, United States
