AUTHOR=Zwart Jacob A. , Diaz Jeremy , Hamshaw Scott , Oliver Samantha , Ross Jesse C. , Sleckman Margaux , Appling Alison P. , Corson-Dosch Hayley , Jia Xiaowei , Read Jordan , Sadler Jeffrey , Thompson Theodore , Watkins David , White Elaheh TITLE=Evaluating deep learning architecture and data assimilation for improving water temperature forecasts at unmonitored locations JOURNAL=Frontiers in Water VOLUME=5 YEAR=2023 URL=https://www.frontiersin.org/journals/water/articles/10.3389/frwa.2023.1184992 DOI=10.3389/frwa.2023.1184992 ISSN=2624-9375 ABSTRACT=
Deep learning (DL) models are increasingly used to forecast water quality variables for use in decision making. Ingesting recent observations of the forecasted variable has been shown to greatly increase model performance at monitored locations; however, observations are not collected at all locations, and methods by which DL models can optimally ingest recent observations from other sites to inform focal sites are not yet well developed. In this paper, we evaluate two DL model structures, a long short-term memory neural network (LSTM) and a recurrent graph convolutional neural network (RGCN), both with and without data assimilation, for forecasting daily maximum stream temperature 7 days into the future at monitored and unmonitored locations in a 70-segment stream network. All of our DL models performed well when forecasting stream temperature: the root mean squared error (RMSE) across all models ranged from 2.03 to 2.11°C for 1-day lead times in the validation period, with substantially better performance at gaged locations (RMSE = 1.45–1.52°C) than at ungaged locations (RMSE = 3.18–3.27°C). Forecast uncertainty characterization was near-perfect for gaged locations, but all DL models were overconfident (i.e., uncertainty bounds too narrow) for ungaged locations. Our results show that the RGCN with data assimilation performed best for ungaged locations, especially at higher temperatures (>18°C), which is important for management decisions in our study location. This indicates that the networked model structure and data assimilation techniques may help borrow information from nearby monitored sites to improve forecasts at unmonitored locations. Results from this study can help guide DL modeling decisions when forecasting other important environmental variables.
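
The two evaluation quantities named in the abstract, RMSE and the adequacy of forecast uncertainty bounds, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy temperatures, and the fixed ±1.5°C interval half-width are all hypothetical, chosen only to show how errors and interval coverage might be compared between gaged and ungaged segments.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between paired observations and forecasts."""
    squared_errors = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside their forecast interval."""
    hits = [lo <= o <= hi for o, lo, hi in zip(observed, lower, upper)]
    return sum(hits) / len(hits)

# Hypothetical toy data (degrees C): (is_gaged, observed, forecast mean).
records = [
    (True, 20.0, 21.0),
    (True, 18.5, 18.0),
    (False, 22.0, 19.0),
    (False, 15.0, 17.0),
]
HALF_WIDTH = 1.5  # assumed half-width of the forecast interval, for illustration

def summarize(gaged_flag):
    """Return (RMSE, interval coverage) for gaged or ungaged records."""
    subset = [(o, p) for g, o, p in records if g == gaged_flag]
    obs = [o for o, _ in subset]
    pred = [p for _, p in subset]
    lower = [p - HALF_WIDTH for p in pred]
    upper = [p + HALF_WIDTH for p in pred]
    return rmse(obs, pred), interval_coverage(obs, lower, upper)

rmse_gaged, coverage_gaged = summarize(True)
rmse_ungaged, coverage_ungaged = summarize(False)
print(f"gaged:   RMSE={rmse_gaged:.2f}, coverage={coverage_gaged:.2f}")
print(f"ungaged: RMSE={rmse_ungaged:.2f}, coverage={coverage_ungaged:.2f}")
```

In this toy case the ungaged subset has both a larger RMSE and a coverage below the interval's intended level. Coverage below the nominal level is the pattern the abstract describes as overconfident, meaning the uncertainty bounds are too narrow.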