AUTHOR=Tao Shuhao, Du Ling, Li Jiahao TITLE=Data mining-based machine learning methods for improving hydrological data: a case study of salinity field in the Western Arctic Ocean JOURNAL=Frontiers in Marine Science VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2024.1490548 DOI=10.3389/fmars.2024.1490548 ISSN=2296-7745 ABSTRACT=

The Beaufort Gyre is the largest freshwater reservoir in the Arctic Ocean. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, and data from various sources, particularly observational and reanalysis data, must be used to the greatest extent possible. Over the past two decades, many intensive field observations and ship surveys have been conducted in the western Arctic Ocean, yielding a large amount of CTD (conductivity, temperature, and depth) data. Multiple machine learning methods were assessed and merged to reconstruct an annual salinity product for the western Arctic Ocean over the period 2003-2022. The data mining-based machine learning methods reconstructed the salinity product from input variables determined by physical processes, such as sea level pressure, bathymetry, sea ice concentration, and sea ice drift. Sea surface salinity is more sensitive than deep water to variations in the atmosphere, sea ice, and ocean, yet its root-mean-square error was effectively controlled during machine learning. Compared with observational data, the mean absolute errors in freshwater content and halocline depth within the Beaufort Gyre region for the 2003-2022 salinity product are 0.98 m and 1.31 m, respectively. The salinity product reliably characterizes freshwater content in the Beaufort Gyre and its variations at halocline depth. In polar regions where observational data are lacking, data mining-based machine learning methods can be built to generate reliable data products that compensate for this gap. Furthermore, the potential of this approach, which evaluates and integrates the results of multiple machine learning methods, extends beyond the salinity field to hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields.
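The freshwater content and mean absolute error quoted above can be illustrated with a minimal sketch. It is not the authors' code. It assumes the commonly used definition of freshwater content as the depth integral of (S_ref - S) / S_ref over the layer fresher than a reference salinity, and it assumes S_ref = 34.8, a value often used for the Arctic but not stated in the abstract. The function names and the sample profile are hypothetical.

```python
import numpy as np

S_REF = 34.8  # assumed reference salinity; not given in the abstract


def freshwater_content(depth_m, salinity, s_ref=S_REF):
    """Freshwater content in metres for one salinity profile.

    Integrates (s_ref - S) / s_ref over depth with the trapezoidal rule.
    Samples saltier than s_ref are clipped to zero, so only water fresher
    than the reference contributes to the integral.
    """
    z = np.asarray(depth_m, dtype=float)
    s = np.asarray(salinity, dtype=float)
    frac = np.clip(s_ref - s, 0.0, None) / s_ref
    # Trapezoidal rule written out explicitly: mean of neighbours times spacing.
    return float(np.sum(0.5 * (frac[1:] + frac[:-1]) * np.diff(z)))


def mean_absolute_error(predicted, observed):
    """Mean absolute difference between two equally shaped arrays."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(p - o)))


# Hypothetical two-point profile: fresh at the surface, S_ref at 100 m.
fwc = freshwater_content([0.0, 100.0], [27.84, 34.8])
mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 1.0])
```

In this sketch the same `mean_absolute_error` helper would be applied to freshwater content values computed from the reconstructed product and from the CTD observations. How the authors matched the product to the observations in space and time is not described in the abstract.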