AUTHOR=Romera-Castillo Cristina , Heras Jónathan , Álvarez Marta , Álvarez-Salgado X. Antón , Mata Gadea , Sáenz-de-Cabezón Eduardo TITLE=Application of multi-regression machine learning algorithms to solve ocean water mass mixing in the Atlantic Ocean JOURNAL=Frontiers in Marine Science VOLUME=9 YEAR=2022 URL=https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2022.904492 DOI=10.3389/fmars.2022.904492 ISSN=2296-7745 ABSTRACT=

The distribution of any non-conservative variable in the deep open ocean results from the circulation and mixing of water masses (WMs) of contrasting origin and from the initial preformed composition, modified by ongoing, simultaneous biological and/or geochemical processes. Estimating the contribution of the WMs composing a sample is useful to trace the distribution of each water mass and to quantitatively separate the physical (mixing) and biogeochemical components of the variability of any non-conservative variable (e.g., dissolved organic carbon, prokaryote biomass) in the ocean. Besides potential temperature and salinity, additional semi-conservative and non-conservative variables have been used to solve the mixing of more than three water masses using Optimum Multi-Parameter (OMP) approaches. Successful application of an OMP analysis requires knowledge of the characteristics of the water masses in their source regions as well as their circulation and mixing patterns. Here, we propose the application of multi-regression machine learning models to solve ocean water mass mixing. The models tested were trained using the solutions from OMP analyses previously applied to samples from cruises in the Atlantic Ocean. The Extremely Randomized Trees algorithm yielded the highest score (R² = 0.9931; MSE = 0.000227). Our model solves the mixing of water masses in the Atlantic Ocean using potential temperature, salinity, latitude, longitude and depth. Therefore, basic hydrographic data collected during typical research cruises or by autonomous systems can be used as input variables and provide results in real time. The model can be fed with new solutions from compatible OMP analyses as well as with new water masses not previously considered in it. Our tool will provide knowledge on water mass composition and distribution to a broader community of marine scientists not specialized in OMP analysis and/or in the oceanography of the studied area. This will allow a quantitative analysis of the effect of water mass mixing on the variables or processes under study.
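
To make the underlying mixing problem concrete, the sketch below solves the simplest case mentioned above: with only potential temperature and salinity as tracers, plus mass conservation, the fractions of at most three water masses can be determined exactly from a 3x3 linear system. This is not the authors' OMP implementation or their regression model; the function names and the end-member values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def mixing_fractions(
    theta: float,
    salinity: float,
    end_members: Sequence[Tuple[float, float]],
) -> Tuple[float, float, float]:
    """Fractions x1, x2, x3 of three water masses in one sample.

    Solves, by Cramer's rule:
        x1 + x2 + x3                = 1         (mass conservation)
        x1*th1 + x2*th2 + x3*th3    = theta     (potential temperature)
        x1*S1  + x2*S2  + x3*S3     = salinity  (salinity)

    end_members holds three (potential temperature, salinity) pairs.
    No non-negativity constraint is applied here; a full OMP analysis
    would add one and would weight the equations.
    """
    (t1, s1), (t2, s2), (t3, s3) = end_members
    a: List[List[float]] = [[1.0, 1.0, 1.0], [t1, t2, t3], [s1, s2, s3]]
    rhs = [1.0, theta, salinity]

    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("end-members are collinear in theta-S space")

    fractions = []
    for col in range(3):
        # Replace one column with the right-hand side (Cramer's rule).
        ak = [row[:] for row in a]
        for r in range(3):
            ak[r][col] = rhs[r]
        fractions.append(det3(ak) / d)
    return fractions[0], fractions[1], fractions[2]


if __name__ == "__main__":
    # Hypothetical end-members (potential temperature in deg C, salinity);
    # illustrative values only, not taken from the paper.
    members = [(10.0, 35.5), (2.0, 34.9), (4.0, 34.5)]
    print(mixing_fractions(6.4, 35.12, members))
```

With more than three water masses this system becomes underdetermined, which is why OMP approaches bring in additional tracers, and why the proposed models instead learn the fractions from previously computed OMP solutions using position and depth as extra inputs.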