Applications of deep learning in physical oceanography: a comprehensive review

Zhao, Qianlong; Peng, Shiqiu; Wang, Jingzhen; Li, Shaotian; Hou, Zhengyu; Zhong, Guoqiang

doi:10.3389/fmars.2024.1396322

REVIEW article

Front. Mar. Sci., 15 July 2024

Sec. Ocean Observation

Volume 11 - 2024 | https://doi.org/10.3389/fmars.2024.1396322

This article is part of the Research Topic Deep Learning for Marine Science, volume II

Qianlong Zhao¹,²

Shiqiu Peng²,³

Jingzhen Wang³

Shaotian Li²*

Zhengyu Hou⁴

Guoqiang Zhong¹*

¹College of Computer Science and Technology, Ocean University of China, Qingdao, China
²State Key Laboratory of Tropical Oceanography, Key Laboratory of Science and Technology on Operational Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, China
³Guangxi Key Laboratory of Marine Environmental Change and Disaster in the Beibu Gulf, Beibu Gulf University, Qinzhou, China
⁴School of Ocean Engineering and Technology, Sun Yat-sen University, Zhuhai, China

Deep learning, a data-driven technology, has attracted widespread attention across disciplines owing to rapid advances in Internet of Things (IoT) big data, machine learning algorithms and computational hardware in recent years. In existing applications in various fields, it has achieved results comparable to, or even more accurate than, those of traditional methods, and in a more flexible manner. In physical oceanography, an important branch of oceanography, the abundance of ocean surface data and the high dynamic complexity pave the way for extensive application of deep learning. Researchers have already done a great deal of work to innovate on traditional approaches in ocean circulation, ocean dynamics, ocean climate, ocean remote sensing and ocean geophysics, leading oceanographic studies into the “AI ocean era”. In this study, we categorize the numerous research topics in physical oceanography into four aspects: surface elements, subsurface elements, typical ocean phenomena, and typical weather and climate phenomena. We review the cutting-edge applications of deep learning in physical oceanography over the past three years to provide comprehensive insights into its development. From the perspective of three application scenarios, namely spatial data, temporal data and data generation, we introduce three corresponding types of deep learning model, convolutional neural networks (CNNs), recurrent neural networks (RNNs) and generative adversarial networks (GANs), together with their principal application tasks. Furthermore, this study discusses the current bottlenecks and future innovative prospects of deep learning in oceanography. By summarizing and analyzing the existing research, we aim to explore the potential and challenges of deep learning in physical oceanography, providing reference and inspiration for researchers in future oceanographic studies.

1 Introduction

Marine science is holistic and involves the comprehensive study of the ocean, as well as complex interactions of various natural processes related to the ocean. The purpose of marine science research is to reveal the structure and function of the marine system through observation, experiment, comparison, analysis, synthesis, induction, deduction and scientific abstraction, to understand the natural laws of various phenomena and processes in the ocean and further use these laws to serve humans. Physical oceanography, as an important branch of ocean science, is dedicated to studying the physical processes and dynamic characteristics of the ocean, as well as its interactions with factors such as climate and environment. It encompasses a wide range of research areas, including ocean circulation, ocean dynamics, ocean climate, ocean remote sensing and ocean geophysics. Research on physical oceanography is closely related to human survival, life and economic activities. Traditional physical oceanographic models, while capable of simulating and predicting the behavior of ocean systems to some extent, are constrained by the limitations of numerical methods and the simplification of physical-process parameterization, making it difficult to accurately capture complex ocean dynamics and climate change mechanisms. The complexity of the ocean environment and regional differences also lead to a huge amount of computation, slow processing times and poor generalizability of traditional methods (Sonnewald et al., 2021).

With the rapid development of Internet of Things (IoT) devices, such as underwater sensors and satellite communication systems, and the continuous advancement of ocean observational technologies, IoT devices now play a crucial role in collecting and transmitting oceanographic data. This integration allows real-time monitoring of parameters such as temperature, salinity and currents, and has led to exponential growth of oceanographic data to petabyte sizes (Lou et al., 2023). The three-dimensional, diversified, multiscale and spatiotemporal characteristics of ocean data signify the emergence of ocean big data. This paradigm shift enables marine science research to increasingly integrate and analyze vast amounts of heterogeneous data. Consequently, researchers can uncover patterns and insights that were previously hidden, moving beyond the traditional reliance on theoretical physical models and simulations to a more data-centric approach that leverages advanced analytics, machine learning and artificial intelligence. This shift promotes marine science research to the data traction stage (Qian and Chen, 2018) and brings challenges to traditional research methods and approaches. For example, in physical oceanography, the most typical feature of ocean big data is its spatiotemporal character: each category of ocean big data has a time-series dimension, such as satellite remote sensing with regular repeated sampling, which supports applications that rely on long time series. The interdependence of the data, the diversity of influencing factors on each time scale, and regional and temporal biases in the data make it difficult for traditional numerical modelling methods to fit the relevant development patterns. In addition, ocean data differ among geographical locations and are associated with complex factors, such as the marine environment, the continental environment, and even the geography of neighboring regions. This complexity and uncertainty mean that results obtained from these data are not universal. Therefore, oceanographic research needs a methodology that can adequately integrate a large amount of feature data, conform to natural development patterns as closely as possible, and be as universal as possible.

Data-driven deep learning techniques use data to train models, optimize parameters, learn patterns and relationships from historical observations, and explore intrinsic connections in feature data through nonlinear mappings, for purposes such as fitting patterns as closely as possible or classification. The development of GPU parallel computing technology has made the wide application of deep learning possible. In recent years, many classical deep learning architectures have been proposed in succession, such as the CNN, RNN, long short-term memory (LSTM) network and GAN, and deep learning is now widely used in various research fields. In oceanography, for example, it has been applied to the prediction of variables such as sea surface temperature (SST), ocean noise classification, ocean wave height determination, and typhoon formation and path prediction. Figure 1 shows the trend over time in the number of articles on the Web of Science with the keywords ‘ocean’ and ‘deep learning’.

Figure 1 Trends in the number of papers published on the application of deep learning in oceanography retrieved from the Web of Science each year since 2012.

Deep learning as a data-driven technique plays a key role in helping marine science researchers accelerate their understanding of complex, interactive, and multiscale processes in the ocean environment (Yu and Ma, 2021). The development of aerospace remote sensing technologies and underwater acquisition equipment has enriched the three-dimensional, diverse, and multiscale nature of ocean data, thus propelling ocean science forward significantly. The rapid development of IoT devices and observational technologies has laid the data foundation for modern ocean science. On this basis, the studies reviewed here suggest that the development of deep learning-related artificial intelligence (AI) technologies will play an increasingly important role in updating ocean science research methods and enhancing ocean data analysis within the next 10–20 years.

This paper is arranged as follows: the first part briefly introduces the shortcomings of traditional numerical methods and statistical methods used in marine science, as well as the background and necessity of deep learning applications. The development history and main categories of deep learning are presented in the second part, followed by reviews on the applications of deep learning in various fields of physical oceanography. Finally, several promising directions for future development and innovation are proposed in the fourth part.

2 Deep learning

The main knowledge foundation of deep learning is the neural network (Lippmann, 2023). The idea of a neural network involves using multiple neuron nodes to combine multi-dimensional feature data and continuously adjusting the mathematical parameters. It maps multidimensional input data into optimal nonlinear outputs to represent specific attribute classes or features for classification, regression, etc. A fully connected (FC) neural network is shown in Figure 2. For more details on the concept of deep learning, please refer to the paper Deep learning (LeCun et al., 2015) or Kai-Fu Lee’s explanation in the book Artificial Intelligence (Li and Wang, 2018).

Figure 2 Sketch of the fully connected network.
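The mapping performed by a fully connected network can be illustrated with a minimal NumPy sketch. The layer sizes, the ReLU activation and the random weights below are assumptions chosen for illustration only; they are not taken from any model discussed in this review.

```python
import numpy as np

def relu(x):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(0.0, x)

def fc_forward(x, params):
    """Forward pass through a stack of fully connected layers.

    params is a list of (W, b) pairs; ReLU is applied after every
    layer except the last, which is left linear (e.g. for regression).
    """
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = relu(h)
    return h

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 3]  # hypothetical: 4 input features, 8 hidden units, 3 outputs
params = [
    (rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
    for m, n in zip(layer_sizes[:-1], layer_sizes[1:])
]

x = rng.normal(size=(5, 4))   # a batch of 5 samples with 4 features each
out = fc_forward(x, params)   # shape (5, 3)
```

Training would then adjust every `W` and `b` to reduce a loss function; that step is omitted here.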

2.1 Developmental history

Up to the present, neural networks have experienced three waves and one explosion, as shown in Figure 3. The earliest work can be traced back to 1943 when psychologist McCulloch and mathematical logician Pitts proposed the first mathematical model of neurons, the MP model, which roughly simulated the working principle of human neurons. In 1958, Rosenblatt added a learning function to the MP model and proposed a single-layer perceptron model, triggering the first wave of neural network research.

Figure 3 History of neural network development. Large circles are used to denote important time points, and small circles are used to denote the presentation of more typical scientific results in each time stage.

In 1986, Rumelhart et al. published an article in Nature proposing the back propagation (BP) network, a multilayer feedforward network trained by the error BP algorithm (Rumelhart et al., 1986). It solved the nonlinear classification and learning problems that the original single-layer perceptron could not handle. This strongly countered the view of Professor Minsky and others, who had effectively passed a ‘death sentence’ on neural networks by arguing that they could handle only linear classification problems and could not solve even the simplest exclusive OR (XOR) problem. It led to a second wave of neural network research. Subsequently, the Boltzmann machine, CNN, RNN and other neural network models were developed.

In 2006, Professor Hinton and his team first introduced the concept of deep learning in Science (Hinton and Salakhutdinov, 2006). Thereafter, ReLU, Dropout and other deep network optimization strategies were proposed and applied. Together with the development of GPU parallel computing technology, they helped to alleviate the local optima, overfitting and vanishing gradients (gradient diffusion) that affect BP networks as the number of layers increases. This wave of neural network research continues to the present day.

In 2012, Hinton led a team that won the ImageNet Large Scale Visual Recognition Challenge with the deep CNN AlexNet. In 2014, Facebook’s DeepFace project, based on deep learning technology, achieved a face recognition accuracy of over 97%, almost the same as that of humans. In 2016, Google’s deep learning-based AlphaGo defeated the top international Go master Lee Sedol 4-1. Together with later results such as AlphaFold (an AI algorithm for predicting protein structure from amino acid sequences, 2020), deep learning-related algorithms have achieved remarkable results in many fields, such as healthcare, finance, art and self-driving.

As the hottest research area in artificial intelligence, deep learning algorithms are gradually replacing traditional statistical machine learning methods in many fields, including oceanography. For example, convolutional neural network-based algorithms are applied to feature extraction tasks such as sea ice identification and classification and satellite image feature extraction, while recurrent neural network-based algorithms are widely used in marine environmental forecasting, data inversion and reconstruction, and other fields.

2.2 Models of deep learning

The main steps of deep learning, shown in Figure 4, include understanding the research problem, preprocessing data, selecting and designing algorithmic models, training and optimizing models, and mapping the output results. Datasets may suffer from insufficient data volume, poor label classification, low data quality, data imbalance (in both category and format), or a lack of validation and test sets. Data cleaning, labelling, normalization, denoising and dimensionality reduction are therefore often required during data preprocessing. Algorithmic models generally include several important components, such as layers, loss functions, activation functions and optimizers. Following the main application tasks and data types in oceanography, namely spatial data, temporal data and data generation, several commonly used frontier models (CNNs, RNNs and GANs) are described below.

Figure 4 Deep learning step flow.
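As a rough illustration of the preprocessing step, the sketch below fills gaps in a synthetic SST-like series, splits it chronologically into training, validation and test sets, and standardizes all three using statistics from the training set only. The series, the split fractions and the function names are hypothetical, and linear interpolation stands in for whatever cleaning method a real study would use.

```python
import numpy as np

def fill_gaps_linear(series):
    """Data cleaning: fill NaN gaps by linear interpolation in time."""
    idx = np.arange(series.size)
    valid = ~np.isnan(series)
    return np.interp(idx, idx[valid], series[valid])

def chronological_split(series, train_frac=0.7, val_frac=0.15):
    """Split without shuffling so that validation/test data lie in the future."""
    n_train = int(round(len(series) * train_frac))
    n_val = int(round(len(series) * val_frac))
    return (series[:n_train],
            series[n_train:n_train + n_val],
            series[n_train + n_val:])

def standardize(x, mean, std):
    """Normalization: zero mean and unit variance w.r.t. the given statistics."""
    return (x - mean) / std

# Synthetic daily "SST" with a seasonal-like cycle, noise and two missing values.
rng = np.random.default_rng(0)
sst = 20.0 + 5.0 * np.sin(np.linspace(0.0, 20.0, 200)) + rng.normal(0.0, 0.3, 200)
sst[[10, 50]] = np.nan

sst_clean = fill_gaps_linear(sst)
train, val, test = chronological_split(sst_clean)

# Statistics come from the training set only, to avoid leaking future information.
mean, std = train.mean(), train.std()
train_n = standardize(train, mean, std)
val_n = standardize(val, mean, std)
test_n = standardize(test, mean, std)
```

Splitting by time rather than at random is a design choice that suits forecasting tasks, where the model must be evaluated on data later than anything it was trained on.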

CNNs are multilayer neural network algorithms mainly applied to image data analysis and processing in the image recognition field. They comprise a class of networks with different structures, as shown in Table 1. In a convolutional layer, a local region of the same size as the convolutional kernel (i.e., the receptive field) is sampled in a sliding fashion to produce the output of the layer (Figure 5). This reduces the number of parameters in the network, decreases the consumption of computational resources and helps to control overfitting. The process can be repeated until the image is spatially reduced to a sufficiently small size, at which point the network transitions to a fully connected layer. The final fully connected layer yields the output, such as a classification. Other convolutional architectures were subsequently developed, such as AlexNet, ZF Net, GoogLeNet, VGGNet, U-Net and ResNet. The most commonly used CNNs at present are U-Net and ResNet.

Table 1 Composition logic of different classes of CNNs.

Figure 5 CNN processing logic.
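The sliding-window sampling described above can be sketched in a few lines of NumPy. This is a deliberately simple, loop-based illustration with a hypothetical 6 × 6 input and 2 × 2 kernel; like most deep learning frameworks it computes a cross-correlation (the kernel is not flipped), and it uses no padding and a stride of 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Each output value is the weighted sum of one local region of the
    image that has the same size as the kernel.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

field = np.arange(36, dtype=float).reshape(6, 6)   # hypothetical 6 x 6 gridded field
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                   # simple diagonal-difference kernel

features = conv2d_valid(field, kernel)   # shape (5, 5)
pooled = max_pool2d(features)            # shape (2, 2)
```

The same four kernel weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected layer acting on the same input.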

RNNs process sequential data and are commonly used for text analysis or natural language processing. The three best-known types are the simple RNN, LSTM and the gated recurrent unit (GRU); LSTM is a variant of the RNN, and GRU is a variant of LSTM. Their network structures are presented in Figure 6. The RNN maintains forward propagation by continuously using the output of the previous node as part of the input of the next node in the sequence. To achieve long-term memory, the RNN model links the computation of the current hidden state with the previous n computations, which sharply increases the computational cost and training time and leads to vanishing and exploding gradients, which, in turn, make it difficult for traditional RNNs to handle long-term dependence in practice. LSTM tackles these problems through three control gates: the input, forget and output gates. The output of the previous node is selectively retained to ensure that important feature information (also called memory) is not lost even during long-term propagation. This idea resembles the later attention mechanism (Bengio, 2014). Subsequently, the Transformer model, built on self-attention and feedforward neural networks (FFNNs), was introduced to address the limited parallelization and the long-term dependency problems that LSTM still suffers from on particularly long sequences. Building on LSTM, GRU substantially reduces the number of operations by combining the input and forget gates into a single update gate, which saves considerable time when training on large datasets.

Figure 6 Logic diagram of three main classes of recurrent class neural networks.
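To make the gating idea concrete, the following sketch implements one time step of a standard LSTM cell and of a standard GRU cell in NumPy and runs both over a short random sequence. The weight layout (gate blocks stacked along the first axis), the sizes and the random initial values are assumptions made for this illustration; library implementations may order the gates differently.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,).

    Blocks are stacked in the order: input gate, forget gate,
    output gate, candidate memory.
    """
    H = h_prev.shape[0]
    a = W @ x + U @ h_prev + b
    i = sigmoid(a[0:H])            # how much new information to write
    f = sigmoid(a[H:2 * H])        # how much old memory to keep
    o = sigmoid(a[2 * H:3 * H])    # how much memory to expose
    g = np.tanh(a[3 * H:4 * H])    # candidate memory content
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def gru_step(x, h_prev, W, U, b):
    """One GRU step. W: (3H, D), U: (3H, H), b: (3H,).

    Blocks are stacked in the order: update gate, reset gate, candidate state.
    """
    H = h_prev.shape[0]
    a = W @ x + b
    z = sigmoid(a[0:H] + U[0:H] @ h_prev)            # update gate
    r = sigmoid(a[H:2 * H] + U[H:2 * H] @ h_prev)    # reset gate
    n = np.tanh(a[2 * H:] + U[2 * H:] @ (r * h_prev))
    return (1.0 - z) * n + z * h_prev

rng = np.random.default_rng(0)
D, H, T = 3, 4, 10                 # hypothetical input size, hidden size, sequence length
seq = rng.normal(size=(T, D))

W_l, U_l, b_l = rng.normal(0, 0.1, (4 * H, D)), rng.normal(0, 0.1, (4 * H, H)), np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in seq:
    h, c = lstm_step(x, h, c, W_l, U_l, b_l)
h_lstm = h

W_g, U_g, b_g = rng.normal(0, 0.1, (3 * H, D)), rng.normal(0, 0.1, (3 * H, H)), np.zeros(3 * H)
h = np.zeros(H)
for x in seq:
    h = gru_step(x, h, W_g, U_g, b_g)
h_gru = h
```

The GRU needs three weight blocks per step where the LSTM needs four, which is the source of the saving in operations mentioned above.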

GANs, which are commonly used for data generation or unsupervised learning, can be applied to the super-resolution reconstruction of oceanographic data or balancing sample data. A GAN contains two important components: the generator and discriminator (Figure 7). Metaphorically, the generator is a criminal making counterfeit money, while the discriminator is a police officer. The generator aims to produce counterfeit money to trick the discriminator, while the discriminator strives not to be tricked. A GAN aims to optimize and obtain a generative model to provide results close to real data. Based on the GAN networks, several classical algorithmic variants are available, such as deep convolutional GANs (DC-GANs), Wasserstein GANs (WGANs) and conditional GANs (CGANs).

Figure 7 GAN network logic.
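The competition between the two components can be expressed through their loss functions. The sketch below computes the standard binary cross-entropy discriminator loss and the commonly used non-saturating generator loss from discriminator outputs; the probability values are made up for illustration and no networks are trained here.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Cross-entropy loss of the discriminator.

    d_real: discriminator outputs D(x) on real samples (probabilities).
    d_fake: discriminator outputs D(G(z)) on generated samples.
    The loss is low when real samples score near 1 and fakes near 0.
    """
    return -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -np.mean(np.log(d_fake + eps))

# Hypothetical discriminator outputs in two situations.
d_real = np.array([0.90, 0.95])
d_fake_caught = np.array([0.10, 0.05])   # the "police" spot the counterfeits
d_fake_fooled = np.array([0.90, 0.80])   # the counterfeits pass as real

loss_d_caught = discriminator_loss(d_real, d_fake_caught)
loss_d_fooled = discriminator_loss(d_real, d_fake_fooled)
loss_g_caught = generator_loss(d_fake_caught)
loss_g_fooled = generator_loss(d_fake_fooled)

# When the discriminator cannot tell real from fake (all outputs 0.5),
# its loss equals 2 * log(2).
loss_d_balanced = discriminator_loss(np.full(4, 0.5), np.full(4, 0.5))
```

In training, the two losses are minimized alternately, so an improvement for one player raises the loss of the other.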

In addition to these three main types of networks, another model is worth noting: the Transformer. It is based on the self-attention mechanism (Vaswani et al., 2017), which captures the relationships between elements of sequential data more efficiently than traditional RNNs and LSTM. The model’s design and architecture have become the cornerstone of many subsequent models, including the well-known ChatGPT and BERT. For marine researchers, the model has great application value for predicting time-series data such as SST and sea surface salinity (SSS). The prediction and inversion of 2-D image-like data can also borrow the idea of the Vision Transformer (ViT) (Wang et al., 2021), which divides the image into multiple small patches and adds positional encoding so that they can be treated as sequence data. However, the model requires a large amount of data when applied in the marine field, and in many cases a clear improvement appears only when the dataset is sufficiently large.
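A minimal sketch of these two ideas, splitting a 2-D field into patches and applying single-head scaled dot-product self-attention to the resulting sequence, is given below. The field size, patch size, projection width and random weights are assumptions for illustration, and positional encoding, multiple heads and the feedforward layers of a full Transformer are omitted.

```python
import numpy as np

def image_to_patches(field, patch):
    """Split a 2-D field into non-overlapping patch x patch blocks,
    each flattened into one token (row-major order of blocks)."""
    H, W = field.shape
    if H % patch or W % patch:
        raise ValueError("field size must be a multiple of the patch size")
    blocks = field.reshape(H // patch, patch, W // patch, patch).swapaxes(1, 2)
    return blocks.reshape(-1, patch * patch)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the token sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
field = rng.normal(size=(8, 8))               # hypothetical 8 x 8 gridded variable
tokens = image_to_patches(field, patch=4)     # 4 tokens of length 16

d_model, d_k = tokens.shape[1], 8
Wq, Wk, Wv = (rng.normal(0.0, 0.1, (d_model, d_k)) for _ in range(3))
attended, weights = self_attention(tokens, Wq, Wk, Wv)
```

Every token attends to every other token in one matrix product, so the whole sequence is processed in parallel rather than step by step as in an RNN.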

3 Applications of deep learning in oceanography

Physical oceanography is the basic subject of oceanic sciences, which works on the spatial and temporal changes of force fields, thermohaline structures, and related mechanical motions in the ocean using the viewpoint and methods of physics, as well as the exchanges and transformation of oceanic substances, momentum and energy. As the first ocean-related sub-discipline developed in modern times, it covers not only extensive research contents but also widespread applications of deep learning in modern marine science. The following sections introduce the application of deep learning in surface elements, subsurface elements and typical ocean, weather and climate phenomena.

3.1 Sea surface elements

We systematically summarize each element in Table 2 and present the details in the following subsections.

Table 2 A summary of the main application tasks, DL models used and current challenges of the sea surface elements.

3.1.1 Sea surface temperature

Sea surface temperature (SST) is currently one of the marine science topics to which deep learning is applied most often. It is mainly used to improve the quality of remotely sensed data, for example by correcting data errors and super-resolving the data, and to predict changes in SST in space and time. Detailed examples are described below.

The prerequisite for applying deep learning is good sample data. Remote sensing, one of the most important data sources in oceanography, plays a vital role in analyzing marine environmental variables such as SST and relies mainly on infrared radiation (IR) and microwave (MW) sensors. IR sensors have high resolution, but cloud cover leads to missing data; MW sensors avoid the cloud cover problem, but their resolution is low. Taking SST as an example, Aparna et al. (2018) used an artificial neural network (ANN) trained on daily SST spatial maps to predict the SST in missing regions, considering spatial and temporal variability. Liu et al. (2022) used a deep neural network to correct the SST retrieval residuals, taking the physical retrievals of the Microwave Integrated Retrieval System as input. To a certain extent, this solves problems such as the scan-angle dependence of the retrieval residuals of cross-track instruments in the low-resolution case. To address both missing data under clouds and low resolution in satellite remote sensing, Izumi et al. (2022) used the enhanced super-resolution GAN (ESRGAN) to perform super-resolution of SST data. Compared with other methods, ESRGAN achieved the best scores on learned perceptual image patch similarity and perceptual index, correctly generating the missing parts of the SST distribution in low-resolution data with very high perceptual quality. The method is suitable for tasks such as repairing data defects and super-resolution, and can also be applied to other physical variables.

The current mainstream idea for predicting SST is to fully consider its spatiotemporal properties. Zhang et al. (2017) divided the data into multiple small grids, applied LSTM to each, and integrated the predictions as the final output. CFCC-LSTM (combined FC-LSTM and convolution neural network) (Yang et al., 2018), the combined LSTM-AdaBoost method (Xiao et al., 2019) and the regional convolution long short-term memory (RC-LSTM) (Xu et al., 2020b) all consider spatiotemporal properties from different perspectives for short-term, medium-term and regional SST prediction. In addition, the multi-long short-term memory convolution neural network (M-LCNN) (Xu et al., 2020a) significantly improved accuracy and robustness at multiple scales and under large SST fluctuations. The temporal convolutional network (TCN) model proposed by Sun et al. (2022) achieved good results in predicting SST at large spatial scales and over the long term. Applying an improved loss function (ILF) to the LSTM greatly reduced the error and processing time, achieving 98.7% accuracy and a processing time of approximately 0.35 s (Usharani, [[NoYear]]). Beyond predicting SST alone, Zheng et al. (2020) used time series of SST charts to predict both SST and tropical instability waves.
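Grid-wise sequence models of this kind are usually trained on (input window, target window) pairs cut from each grid cell's time series. The sketch below shows that framing on a tiny synthetic cube; the cube, the window lengths and the function name are hypothetical, and the sketch prepares samples only, without reproducing any of the cited models.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Cut a 1-D series into overlapping supervised samples.

    Each sample pairs n_in consecutive past values (input) with the
    n_out values that immediately follow them (target).
    """
    X, y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        y.append(series[start + n_in:start + n_in + n_out])
    return np.array(X), np.array(y)

# Hypothetical SST cube with axes (time, lat, lon); every cell holds 0, 1, ..., 9.
T, ny, nx = 10, 2, 2
cube = np.arange(T, dtype=float)[:, None, None] + np.zeros((ny, nx))

# One sample set per grid cell, as a per-cell sequence model would use it.
samples = {}
for j in range(ny):
    for i in range(nx):
        samples[(j, i)] = make_windows(cube[:, j, i], n_in=3, n_out=2)

X00, y00 = samples[(0, 0)]   # X00: (6, 3) inputs, y00: (6, 2) targets
```

A sequence model such as an LSTM would then be fitted to each `(X, y)` pair, and the per-cell forecasts reassembled into a map.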

3.1.2 Sea surface salinity

Although satellite remote sensing provides sea surface salinity data over a large area, obtaining high-quality sea surface salinity products still faces a variety of difficulties. In recent years, deep learning has gradually been used to retrieve high-resolution, high-precision sea surface salinity products through the selection of different modes and networks.

Sea surface salinity (SSS) is an important variable for studying scientific issues such as ocean circulation, the global water cycle and climate change. The main remote sensing sources for monitoring SSS are the L-band microwave radiometers on Soil Moisture Active Passive (SMAP) and Soil Moisture and Ocean Salinity (SMOS), as well as NASA’s Moderate Resolution Imaging Spectroradiometer (MODIS). Microwave sensors in offshore regions are susceptible to uncertainties such as radio frequency interference and low SST, resulting in low accuracy. Rajabi-Kiasari and Hasanlou (2020) used support vector regression (SVR), ANN, random forest (RF) and gradient boosting machine (GBM) to model SSS in the Persian Gulf and assessed the ability of machine learning methods to predict SSS in a region with lower-accuracy data.

Jang et al. (2021) used three machine learning methods (RF, SVR and ANN) to improve the SMAP SSS data in five river-dominated sea areas around the globe, achieving a 28% reduction in root mean square error (RMSE) compared with the original SMAP SSS product. The methods also captured the spatial and temporal properties and the distinction between high- and low-salinity waters. Moreover, Jang et al. (2022) combined SMAP satellite data with ocean interior salinity data provided by HYCOM to obtain high-quality global daily SSS estimates using seven machine learning algorithms.

Microwave remote sensing is easily affected by problems such as radio frequency interference in coastal waters, which leads to low resolution, whereas optical remote sensing avoids this problem. Numerous optical remote sensing-based inversion methods for SSS have been proposed, and they differ mainly in the feature variables they select. Geiger et al. (2013) used normalized water-leaving radiance, SST and location information from MODIS-Aqua to account for more spatial linkages, whereas Chen and Hu (2017) and others used satellite reflectance data and SST data from MODIS and SeaWiFS. From a multimodal perspective, Xu (2016) used highly correlated variables, including total nitrogen, total phosphorus and temperature, as sensitive factors for the indirect inversion of salinity. To improve the spatial resolution of SSS products, Liu (2020) used a deep convolutional network (DCN) model to invert the SSS from high-resolution sea surface reflectance data, seawater temperature data and low-spatial-resolution SMOS salinity products. The RMSE of the DCN inversion model can be reduced to 0.02191 psu when ResNet and U-Net networks are used as feature enhancement modules. Wu et al. (2021) used remote sensing reflectance and SST to construct an SSS inversion model for the Gulf of Mexico with the RF method.

3.1.3 Sea surface currents

Ocean currents are an important physical ocean phenomenon that regulates global climate change, so their study and prediction are of great significance. Currently, deep learning is mainly applied to predict ocean surface currents.

In early ocean current prediction applications, Saha et al. (2016) indirectly predicted ocean current velocities by applying an ANN to the time series of errors between numerical model estimates and observations. In tidal and wind-dominated coastal areas, Ren et al. (2018) used historical high-frequency radar (HFR) observations together with modelled tide and wind results as feature variables to train ANNs, achieving high-precision predictions within a short-term window of an hour. Yan et al. (2021) took the interferometric phase image as the input image and the measured current velocity image as the output image, and creatively introduced the conditional generative adversarial network (CGAN) model. This approach leverages deep learning to address the challenge of accurately measuring current velocity, even when current velocity is directly measured. By learning the input-output relationship between the phase and velocity images, the CGAN model reduces error and improves efficiency, offering a significant advantage over traditional methods, which often struggle with noise and efficiency issues owing to the complex nature of ocean currents. From a spatiotemporal perspective, Chen and Chi (2021) used spatial blocks to obtain spatiotemporal features and combined GRU with an attention mechanism to capture nearest-neighbor temporal correlations in the so-called STAGRU model; Thongniran et al. (2019) conducted similar studies. Despite applying an attention mechanism, these studies have not fully identified the importance of certain key elements such as sea surface wind, which remains the major bottleneck. To address this bottleneck, Liu et al. (2022b) added a weight parameter adjustment to the proposed pure attention model (P-ATT) to enhance the importance of different elements, and significantly improved performance compared with other deep learning models and with schemes that combine attention mechanisms with deep learning models.
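The indirect strategy described at the start of this subsection, learning the error of a numerical model and subtracting the predicted error from the next model estimate, can be sketched as follows. To keep the example short and deterministic, a linear autoregressive model fitted by least squares stands in for the ANN, and the "truth", the model error and all names are synthetic assumptions rather than anything taken from the cited study.

```python
import numpy as np

def fit_ar(err, order):
    """Least-squares fit of err[t] ~ sum_k a_k * err[t-k] + intercept."""
    rows = [err[t - order:t][::-1] for t in range(order, len(err))]
    A = np.column_stack([np.array(rows), np.ones(len(rows))])
    coef, *_ = np.linalg.lstsq(A, err[order:], rcond=None)
    return coef   # [a_1, ..., a_order, intercept]

def predict_next(err_hist, coef):
    """Predict the next error from the most recent errors."""
    order = len(coef) - 1
    lags = err_hist[-order:][::-1]
    return float(lags @ coef[:-1] + coef[-1])

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic, noise-free example: observed current velocity (m/s) and a
# numerical model whose error has a constant bias plus a slow oscillation.
t = np.arange(120)
obs = 0.4 * np.sin(2.0 * np.pi * t / 24.0)
model = obs + 0.05 + 0.03 * np.sin(0.3 * t)

n_train = 90
err = model - obs                        # error series of the numerical model
coef = fit_ar(err[:n_train], order=2)    # "train" the error predictor

# One-step-ahead correction on the held-out period: predict the next model
# error from errors observed so far, then subtract it from the model estimate.
corrected = model.copy()
for k in range(n_train, len(t)):
    corrected[k] = model[k] - predict_next(err[:k], coef)

rmse_raw = rmse(model[n_train:], obs[n_train:])
rmse_corrected = rmse(corrected[n_train:], obs[n_train:])
```

With real, noisy data the gain would be smaller, and a nonlinear learner such as an ANN would replace the linear fit.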

Currently, these applications are mostly cases of regional circulation. The larger spatial scale ocean currents or subsurface currents are subject to the joint action of different regions and different dimensions, with a complex influence mechanism, which not only poses a greater challenge for the application of deep learning but also is a major direction that needs to be explored in the future.

3.1.4 Sea surface height

Sea surface height (SSH) is influenced by various dynamic processes in the ocean, including mesoscale eddies (MEs), waves, currents and tides. The interactions between these processes can be highly complex owing to their nonlinear nature and varying spatial and temporal scales. The uncertainties in measuring and predicting these processes, such as those introduced by tidal forces, create further challenges that can be addressed effectively using deep learning techniques. As with the other surface elements, satellite remote sensing, the main data source, has obvious defects such as degraded data quality or even missing data, while in situ observations are sparse. Deep learning has therefore been applied to tasks such as predicting long-term changes in SSH, improving the quality of remotely sensed data and reconstructing in situ observations.

Zhang et al. (2020) adopted GANs to achieve a good SSH reconstruction of an entire basin from observations at 19 coastal sites. Rong and San Liang (2022) coupled a neural network model with a causal inference technique based on IF analysis and successfully reconstructed MEs in an area of the South China Sea. Barth et al. (2022) implemented a novel skip connection based on DINCAE (Data INterpolating Convolutional Auto-Encoder) to reconstruct multivariate data, including SSH, with excellent performance.

Considering the spatiotemporal dependence in SSH prediction, Liu et al. (2022a) integrated attention mechanisms in the temporal and spatial dimensions into the LSTM by assigning reasonable weights to the data at each time step and grouping nearby points at the same latitude, achieving superior stability and accuracy in large-scale, long-term prediction. Similarly, Song et al. proposed a merged LSTM model (Song et al., 2020) and a convolutional LSTM (ConvLSTM) P3 model (Song et al., 2021) combining LSTM and residual strategies, with the latter achieving an average accuracy of 93.4% over a 15-d prediction period for SSH. Based on the correlation between variables such as SST, SSH, SSS and sea surface velocity, Shao et al. (2021) proposed a hybrid empirical orthogonal function (EOF)-complete ensemble empirical mode decomposition (CEEMD)-ANN model and a multivariate empirical orthogonal function (MEOF)-1-D convolutional neural network (Conv1D block)-LSTM model to consider the linear and nonlinear characteristics of sea level change, respectively.
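The EOF step in such hybrid models separates a space-time field into fixed spatial patterns and their time-varying amplitudes (principal components). A minimal sketch using a singular value decomposition is shown below; the synthetic two-pattern field and the function name are assumptions for illustration, and the later stages of the cited hybrids (CEEMD, ANN, Conv1D, LSTM) are not reproduced.

```python
import numpy as np

def eof_decompose(anom, n_modes):
    """EOF analysis of an anomaly matrix with shape (time, space) via SVD.

    Returns the leading spatial patterns (EOFs), their time series
    (principal components) and the fraction of variance each explains.
    """
    U, s, Vt = np.linalg.svd(anom, full_matrices=False)
    eofs = Vt[:n_modes]                       # (n_modes, space), orthonormal rows
    pcs = U[:, :n_modes] * s[:n_modes]        # (time, n_modes)
    var_frac = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return eofs, pcs, var_frac

# Synthetic "SSH" field built from two spatial patterns with different time behaviour.
t = np.linspace(0.0, 4.0 * np.pi, 60)
x = np.linspace(0.0, 1.0, 25)
pattern1, pattern2 = np.sin(np.pi * x), np.sin(2.0 * np.pi * x)
ssh = np.outer(np.sin(t), pattern1) + 0.3 * np.outer(np.cos(2.0 * t), pattern2)

anom = ssh - ssh.mean(axis=0)                 # remove the time mean at each point
eofs, pcs, var_frac = eof_decompose(anom, n_modes=2)
recon = pcs @ eofs                            # field rebuilt from the two leading modes
```

In a hybrid setup, a forecasting model would typically be trained on the low-dimensional `pcs` time series, and the predicted field recovered by multiplying the forecasts with `eofs`.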

3.1.5 Significant wave height

The significant wave height (SWH) is the most widely used wave parameter in climate assessment and various marine industries. Altimeters and radiometers onboard satellites provide large-range and high-resolution observations to support SWH studies. At the same time, as a prerequisite for accurate wave forecasting, performing validation and calibration for the observed data and improving the quantity and quality have also become important research topics.

For the Chinese HY-2 ocean remote sensing satellite series, (Wang et al., 2020) applied deep learning techniques to combine multiple parameters of the HY2B altimeter, including SWH, sigma0 and the sigma0 standard deviation (STD). Compared with previous methods on the SWH calibration task, this reduced the calibration bias by 80%, the RMSE by 24% and the scatter index (SI) by 10%, demonstrating good capability and robustness for HY2B calibration. A GRU model was also trained using the minimum wave height and wind field data obtained from the altimeter and scatterometer (SCAT) on HY-2C, which operates on an inclined orbit, and achieved good results in large-scale SWH data generation (Wang et al., 2021b). To solve the problem of data loss due to observational platform and sensor failure, (Bethel et al., 2021) used LSTM along with the Simulating WAves Nearshore (SWAN) model for bidirectional modelling of surface wind speed (WSP) and SWH, exploiting the relationship between the two to improve data reliability.
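The three calibration statistics quoted above can be illustrated with a minimal Python sketch. It is not the procedure of (Wang et al., 2020): the function name, the sample values and the convention that SI is the RMSE normalized by the mean reference value are assumptions made here for illustration (other SI definitions remove the bias first).

```python
import numpy as np

def calibration_metrics(estimate, reference):
    """Return (bias, rmse, scatter_index) of an estimate against a reference."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    error = estimate - reference
    bias = error.mean()
    rmse = np.sqrt(np.mean(error ** 2))
    # Assumed convention: SI is the RMSE normalised by the mean reference value.
    scatter_index = rmse / reference.mean()
    return bias, rmse, scatter_index

# Hypothetical collocated SWH values in metres (illustrative numbers only).
buoy = np.array([1.0, 2.0, 3.0, 4.0])
altimeter = np.array([1.2, 2.2, 3.2, 4.2])
bias, rmse, si = calibration_metrics(altimeter, buoy)
```

A calibration model would be judged by how far it moves these three numbers towards zero when its corrected SWH replaces the raw altimeter SWH.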

The China-France Oceanography Satellite (CFOSAT) carries the Surface Waves Investigation and Monitoring (SWIM) instrument and a scatterometer (SCAT), which are designed to provide along-track wave parameters and wide-swath wind observations, respectively. (Wang et al., 2021) combined the wave and wind data from SWIM and SCAT to train a deep neural network with a variety of variables to estimate the wide-swath SWH, achieving an accuracy as good as that of the SWIM nadir and an improved spatial coverage (Figure 8). The variables included SWH and sigma0 (σ0) from the SWIM nadir observations, SWH and peak period from the wave spectrum in the SWIM off-nadir box, and wind speed from SCAT. Here σ0 is the most common representation of the surface backscatter coefficient, also known as the normalized radar cross section (NRCS), which takes into account the effect of the size and shape of the surface target on the reflection of the radar signal. Assimilating the wide-swath SWH has an impact as good as assimilating the SWIM nadir SWH, and it enhances the accuracy of the wave model when used together with the nadir SWH.

Figure 8 Selection of training factors for deep neural networks.

Based on a deep residual CNN, (Wang et al., 2022) proposed a quad-polarized synthetic aperture radar (SAR) SWH retrieval algorithm, GF3WVResNet, to improve the estimation of SAR SWH for China’s Gaofen-3 (GF-3), with an RMSE of 0.32 m and an SI of approximately 13%, outperforming other state-of-the-art wave height retrieval algorithms. For potentially catastrophic SWH changes caused by typhoons, (Meng et al., 2021) introduced a deep learning method for the long-term prediction of tropical cyclone (TC)-forced nearshore wave heights and identified a bidirectional gated recurrent unit network as an effective model for real-time and 24-h-ahead predictions. (Bethel et al., 2022) proposed an LSTM model for predicting hurricane-forced SWH in the Caribbean Sea, which can provide accurate predictions within 12 hours (R² ≥ 0.8) and maintain the error below 1 m within 6 hours of forecast lead time. The RNN-LSTM model (Pushpam et al., 2020), GRU algorithm (Wang et al., 2021a), bidirectional ConvLSTM model (Son et al., 2020), and nearshore simulated wave-LSTM model (Fan ST. et al., 2020) all achieved satisfactory performance for single-point short-term prediction in their respective study areas.

However, the spatial distribution of waves is two-dimensional (2D), and 2D spatial field prediction helps to understand the overall wave conditions in a region. (Zhou SY. et al., 2021) applied the ConvLSTM network to the South China Sea and East China Sea and demonstrated its feasibility for short-term SWH prediction under normal and extreme conditions. (Bai et al., 2022) used a stochastic search algorithm to optimize a CNN-based 2D wave field prediction model, which could accurately predict both the wave height variation over time and the spatial wave height distribution of the 2D wave field. In addition, (Ma et al., 2021) combined the numerical weather prediction model Weather Research and Forecasting (WRF) with a deep learning model in the SWH prediction algorithm WRF-CLSF, which can effectively suppress both the randomness and instability of waves and extract the continuity and interaction scales from the wind-wave history information. Combining numerical forecasting with data-driven algorithms is a unique and innovative perspective, and the effectiveness of the model for long-term prediction (24 h, 48 h, and 72 h) was also demonstrated. (Li et al., 2021) proposed a deep learning model, the convolutional long-term time series network (CLTS-Net), for multivariate time series SWH prediction, which integrates the advantages of CNN, LSTM, and autoregressive models. It captures short- and long-term dependencies in multivariate data and combines linear and nonlinear models for reliable prediction, and it has been shown experimentally to be a more accurate and general method for long-term prediction of SWH. Similarly, the CNN-BiLSTM-attention model offers the advantages mentioned above while proving its feasibility under extreme conditions (Wang LN. et al., 2022).

3.1.6 Sea ice

The identification and prediction of sea ice are crucial to maritime navigation safety, marine resource exploitation, and the monitoring of global climate change and sea surface height change. Currently, the main sources of sea ice data include Sentinel-1 SAR images, RADARSAT system missions, passive microwave data from AMSR2 and ship photographs. These data sources provide massive datasets, and therefore deep learning can be widely applied for the identification, segmentation, and prediction of sea ice. The following paragraphs present the latest research findings in these fields.

Global warming has intensified the melting trend of Arctic sea ice. Prediction of sea ice concentration (SIC) at different timescales is important to understanding global climate change. For medium- and long-term predictions on monthly timescales, (Wei et al., 2022) used an LSTM incorporating mean absolute error and attention modules, based on the persistence of SIC anomalies, to extract the relationship between sea ice in the target month and that in the preceding 12 months, which generally improved the accuracy of predictions. To achieve SIC prediction beyond 30 days, in contrast to traditional LSTM networks, (Zheng et al., 2022) used EOF analysis to extract the spatiotemporal characteristics of the Arctic SIC and then used LSTM for time series prediction, which showed some validity on a 100-day time scale. However, all of these studies predicted SIC only from a univariate perspective, ignoring the influence of some necessary external factors on SIC evolution. (Andersson et al., 2021) integrated 11 ocean and atmosphere variables and trained an ensemble of U-Net networks to predict monthly mean SIC maps at a 25 km resolution for the next six months; it performed well in the seasonal forecasting of summer sea ice and extreme sea ice events. (Chi et al., 2021) combined different modalities with a dual ConvLSTM and improved the loss function to address the discrepancy between statistical and visual errors. Although a six-month sea ice prediction was achieved, the study showed that atmospheric parameters did not contribute significantly, and the model still has room to improve. (Liu QH. et al., 2021) selected five factors (SST, mean sea level pressure (MSL), 2-m temperature (T2M), skin temperature (SKT), and SIC) to train an improved predictive RNN (PredRNN++) to achieve daily SIC prediction for up to 9 days; more recently, (Ren et al., 2022) incorporated a fully convolutional network with spatiotemporal attention.
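The EOF step that precedes the LSTM in such a pipeline can be sketched as follows. This is a generic EOF decomposition via singular value decomposition on a synthetic field, not the code of (Zheng et al., 2022); the array sizes, the synthetic patterns and the function name are assumptions for illustration.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """Decompose a (n_time, n_space) field into leading EOFs and PCs via SVD."""
    mean = field.mean(axis=0)
    anomalies = field - mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                      # spatial patterns, (n_modes, n_space)
    pcs = u[:, :n_modes] * s[:n_modes]       # time series, (n_time, n_modes)
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return mean, eofs, pcs, explained

# Synthetic stand-in for a SIC record: two spatial patterns with periodic amplitudes.
t = np.arange(120)
x = np.linspace(0.0, 1.0, 50)
field = (0.5
         + 0.3 * np.outer(np.sin(2 * np.pi * t / 12), np.sin(np.pi * x))
         + 0.1 * np.outer(np.cos(2 * np.pi * t / 12), np.cos(np.pi * x)))

mean, eofs, pcs, explained = eof_decompose(field, n_modes=2)
reconstruction = mean + pcs @ eofs
```

In an EOF-plus-recurrent-network setup, the low-dimensional principal component series in `pcs` would be the sequences passed to the time series model, and forecast values would be projected back onto `eofs` to recover a spatial map.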

Another major application scenario is the identification and classification of sea ice in remote sensing images, for which SAR images provide an important data source. (Song W. et al., 2022) provided a large labelled sea ice SAR dataset that includes seven different sea ice types. It was the first to provide both spatial and temporal information, making it a reliable dataset for applying deep learning to sea ice-related research. Based on SAR data in a dual-polarization mode of operation, (Zhang et al., 2022) combined the backbone network MobileNetV3 with a multiscale feature fusion approach to achieve more than 95% accuracy when classifying sea ice (new ice, thin first-year ice, thick ice, and old ice), approximately 10% higher than that obtained using single-polarized data. However, SAR images can suffer from unclear backscattering features and noise. Based on the U-Net architecture with an enlarged receptive field and noise correction, (Stokholm et al., 2022) achieved faster automatic sea ice concentration generation. The methods mentioned above mainly focus on shallow feature learning. To mine deeper features in images, (Han et al., 2021) performed multi-level feature fusion based on residual networks and, compared with algorithms using shallower networks, further improved the classification accuracy by mining and fusing features within and across layers through feature pyramid network (FPN), path aggregation network (PAN), and spatial pyramid pooling (SPP) modules. In addition, based on the idea of residuals, (Goncalves and Lynch, 2021) used a U-Net variant fused with ResNet encoders to remedy defects such as the weak texture contrast at ice-water boundaries in ice floe extraction. The framework proposed by (Jiang et al., 2022) uses a regional pooling layer to integrate spatial features learned by ResNet with unsupervised iterative region growing with semantics (IRGS). Using the contextual information extracted by the segmentation algorithm, it achieves an overall accuracy of 99.67% for ice-water classification and can generate sea ice maps with pixel-level labels and more natural ice-water boundaries.

The identification of sea ice is not limited to these studies. Identifying the ice surface and bottom boundaries in radar images to calculate ice cover thickness is also an important topic in sea ice-related research. Recently, (Cai et al., 2022) developed a multiscale feature fusion network to address the sample imbalance problem of boundary detection, using a multiscale convolution module to learn multiscale feature representations of different layers and a loss function combining cross-entropy (CE) and focal loss. It accounted for multiscale features more comprehensively and further improved the accuracy of boundary detection. Moreover, some studies focus on identifying glacier cracks to understand glacier state and stability. (Zhao et al., 2022) used an improved U-Net network to map the spatial distribution of Antarctic ice shelf cracks in 2020 at a spatial resolution of ~40 m, reaching an extraction accuracy of 84.2% with good visual consistency.
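How a focal term counters sample imbalance can be seen in a small NumPy sketch of a binary CE-plus-focal loss. The equal weighting of the two terms, the value gamma = 2 and the sample probabilities are assumptions for illustration and are not taken from (Cai et al., 2022).

```python
import numpy as np

def cross_entropy(prob, target, eps=1e-7):
    """Per-pixel binary cross-entropy; prob is the predicted P(class = 1)."""
    prob = np.clip(prob, eps, 1 - eps)
    p_t = np.where(target == 1, prob, 1 - prob)   # probability of the true class
    return -np.log(p_t)

def focal_loss(prob, target, gamma=2.0, eps=1e-7):
    """Per-pixel focal loss: CE down-weighted by (1 - p_t) ** gamma."""
    prob = np.clip(prob, eps, 1 - eps)
    p_t = np.where(target == 1, prob, 1 - prob)
    return -((1 - p_t) ** gamma) * np.log(p_t)

def combined_loss(prob, target, weight=1.0, gamma=2.0):
    """Mean of CE plus a weighted focal term (weighting is an assumption)."""
    return np.mean(cross_entropy(prob, target) + weight * focal_loss(prob, target, gamma))

# Two easy pixels (p_t = 0.9) and two harder ones (p_t = 0.6 and 0.3).
prob = np.array([0.9, 0.1, 0.6, 0.3])
target = np.array([1, 0, 1, 1])
ce = cross_entropy(prob, target)
fl = focal_loss(prob, target)
total = combined_loss(prob, target)
```

The easy pixels keep only 1% of their cross-entropy in the focal term, so the abundant, well-classified background contributes little and the rare boundary pixels dominate the gradient.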

Compared with SAR, passive microwave has stronger surface penetration, wider coverage and better all-weather operation, but coarser spatial resolution. (Liu XM. et al., 2022) proposed a super-resolution algorithm, called a progressive multiscale deformable residual network, to improve the spatial resolution of passive microwave images. It employed a novel alignment module including a progressive alignment strategy and a multiscale deformable convolutional alignment unit, and further incorporated a temporal attention mechanism. In addition, shipborne photographs are susceptible to image quality degradation due to rain and other factors. (Alsharay et al., 2022) developed a raindrop removal framework and classified the scenes of sea ice images into ice, water, ship and sky with three deep learning networks, which improved the classification performance to a certain extent.

3.2 Subsurface temperature and salinity

The ocean is the major global climate regulator and balancer of Earth’s thermal energy (Deng, 2024). Monitoring and predicting ocean parameters are of great significance for understanding the state of the oceans and predicting global climate change. The huge amounts of satellite remote sensing data now available have greatly promoted research on ocean surface phenomena and variability. However, owing to the ocean’s complex environment and physical characteristics, satellite remote sensing cannot directly observe subsurface information. Modelling the relationship between ocean surface and subsurface parameters, and accurately retrieving or predicting the environmental parameters inside the ocean through deep learning-related methods, has become a hot and promising topic.

Regarding subsurface temperature and salinity, most studies have selected ocean surface parameters as predictors, such as SSH, SST, SSS, sea surface wind and location information (latitude and longitude). Su et al. carried out extensive work on reconstructing subsurface temperature and salinity, exploring the effects of different methods at different resolutions. The ability of CNN, light gradient boosting machine (LightGBM) (Su et al., 2021b), and LSTM (Su et al., 2021a) models to produce high-resolution and long time series reconstructions of subsurface temperature was compared and validated. The contribution of latitude and longitude to subsurface temperature and salinity anomalies was quantified using a bidirectional LSTM (Bi-LSTM) neural network (Su et al., 2021c). A more recent attempt with the convolutional LSTM neural network (ConvLSTM) (Su et al., 2022), which better accounts for time series dependence and spatial features, revealed significant spatial heterogeneity among different ocean basins. In addition to these predictors, (Cheng et al., 2021) introduced sea surface velocity as an input of a BP neural network and verified its positive effect on accuracy.

Similarly, CNN- or LSTM-based approaches were also developed. Again using the surface variables mentioned above (SSH, SST, SSS), (Nardelli, 2020) combined stacked LSTM neural networks with Monte Carlo dropout to reduce the RMSD of the reconstructed hydrographic profiles to 50% of that of the reference reconstructions (climatology and mEOF-r profiles). Considering spatiotemporal features for 3D ocean temperature prediction from three-dimensional temperature data, (Zhang K. et al., 2020) proposed a multilayer convolutional LSTM (M-convLSTM), while (Zuo et al., 2022) proposed a stereotactic spatiotemporal convolutional model, SST-4D-CNN. The latter added residual and recalibration modules to the convolutional module to improve the horizontal and profile prediction accuracy to more than 98.02%, mostly maintaining it above 99%, outperforming SVR, FC-LSTM, Conv-LSTM and 3D convolutional models. (Sammartino et al., 2020) constructed an ANN using the absolute dynamic topography, the altimeter-derived geostrophic velocity components, and ocean surface chlorophyll-a and temperature as input parameters to accurately infer the vertical shape of the Mediterranean chlorophyll-a profile, while also providing a way to acquire profile information for other similar quantities (e.g., particulate organic carbon, salinity).
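The idea behind Monte Carlo dropout, keeping dropout active at prediction time and reading the spread of repeated forward passes as an uncertainty estimate, can be sketched in NumPy. For brevity a small random-weight feed-forward network stands in for the stacked LSTM of (Nardelli, 2020); the layer sizes, dropout rate and inputs are all hypothetical.

```python
import numpy as np

def mc_dropout_predict(x, w1, b1, w2, b2, drop_rate=0.2, n_samples=200, seed=0):
    """Mean and spread of repeated forward passes with dropout left on."""
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_rate
    hidden = np.maximum(x @ w1 + b1, 0.0)           # ReLU hidden layer
    outputs = []
    for _ in range(n_samples):
        mask = rng.random(hidden.shape) < keep       # fresh random mask per pass
        outputs.append((hidden * mask / keep) @ w2 + b2)
    outputs = np.stack(outputs)                      # (n_samples, n_points, n_out)
    return outputs.mean(axis=0), outputs.std(axis=0)

# Hypothetical sizes: 3 surface predictors (e.g. SSH, SST, SSS) -> 5 depth levels.
rng = np.random.default_rng(1)
w1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 5)), np.zeros(5)
surface = rng.normal(size=(4, 3))

profile_mean, profile_std = mc_dropout_predict(surface, w1, b1, w2, b2)
```

Here `profile_mean` plays the role of the reconstructed profile and `profile_std` gives a per-depth uncertainty; setting `drop_rate=0.0` recovers the ordinary deterministic prediction.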

3.3 Typical ocean phenomena

We systematically summarize each phenomenon in Table 3 and present the details in the following subsections.

Table 3 A summary of the main application tasks, DL models used and current challenges of the typical ocean phenomena.

3.3.1 Mesoscale eddies

MEs are ubiquitous in the global ocean and extremely important for ocean energy and material exchange. Effective eddy detection and tracking are therefore essential for advancing the understanding of ocean dynamics. Advances in remote sensing technology have greatly facilitated the integration of eddy detection and tracking research with deep learning techniques.

MEs produce an irregular pulsation in the background field, leading to a strong correlation with the surrounding SSH anomalies and variables such as SST. (Moschos et al., 2020) and (Sun et al., 2021) each proposed a CNN-based deep learning framework to accurately identify and extract eddy features, from SST satellite images and satellite remote sensing SSH images, respectively. Within its framework, the latter proposed a tracking algorithm, MCML (Median blur, Contours finding, Moments computation, Location), to track multiple eddies without relying on adjusting physical parameters. Recently, for ocean satellite SSH images, (Dong Z. et al., 2022) proposed an improved U-Net network integrating a convolutional attention module and a residual learning module. Its accuracy reached 93.28% on an automatic ME detection task, significantly better than that of previous AI detection models. To improve the spatiotemporal predictability of MEs, (Nian et al., 2021) fused GRU and spatial attention into the original MIM (Memory in Memory) architecture to provide a smoother and more accurate sea level anomaly (SLA) time series, which provided a good database for ME prediction. This approach captures both temporal and spatial variations in SLAs, improving the accuracy of ME predictions and potentially extending the prediction horizon up to several weeks. Since the temperature structure of MEs leads to changes in the ocean pressure layer, which can be reflected in the SLA, (Yu et al., 2021) proposed a deep learning algorithm, the eddy CNN, which uses satellite SLA data to invert the temperature structure within eddies, providing the 3D temperature structure of MEs at daily temporal and 0.25° horizontal resolution. To address the low efficiency of oceanic eddy recognition caused by the insufficient spatial resolution of altimeters, (Chen et al., 2021) proposed vertical structure-based eddy recognition algorithms, namely a 3D CNN based on a residual network and a hybrid CNN-XGBoost model, built on Argo profile data and altimetry data; the latter can reach 98% classification accuracy and recapture approximately 36% of the eddies.
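The "Moments computation" and "Location" steps named in the MCML tracking algorithm above amount to locating each detected eddy by the centroid of its mask. The NumPy sketch below shows only that part, on a synthetic SLA-like bump; the median blur and contour-finding steps are omitted, and the threshold, grid size and function name are assumptions rather than details of (Sun et al., 2021).

```python
import numpy as np

def mask_centroid(mask):
    """Centroid (row, col) of a binary mask from its raw image moments."""
    rows, cols = np.indices(mask.shape)
    m00 = mask.sum()                       # zeroth moment: blob area in pixels
    if m00 == 0:
        return None
    return (rows * mask).sum() / m00, (cols * mask).sum() / m00

# Synthetic SLA-like bump centred at row 12, column 20 (illustrative only).
rows, cols = np.indices((32, 40))
sla = np.exp(-((rows - 12) ** 2 + (cols - 20) ** 2) / (2 * 3.0 ** 2))
eddy_mask = sla > 0.5 * sla.max()
centre = mask_centroid(eddy_mask)
```

Repeating this for each detected eddy in successive frames yields the centre positions that a tracker links into trajectories.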

Most of the above methods use single-mode data (i.e., a single feature variable) to identify MEs and ignore data from other modes closely related to ME detection. The EDNet identification network proposed by (Fan ZL. et al., 2020) considers three modes, SSH, SST, and current velocity, while a deep learning abnormal eddy (warm cyclonic eddies and cold anticyclonic eddies) detection (DL-AED) model proposed by (Liu YJ. et al., 2021) fuses global SSH and SST data to detect eddy features and distinguish anomalous from normal eddies (cold cyclonic eddies and warm anticyclonic eddies). Anomalous eddies were found to account for a surprising one-third of the total eddies in the global ocean.

Unlike previous studies based on SSH data and object detection backbones, (Liu et al., 2021b) proposed a deep eddy detection neural network (DEDNet) with a pixel segmentation backbone for high-frequency radar (HFR) data, which can achieve global optimality in space and powerful detail discrimination for the automatic detection and localization of offshore eddies. However, two challenges remain in using HFR data for eddy detection: insufficient labelled data and the difficulty of inheriting experience from previous detections. (Liu et al., 2021a) proposed a cross-domain eddy detection neural network, which used an instance-based domain adaptation approach to expand the training dataset and produce sufficient labelled data, and parameter-based transfer learning for multiscene eddy detection to inherit previous detection experience. In addition, (Xu et al., 2021) demonstrated the potential of the bilateral segmentation network (BiSeNet) algorithm to preserve edge information and identify large-scale eddies, which also provides a reference for subsequent applications.

Accurate prediction of eddies is of great scientific and practical importance for understanding the characteristics of eddy propagation and evolution and improving the simulation and prediction of regional weather and climate change. Recently, (Wang X. et al., 2020) developed a model for predicting eddy properties and propagation trajectories using eddy trajectory data as model inputs, which learned spatiotemporal variation features using LSTM networks and applied the ET algorithm to obtain relevant one-dimensional features from the changes in propagation positions. (Wang XN. et al., 2022) proposed the MesoGRU framework to better extract feature correlations by integrating SLA and Archiving, Validation, and Interpretation of Satellite Oceanographic (AVISO) data (Figure 9), and then used two GRU layers to learn different patterns of ME trajectories. It obtains an average daily center error of about 8 km and keeps the center error of 7-day predictions low (8 to 18 km), a considerable improvement over other methods, such as LSTM/GRU or methods using a single dataset.

Figure 9 MesoGRU Framework for ME Trajectory Prediction. (A) Data downloading part obtains TCME and SLA data from AVISO and CMEMS and extracts the SCS information. (B) Data preprocessing part deals with SCS data by MAF, PCA, normalization, and feature combination and establishes a combined dataset of ME (CDME) after data compression. (C) Network prediction part iteratively trains prediction model with our WMSE loss function and renormalization (Wang XN. et al., 2022).
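A center error in kilometres, as reported for trajectory prediction, can be computed as the great-circle distance between predicted and observed eddy centres. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; whether (Wang XN. et al., 2022) used exactly this distance is an assumption, and the coordinates are made up.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def center_error_km(lat_pred, lon_pred, lat_true, lon_true):
    """Great-circle (haversine) distance in km between two sets of positions in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat_pred, lon_pred, lat_true, lon_true))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Hypothetical predicted and observed eddy centres (degrees north and east).
errors = center_error_km(np.array([18.0, 19.0]), np.array([115.0, 115.0]),
                         np.array([18.0, 18.0]), np.array([115.0, 115.0]))
mean_error = errors.mean()
```

Averaging such errors per forecast day gives the kind of lead-time curve that the 8 to 18 km figures summarize.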

3.3.2 Fronts

As a representative mesoscale ocean phenomenon, ocean fronts occur in narrow transition zones between water bodies of different physical properties and exhibit distinct spatial and temporal characteristics. The mixing, upwelling and convergence that accompany ocean fronts influence upper-ocean dynamic processes. At present, ocean front detection studies face two challenges: the scarcity of labelled data and the limitations of ocean front detection algorithms.

(Li QY. et al., 2022) constructed a labelled ocean front dataset that contributes positively to subsequent studies. With the dataset, ocean front detection was treated as a weak edge identification problem, and edge locations were predicted by a network consisting of four convolutional blocks. (Cao et al., 2020) set up a pixel-level ocean front recognition model from the gradient threshold perspective, which trained Mask R-CNN with labelled SST gradient maps and adaptively adjusted the recognition results with the benchmark gradient threshold for each class of fronts. However, the accuracy and applicability of these methods are not satisfactory owing to the dynamic properties of ocean fronts and their variability across regions. (Xie et al., 2022) combined a channel supervision unit structure with a location attention mechanism to improve the prediction of multiclass fronts in different regions at different temporal and spatial scales, which improved the pixel-level multiclass ocean front detection accuracy. Here, the channel supervision unit integrates seasonal characteristics and contextual information of a sea area, while the location attention mechanism adaptively assigns attention weights according to the sea areas where fronts occur frequently.
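The gradient threshold view of front detection, on which such labelled SST gradient maps are built, can be illustrated with a simple non-learned baseline: compute the SST gradient magnitude and flag pixels above a threshold. The 1 km grid, the synthetic temperature step and the 0.1 °C/km threshold are assumptions for illustration; the deep learning models discussed above learn class-specific decisions rather than applying one fixed threshold.

```python
import numpy as np

def front_mask(sst, dy_km, dx_km, threshold):
    """Flag pixels whose SST gradient magnitude (deg C per km) reaches the threshold."""
    d_dy, d_dx = np.gradient(sst, dy_km, dx_km)   # derivatives along rows, columns
    magnitude = np.hypot(d_dx, d_dy)
    return magnitude >= threshold, magnitude

# Synthetic SST field with a sharp east-west transition near x = 50 km.
x = np.arange(0.0, 101.0)                          # km, 1 km spacing
sst = np.tile(20.0 + 2.0 * np.tanh((x - 50.0) / 5.0), (20, 1))

mask, magnitude = front_mask(sst, dy_km=1.0, dx_km=1.0, threshold=0.1)
```

The resulting mask marks a narrow band around the transition, which is the kind of pixel-level label that a segmentation network is then trained to reproduce and refine.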

3.3.3 Internal waves

Internal waves (IWs), a type of sub-mesoscale wave motion occurring within the stratified ocean, are an integral part of ocean dynamical processes, especially for ocean mixing and energy cascade studies. Among the different types of internal waves, internal solitary waves, whose waveforms remain approximately constant during propagation, are a research hotspot. The strong impulsive forces they suddenly generate may pose a catastrophic threat to subsurface navigation and marine engineering facilities. Therefore, the real-time monitoring and forecasting of IWs are crucial to operational applications.

IW identification is a necessary foundation for ocean hazard research and prevention. Remote sensing observations, such as SAR images, with high timeliness, large coverage and long time series provide a rich database for dynamic monitoring of internal waves. IWs appear as irregular alternating light and dark stripes in SAR images, and this feature makes it possible to identify IWs from SAR images. (Vasavi et al., 2021) provided a complete method for IW detection using the U-Net and KdV (Korteweg-de Vries) methods. Improving on the U-Net method, (Zheng et al., 2021) proposed an IW stripe segmentation algorithm based on SegNet for SAR images. However, these methods only target a small part of the SAR image and do not give segmentation results for the whole image. (Ma et al., 2023) classified image blocks containing ocean IWs based on a multi-decision fusion IW image classification strategy and subsequently applied PAU-Net, an IW stripe segmentation method incorporating a pixel attention mechanism, to effectively suppress the complex ocean background information. In this way, feature extraction over the whole image is realized. In contrast to the U-Net-based approach, (Zheng et al., 2022) proposed an IW stripe segmentation algorithm based on Mask R-CNN. The proposed method could identify the presence of oceanic internal waves and obtain the positions of bright and dark stripes in the image, as well as the width and directional angle of each stripe. To handle subsurface observations and numerical modelling results and obtain more detailed information, researchers have tried to develop techniques for full-depth 3D ocean internal wave structure extraction and recognition. (Zhang X. et al., 2022) used transfer learning to fuse laboratory, buoy and remote sensing data to construct an internal solitary wave amplitude inversion model, which made good progress in reconstructing the 3D structure of internal solitary waves.

The vertical structure of internal solitary waves (ISWs) is the key factor in determining the degree of threat to subsurface vehicles and offshore engineering construction. As subsurface vehicles, offshore wind power systems, offshore drilling platforms and other offshore engineering projects continue to advance into the deep sea, ensuring the safe navigation of subsurface vehicles and the normal operation of offshore engineering matters to national economic development. Accurate forecasting is a crucial way to provide protection against ISWs. Owing to the complexity of their generation mechanism and the lack of in situ observations, the prediction of ISW propagation is a difficult problem in oceanography. Based on the high-spatiotemporal-resolution results of an MITgcm model, (Lu et al., 2021) applied an LSTM model to predict the propagation and evolution of ISWs, including amplitude, position and arrival time, achieving promising results. (Zhang M. et al., 2022) added a convolutional block attention module to a deep convolutional neural network and applied it, for the first time, to detect ISWs in combination with hydrodynamic signal observations, achieving an accuracy of >95% in predicting the position of ISWs. Besides, IWs can generate shear stress on the seafloor and seriously affect the development of deepwater ocean engineering. (Tian et al., 2023) trained various models on external environmental factors, such as vertical velocity (w), zonal and meridional velocities (u and v) and amplitude (A), to predict the bottom shear stress, and found that the CNN-LSTM is significantly better than the other models.

3.4 Typical weather and climate phenomena

Covering 71% of the Earth’s surface, the ocean exerts an influence on the global climate that is an important part of oceanography and cannot be ignored. Any complex dynamic interaction between the ocean and atmosphere has the potential to result in dramatic climate phenomena. Currently, the oceanic climate phenomena receiving the most attention are tropical cyclones (TCs), the El Niño-Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD). We systematically summarize each phenomenon in Table 4 and present the details in the following subsections.

Table 4 A summary of the main application tasks, DL models used and current challenges of the typical weather and climate phenomena.

3.4.1 Tropical cyclones

A TC is a rapidly rotating storm system characterized by a low-pressure center, a closed low-level atmospheric circulation, strong winds and a spiral arrangement of thunderstorms that produce heavy rain and squalls. TCs generated in the western and northwestern Pacific and its adjacent waters are called ‘typhoons’. TCs are strong disturbances to the ocean and atmosphere and result in extreme destruction. Accurate analysis and prediction of typhoon intensity, path and the related wind-wave changes are of great importance for East Asian countries to prevent strong wind waves near typhoon centers and other secondary hazards caused by heavy rainfall near dense cloud structures. (Zheng Z. et al., 2020) trained three source models, VGG16, InceptionV3 and ResNet50, on the large-scale ImageNet source dataset and constructed transferred forecasting models (T-typCNNs) for typhoons with small dataset samples using parameter transfer and typhoon cloud images. The training accuracy on the self-built typhoon dataset was 95.081%, which was 18.571% higher than that obtained using shallow CNNs and at most 9.819% better than the results obtained directly with the three source models trained on the large-scale ImageNet source dataset without parameter transfer. (Qian et al., 2021) used a ResNet deep learning model, in which transfer learning followed pretraining, to achieve a more accurate objective intensity estimation of typhoons of different intensities and development stages.

The resolution and accuracy of satellite cloud images are crucial for discriminating typhoon intensities. (Zhang et al., 2021) proposed a multipath network model called SRCloudNet to achieve more accurate image reconstruction and improved resolution by integrating features extracted from back-projection units (which explore the dependencies between low-resolution (LR) and high-resolution (HR) satellite cloud images) and residual dense blocks. This model significantly enhances the quality of satellite cloud image reconstruction, providing a solid foundation for future research and applications in this field. Previous deep learning recognition methods based on satellite cloud images commonly used CNNs to extract features [e.g. (Wang et al., 2020) and (Giffard-Roisin et al., 2020)]. Given that the irregular structure of satellite cloud images can affect the feature extraction capability of CNNs, a new framework, the graph convolution (GC)-LSTM network, was proposed (Zhou et al., 2020). In this work, the GC network effectively mitigates the influence of irregular structure, with better accuracy and stability in identifying typhoon eyes and spiral cloud bands, while the LSTM network learns the features of satellite cloud images over time to predict typhoon formation; the prediction accuracy can reach 95.12% for typhoons and super typhoons. It also provided new ideas for processing irregular satellite images in other topics. Rather than simply extracting features from satellite cloud images, (Higa et al., 2021) incorporated related meteorological knowledge, such as the characteristics of the typhoon eye, to estimate the typhoon intensity class with higher accuracy from individual satellite images using a VGG-16 model with fisheye distortion pre-processing, focusing on enhanced typhoon eyes, eye walls, and cloud distributions. (Lee et al., 2020) fused numerical forecasts and satellite cloud images for real-time TC intensity estimation and prediction, fully considering the effects of environmental variables and the feature identification of cloud maps based on multitask learning (MTL). It reduced the computational cost of the MTL model by approximately 300% compared to the single-task model. The performance improvement in TC intensity prediction for 6 and 12 h was 13% and 16%, respectively.

Regarding typhoon path prediction, (Lian et al., 2020) proposed a multidimensional feature selection layer from the perspective of correlation analysis (CA) to select the most relevant meteorological variables and temporal ranges of tropical cyclone trajectories, learned their implicit features using a CNN layer and then fed them to a GRU layer to mine their deep temporal features and predict the central location of the target tropical cyclone at the next timestamp. In addition, based on the GRU model, (Song T. et al., 2022) treated the trajectory task as a time series prediction problem and built a deep learning framework with an attention mechanism for trajectory prediction through a bidirectional GRU network (BiGRU), which showed significant advantages in medium- and long-term track forecasting, especially in the next 72 h. It significantly improved the training efficiency and accuracy of the network. The framework was further implemented in a distributed system, showing that parallelizing the program across nodes is one way to improve the training speed of the network. To solve the problem of limited access to observational data, (Ruttgers et al., 2022) replaced the satellite images with reanalysis data of total cloudiness and vorticity fields, which had a more positive impact on real-time forecasting and provided a new way of thinking: fitting first-level sample data (i.e., raw observational data) with multiple reanalysis data.

Typhoons are not only a compound of multi-feature factors but also have multidimensional features. Thus, how to extract 3D features and fuse them with the 2D features in an appropriate way remains a key challenge. (Xu et al., 2022) combined the traditional generalized linear model (GLM) with the proposed AM-ConvGRU model based on the attentional Multi-ConvGRU. The GLM was applied to extract 2D typhoon features, while the proposed AM-ConvGRU used the residual channel attention block to select high-response isobaric planes automatically when considering the entire 3D structure of the typhoon, and extracted large-scale nonlinear spatial features of the typhoon with the Multi-ConvGRU. The broad logic can be seen in Figure 10. The method performs well in predicting the central location of typhoons in the next 24 hours and is the latest method in the field for integrating 2D and 3D elements, providing a methodological guide for subsequent studies to consider multidimensional features.

Figure 10 Overview of the typhoon prediction model using AM-ConvGRU. The architecture of the model is based on the Wide & Deep framework. The model input consists of two folds: 2D typhoon and 3D typhoon, and the Max-Min normalization method is applied to both inputs. The feature extractor consists of two components; namely, the wide component and the deep component. In the wide component, 2D typhoon features with shape (53) were constructed by the CLIPER method and converted into hidden layers by GLM. In the deep component, 3D time-series typhoon features with shapes (4, 4, 31, 31, 31) (whose dimensions denote time step, channel, width, and height, respectively) were transformed by the three-layer AM-ConvGRU for feature map downsampling. For the element integrator, the wide and deep elements were combined in a dense layer. Finally, the model outputs were the latitude and longitude of the typhoon 24 h in advance (Xu et al., 2022).
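The Max-Min normalization mentioned in the caption above can be illustrated with a minimal sketch. The function name `min_max_normalize` and the sample pressure values are hypothetical and chosen only for illustration; the source does not state how Xu et al. (2022) implemented this step.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input has no contrast; return zeros to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical central pressures (hPa) along a typhoon track, for illustration only.
pressures = [1002.0, 985.0, 960.0, 935.0, 950.0]
scaled = min_max_normalize(pressures)
```

Scaling each input to a common range keeps variables with large physical magnitudes (such as pressure) from dominating variables with small ones when both are fed to the same network.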

Typhoons bring heavy rain, strong winds, and wind waves, which threaten navigation safety, docking operations in ports and the safety of coastal residents’ lives and properties. Accurate wind and wave prediction during a typhoon is therefore also crucial. (Meng et al., 2021) adopted six different parameters, namely wave height, air pressure and wind speed obtained by buoys, the lowest pressure in the center of typhoons and real-time wind speed, and the calculated distance between typhoon center and buoy, to predict wave heights at buoy sites at different lead times with a BiGRU network, and demonstrated that the method was stable and effective. (Wei, 2021) coupled a numerical forecast model with an AI model, which included a VGGNet and a high-resolution network (HRNet) consisting of a hybrid model integrated with a circulation-based GRU. The simulation results of the Weather Research and Forecasting (WRF) model (wind) were used as part of the data samples of the AI model to train the wave field, and the experiments in the coastal waters of northeastern Taiwan demonstrated feasibility in handling spatial data. Based on numerical model results, a more comprehensive multidata fusion experiment was conducted that further considered ground station data collected by the Weather Bureau (WB) of Taiwan, buoy data, and hourly radar reflectivity images. Considering multi-source data did help the AI model to predict the typhoon-induced wind and waves with higher accuracy.
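One of the six inputs described above is the calculated distance between the typhoon center and the buoy. The source does not say how this distance was computed; the sketch below assumes the standard haversine great-circle formula on a spherical Earth, and the function name and radius constant are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the valid asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical typhoon-center and buoy positions, for illustration only.
distance = great_circle_km(22.0, 125.0, 24.5, 122.0)
```

A feature like this lets the network relate buoy observations to how close the storm is, rather than inferring proximity from the other variables alone.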

3.4.2 ENSO

ENSO is a prominent signal of inter-annual and interdecadal climate change on a global scale, occurring as winds and SST oscillation in the eastern equatorial Pacific. It fluctuates periodically among three phases: neutral, La Niña and El Niño and affects the climate of much of the tropics and subtropics. Deep learning techniques drive accurate advance prediction of ENSO occurrence, category and intensity.

Owing to ENSO’s influence on the global climate, several climate factors can be employed as predictors, such as the IOD, Atlantic Niño, North Tropical Atlantic SST and Western Hemisphere warm pool. (Li C. et al., 2022), in predicting SST and determining whether an IOD event occurs by calculating the IOD index, selected 81 features from three dimensions (atmospheric, surface, and subsurface) to train a deep learning model based on convLSTM and combined the data partially constrained by realistic physical information. The positive effect of wind field information on IOD prediction was verified by adding the wind field signal to the entire time step of learning.

As a landmark work in AI prediction of ENSO, (Ham et al., 2019) applied transfer learning on the CMIP5 output and reanalysis products to train a CNN model to predict the Niño3.4 index and ENSO phase using SST and heat content anomalies. A valid prediction 17 months ahead of time was achieved, paving the way for AI applications using this method. Subsequently, they further improved the ability to identify different characteristics of seasonal variabilities by incorporating seasonal factors into climate data and training samples for all target seasons and leading forecast months simultaneously (Ham et al., 2021), which minimized the ‘spring predictability barrier’ problem: during the Northern Hemisphere spring (April-July), the self-perpetuation of ENSO development is weak, so forecasting ENSO development during this period is very difficult. Although the work of Ham achieved good results, the CNN model still has room for improvement. (Hu et al., 2021) incorporated residual ideas and dropout techniques into a CNN model and extended transfer learning for ENSO index and phase prediction. Not only can the Niño3.4 index be effectively predicted 20 months in advance, but the accuracy of the type prediction can also reach 83.3% 12 months ahead, which is much greater than the current best prediction accuracy (66.7%). (Gupta et al., 2022) and (Zhou P. et al., 2021) also made good progress in dealing with the barrier based on convLSTM and LSTM, respectively. However, these studies used only one deep learning model, which does not adapt well to different tasks. (Ye et al., 2021) proposed the MS-CNN framework for adaptively adjusting the CNN structure according to different time-prediction tasks, in which multi-model parallel prediction replaced the traditional single-model iterative process to avoid error accumulation and improve the reliability of long-term prediction.
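To make the prediction targets above concrete, the following minimal sketch shows how a Niño3.4 index value might be turned into an ENSO phase label. The Niño3.4 index is an area average of SST anomalies over 5°N–5°S, 170°W–120°W; the ±0.5 °C threshold is a commonly used convention assumed here rather than taken from the studies cited, and the unweighted mean is a simplification (an operational calculation would weight grid cells by area).

```python
def nino34_index(sst_anomalies):
    """Unweighted mean of SST anomalies (degC) from grid cells inside the Niño3.4 box."""
    return sum(sst_anomalies) / len(sst_anomalies)


def enso_phase(index, threshold=0.5):
    """Map an index value to one of the three ENSO phases (threshold is an assumption)."""
    if index >= threshold:
        return "El Niño"
    if index <= -threshold:
        return "La Niña"
    return "neutral"


# Hypothetical anomalies for a handful of grid cells, for illustration only.
phase = enso_phase(nino34_index([0.9, 1.1, 1.0]))
```

A regression model predicts the index itself, while a classification head predicts the phase label directly; the studies above report skill for both kinds of target.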

Considering the dynamical complexity of ENSO and its close correlation with the Walker circulation, a multivariate coupled model, ENSO-ASC (air–sea coupler), was proposed by (Mu et al., 2021). It not only extracts the multiscale spatial and temporal characteristics of multiple physical variables, but also includes two attention weights that account for the different air-sea coupling strengths in different starting calendar months and the different effects of these variables. The validity of Niño 3.4 index predictions over 18 months was demonstrated, as well as the positive effects of SST and zonal winds. Similarly, the prediction of the El Niño index and phase using spatiotemporal structures has also been reported in several studies [e.g. (Geng and Wang, 2021; Hashemi, 2021; Jonnalagadda and Hashemi, 2022)].

In addition, working on enhancing the ENSO-related target signal and reducing the ‘spring predictability barrier’ problem, (Zhou and Zhang, 2022) combined the POP (Principal Oscillation Pattern) analysis procedure with the CNN-LSTM algorithm in predicting the Niño-3.4 SST index. They enhanced the ENSO-correlated target signal with a POP-based preprocessing function while filtering out the uncorrelated noise. Similarly, by combining physical analysis methods, (Mu et al., 2020) proposed a multiscale Bayesian convolutional network which formulated ENSO downscaling as a multiscale probabilistic prediction problem and aggregated the outputs at all scale levels in the form of a joint distribution, which ensured good stability, validity, and scalability. In addition, the transformer, which can better focus on global features than the CNN model, has recently been introduced to ENSO prediction by (Ye et al., 2022), but relevant applications of this model in oceanography are still scarce and deserve in-depth study. Further attempts to apply the Transformer to ENSO prediction have followed. Zhang et al. predicted 3-D upper ocean temperature anomalies and wind stress anomalies eighteen months in advance (Zhou and Zhang, 2023), and comprehensively explored the reasons for the success of the predictions by considering wind and subsurface thermal forcing separately in the input predictor to improve interpretability (Gao et al., 2023). In addition, by coupling multimodal 3D fields, multivariate 3D fields can be predicted 20 months ahead through rolling prediction (Zhang et al., 2024).

Finally, the ability of deep learning to construct mappings between data is an important tool for building and exploring unknown relationships between different phenomena and different modalities. This capability has already been applied in ENSO research. Zhang et al. used a deep neural network structure based on U-Net technology to explore the influence of the SST-precipitation feedback process during the evolution of ENSO (A deep learning-based U-Net model for ENSO-related precipitation responses to sea surface temperature anomalies over the tropical Pacific, 2023), as well as to construct the relationship between SST anomalies and wind stress (tau) anomalies. These works also provide a new approach for ocean-atmosphere interaction modeling studies (Du and Zhang, 2024).

4 Future development trends

With the advent of the era of big data, the application practice of deep learning in recent years has proven that the large-scale integration of deep learning technology into the research of specific problems in various fields has become an inevitable trend. The development of remote sensing technology has led to the qualitative improvement of oceanography, and oceanographic research has been in a ‘remote sensing ocean era’ since the 1970s. Thus, this review argues that the development of deep learning will naturally lead to a new leap in oceanographic research. This does not mean that deep learning will replace traditional methods. As mentioned above, it is an auxiliary tool to help traditional methods improve their performance. In the following, the development perspectives of application scenarios and methods of deep learning are discussed.

4.1 Application

Deep learning is a tool that aids traditional research, for example by handling a variety of complex data more effectively. The current main application scenarios of deep learning will remain among the main application aspects in the future. They can be divided into direct data analysis, data secondary information extraction, data reconstruction and inversion, and data generation.

Data analysis is the direct application of deep learning to existing data for operations such as recognition and classification. For example, (Dong Z. et al., 2022) used a deep learning model to detect MEs directly from ocean satellite SSH images. The extraction of secondary information from the data is suitable for obtaining secondary data products with higher accuracy and for the subsequent analysis of the secondary products. The existing data are similar to ‘industrial raw materials,’ and data secondary information extraction uses deep learning to combine these ‘industrial raw materials’ into the ‘parts’ required for a specific subsequent job.

Data reconstruction and inversion are essentially interconnected. Deep learning is applied to learn the dynamic potential connections among different variables, so that the trained model captures their underlying relationship and can perform operations such as high-resolution reconstruction or inversion of unknown or missing data from part of the existing data. For example, the wide-format SWH structure can be detected by a deep neural network model. Specific applications, such as (Izumi et al., 2022) and (Barth et al., 2022), used deep learning to reconstruct the portions of data missing due to cloud cover. In particular, the method of (Zhang ZG. et al., 2020) for reconstructing the SSH for an entire basin using TG data from multiple coastal stations is worth extending to other areas.

Data generation addresses problems such as the unavailability or high cost of large-scale data and the imbalance in ocean datasets, where variables such as sea surface temperature are overrepresented while others, like subsurface temperature, have significantly less data available. A common approach is to use the GAN family of deep neural networks to generate data [e.g. (Izumi et al., 2022; Jamali and Mahdianpari, 2021)]. This not only remedies data defects and supports data augmentation but also saves significant human and material resources in data collection, and it will be an indispensable step in most future deep learning applications.

In summary, the above application scenarios are the areas that have previously drawn more attention and are currently more technically mature. In the future, researchers need to keep exploring and discovering more potential application directions.

4.2 Methods

In addition to exploring the application of deep learning in more domain directions, we summarize innovative applications of different methods, and the directions likely to produce innovative results in the near future, from three perspectives of the deep learning method itself: data pre-processing, model selection and training strategy.

4.2.1 Multidimensional, multiscale, and multimodal feature fusion

The idea behind deep-learning algorithms is to learn explicit and implicit connections in data for various purposes, such as prediction, recognition or classification. Therefore, the more comprehensive the data collected from different sources related to the target, the more accurate the results of the trained model will be. Owing to the high dynamics and complexity of the marine environment, any marine phenomenon is the result of interactions of multi-modal and multi-dimensional processes. Some studies use various methods to determine the correlation coefficients of each possibly relevant variable in the data processing stage and dynamically determine the characteristic variables used to train a model. For example, (Shao et al., 2021) used a series of methods, such as EOF analysis, to construct correlation coefficients between variables and spatial correlations of different sites when conducting sea surface data analysis in the South China Sea. In addition, related research can draw on the work of (Huang et al., 2022) to consider eddy identification from a 3D perspective. Nevertheless, how to properly extract 3D features and fuse them with 2D features is currently unresolved and is the most promising direction for related innovations to improve research performance.
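The correlation-based screening of candidate variables described above can be sketched as follows. This is a minimal illustration using the Pearson coefficient and a fixed cut-off; the function names, the 0.6 threshold and the sample series are assumptions, and the cited studies use richer procedures such as EOF analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant series carries no linear relationship
    return cov / (sx * sy)


def select_features(candidates, target, min_abs_r=0.6):
    """Keep candidate variables whose |r| with the target reaches the threshold."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return {name: r for name, r in scores.items() if abs(r) >= min_abs_r}


# Hypothetical series, for illustration only.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "wind": [2.0, 4.0, 6.0, 8.0, 10.0],
    "pressure": [5.0, 4.0, 3.0, 2.0, 1.0],
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],
}
selected = select_features(candidates, target)
```

Linear correlation only captures part of the dependence between ocean variables, so a screening step like this is best treated as a first filter before model training rather than a final choice of inputs.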

4.2.2 Transfer learning

The ability of deep learning to be rapidly and comprehensively deployed as a convenient tool for solving problems in various domains depends on the development of methods of transferability and pervasiveness. In this condition, time and computing resources, which are spent undertaking repeated training of similar model solutions based on different datasets, can be saved. Transfer learning shows great potential in this respect, as shown by (Ordonez et al., 2022) on otolith images from different laboratories. Referring to (Zheng Z. et al., 2020), training time can be reduced by training baseline features on similar large datasets already available and being fine-tuned for task-specific adaptation on a small number of target datasets. In addition, scaling from small sample source domains to large target domains can solve the problem of too few labels in the target dataset. In future, transfer learning methods that balance convenience and accuracy from different perspectives are worth exploring.
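The pretrain-then-fine-tune idea described above can be illustrated with a deliberately tiny stand-in. The sketch below uses a one-variable linear model in place of a pretrained CNN: the weight `w` plays the role of the transferred baseline features and is frozen during fine-tuning, while only the offset `b` is adapted on a small target dataset. All names and data are hypothetical, and this is not the procedure of any cited study.

```python
def train(xs, ys, w, b, lr=0.05, epochs=2000, freeze_w=False):
    """Gradient descent on mean squared error for the toy model y = w * x + b."""
    n = len(xs)
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        if not freeze_w:
            w -= lr * grad_w  # updated only when the transferred weight is not frozen
        b -= lr * grad_b
    return w, b


# Step 1: pretrain on a large, hypothetical source dataset (y = 2x + 1).
source_x = [i / 10 for i in range(50)]
source_y = [2.0 * x + 1.0 for x in source_x]
w_src, b_src = train(source_x, source_y, w=0.0, b=0.0)

# Step 2: fine-tune on a small, hypothetical target dataset (y = 2x + 3),
# reusing the pretrained weight and adapting only the offset.
target_x = [0.0, 1.0, 2.0]
target_y = [2.0 * x + 3.0 for x in target_x]
w_ft, b_ft = train(target_x, target_y, w=w_src, b=b_src, freeze_w=True)
```

Freezing the transferred parameters means the small target dataset only has to determine the few parameters left trainable, which is why fine-tuning can work with far fewer labeled samples than training from scratch.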

4.2.3 Unsupervised and semi-supervised

Owing to the reliance of deep learning on data labeling and the difficulty of labeling large datasets, future training of deep learning models will need to rely increasingly on unsupervised and semi-supervised approaches. Applying unlabeled or sparsely labeled data samples to deep learning to accomplish specific tasks through appropriate methods or frameworks is an appealing direction to develop. Data generation and transfer learning have solved such problems to some extent. However, other more suitable methods are still waiting to be discovered.

4.2.4 Model fusion

As an essential aspect of deep learning applications, model fusion strongly affects how well a task is completed. Most current applications use generalized models to solve specific problems directly. As practitioners, oceanographic researchers have little need to create new models to achieve better results; instead, they can fuse relevant models and methods or make targeted, specialized modifications to models to achieve innovation and improve results according to specific research tasks. Currently, it is popular to fuse or nest convolutional and recurrent neural networks, which can fully consider spatio-temporal characteristics, such as the CLTS-Net model proposed by (Li et al., 2021). Recent innovative works incorporate the idea of attention mechanisms or residuals [e.g. (Song et al., 2020; Dong Z. et al., 2022)].

4.2.5 Modularity

Breaking down professional barriers and making deep learning technology more widely and conveniently applicable by experts in various fields is a difficult hurdle that must be overcome in future developments. A feasible approach is to modularize deep learning algorithms: the task-based algorithm model is encapsulated in a ‘black box’ form, and a data interface and a tuning interface are provided for non-AI experts. In this way, they no longer need to fully understand the details of each neural network algorithm before applying deep learning techniques; instead, they can directly find the module that matches their tasks and adjust the required parameters appropriately to generate satisfactory solutions. AI experts are expected to be more involved in algorithm modularization in the future, so that modules can be applied in specific fields to frame workflows in which deep learning networks handle the assigned tasks as one of the components. This would improve the portability of deep learning, facilitate its extension to various industries to develop data processing capabilities and promote its development.
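A minimal sketch of what such a ‘black box’ module might look like is given below. The class name `ForecastModule`, its methods and the `window` parameter are hypothetical, and a simple moving average stands in for the encapsulated neural network; the point is only the separation between a data interface (`fit`, `predict`) and a tuning interface (`tune`).

```python
from dataclasses import dataclass, field


@dataclass
class ForecastModule:
    """Hypothetical black-box module: users see only data and tuning interfaces."""

    params: dict = field(default_factory=lambda: {"window": 3})

    def tune(self, **overrides):
        """Tuning interface: adjust only the parameters the module exposes."""
        unknown = set(overrides) - set(self.params)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        self.params.update(overrides)
        return self

    def fit(self, series):
        """Data interface: hand the module a time series."""
        self._series = list(series)
        return self

    def predict(self):
        """Return a one-step forecast (moving average as a placeholder model)."""
        recent = self._series[-self.params["window"]:]
        return sum(recent) / len(recent)


# Usage with a hypothetical series; the internals stay hidden from the caller.
module = ForecastModule().fit([1.0, 2.0, 3.0, 4.0])
default_forecast = module.predict()
tuned_forecast = module.tune(window=2).predict()
```

Because callers depend only on the two interfaces, the placeholder inside `predict` could be swapped for a trained network without changing the code that uses the module.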

4.2.6 Training strategies

In the final step of the deep learning application process, choosing a smart training strategy can improve results with less effort. For example, rolling prediction is currently used more frequently in prediction tasks. It uses the sliding window within the known variables to predict the post-window state and gradually incorporates the prediction results into the known variables to improve the prediction accuracy; however, it is worth noting that this has the potential to produce problems, such as overfitting and magnifying errors, which can be patched by physical dynamic constraints. Beyond simply cycling the predicted results, MTL can be performed to aid in prediction by cycling the variables of interest for real-time prediction into known variables. For example, (Politikos et al., 2021) used the prediction of fish length as an auxiliary task to estimate the fish age better. Selecting a proper training strategy should be task-oriented and is an open and potentially innovative aspect.
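The rolling prediction loop described above can be sketched as follows. The one-step model here is a hypothetical linear extrapolation standing in for a trained network, and the function names are illustrative; the structure of the loop, in which each prediction is appended to the known values and reused as input, is the part being shown.

```python
def rolling_forecast(history, one_step_model, window, steps):
    """Predict `steps` values ahead, feeding each prediction back as an input."""
    known = list(history)
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(known[-window:])  # slide the window forward
        forecasts.append(next_value)
        known.append(next_value)  # the prediction is now treated as if observed
    return forecasts


def linear_extrapolation(window_values):
    """Hypothetical stand-in for a trained network: extend the last difference."""
    return 2 * window_values[-1] - window_values[-2]


clean = rolling_forecast([1.0, 2.0, 3.0, 4.0], linear_extrapolation, window=3, steps=3)
# Perturbing only the last observation shows how an input error propagates.
noisy = rolling_forecast([1.0, 2.0, 3.0, 4.1], linear_extrapolation, window=3, steps=3)
errors = [abs(n - c) for n, c in zip(noisy, clean)]
```

In this toy case the effect of a single 0.1 perturbation grows with every step, which illustrates the error magnification noted above and why physical constraints or direct multi-step models are used to limit it.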

4.3 Integration with numerical methods

As the primary traditional approach in oceanographic research, the development of numerical models enables scientists to better understand and predict the behavior of natural systems, providing crucial support for forecasting and simulating the dynamic changes of atmospheric, oceanic, and terrestrial systems. While the accuracy and efficiency of numerical models are influenced by various factors, including model resolution, parameterization schemes, accuracy of initial conditions, and boundary conditions, their simulation of various physical processes and phenomena is reliable, and the simulation process described by complex physical equations and mathematical methods is clear. Therefore, the main direction of future deep learning applications lies in appropriately integrating the accuracy of deep learning with the reliability and interpretability of numerical models.

Currently, there are two main application scenarios. Firstly, deep learning can be nested with numerical models to form hybrid models to improve data processing accuracy. For example, (Xiao et al., 2019) combined LSTM with the AdaBoost ensemble learning model to address the overfitting problem in LSTM. Most typically, (Ma et al., 2021) combined the numerical weather forecasting model WRF with a deep learning model for significant wave height prediction, which extracts features from historical datasets and at the same time considers the geographic and meteorological factors represented in the WRF model, effectively suppressing the randomness and instability of waves and improving the prediction accuracy. Secondly, deep learning can be used in scenarios of pattern recognition and feature extraction to extract useful features from large-scale observational data or identify complex spatial and temporal patterns. A common approach is to use EOF analysis to extract different structural features of the data, thereby improving model performance. (Zhou SY. et al., 2021) coupled the EOF and LSTM networks to solve the problem of poor accuracy when predicting significant wave heights.

Furthermore, there are many foreseeable application directions, some of which researchers have already begun to explore:

1. Model Parameterization and Physical Process Modeling: Deep learning can be used to optimize parameterization schemes of numerical models (Zhu et al., 2022), such as adjusting parameter values in parameterization schemes or adjusting parameters in physical equations. Alternatively, deep learning can provide more accurate and reliable parameterization schemes by learning from large amounts of observational data, thereby improving model performance. Physics-informed neural networks (PINNs), which are constrained to obey the laws of physics described by the PDE when solving supervised learning tasks (Dong C. et al., 2022), are also an important current direction.

2. Model Defect Correction: Traditional numerical models may have some systematic biases or errors, which may be caused by model simplifications or incomplete physical descriptions. Deep learning can correct these errors by learning the differences between observational data and model outputs, thereby improving the accuracy of the model.

3. Data Assimilation: Data assimilation combines observational data with numerical model outputs to provide more accurate model state estimates. Deep learning can be used to design more effective data assimilation methods, such as handling spatiotemporal correlations through Recurrent Neural Networks (RNN) or Convolutional Neural Networks (CNN), and improving the model’s adaptability to observational data.

4. Model Acceleration and Optimization: Deep learning can be adopted to accelerate the speed of numerical models and optimize computational processes. For example, deep learning methods can be used to design more efficient numerical algorithms or reduce the computational load of models, thereby improving model performance and efficiency.

5. Uncertainty Modeling: Deep learning can model and handle uncertainty in numerical models, such as generating multiple possible prediction results through Generative Adversarial Networks (GAN), or estimating the posterior distribution of parameters through Monte Carlo methods.
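As a concrete illustration of item 2 above (model defect correction), the sketch below learns a correction from paired model outputs and observations and applies it to a new model output. A closed-form linear least-squares fit is used here in place of a deep network purely to keep the example short; the function names and data are hypothetical.

```python
def fit_linear_correction(model_out, observed):
    """Least-squares fit of observed ≈ slope * model_out + intercept."""
    n = len(model_out)
    mx, my = sum(model_out) / n, sum(observed) / n
    sxx = sum((m - mx) ** 2 for m in model_out)
    if sxx == 0.0:
        raise ValueError("model outputs are constant; slope is undefined")
    sxy = sum((m - mx) * (o - my) for m, o in zip(model_out, observed))
    slope = sxy / sxx
    return slope, my - slope * mx


def correct(value, slope, intercept):
    """Apply the learned correction to a raw numerical-model output."""
    return slope * value + intercept


# Hypothetical wave heights (m): the numerical model is biased low by 0.5 m.
simulated = [1.0, 2.0, 3.0, 4.0]
measured = [1.5, 2.5, 3.5, 4.5]
slope, intercept = fit_linear_correction(simulated, measured)
corrected = correct(5.0, slope, intercept)
```

A neural network plays the same role when the model error depends nonlinearly on the state or on several variables at once, which a single slope and intercept cannot represent.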

The interpretability of neural networks appears to be an important factor influencing oceanographers’ acceptance of artificial intelligence methods. Proper integration with numerical models can help scientists better understand physical mechanisms and their relative effects, improve the predictive capability of numerical models, and address some challenges and difficulties in numerical modeling. Additionally, some studies incorporate experts’ experience and knowledge into the model training and construction process, artificially assigning weights to features or filtering results, which can be considered a good approach to enhance accuracy and interpretability, as shown in (Conradt et al., 2022).

5 Summary

Although deep learning technology has been rapidly applied in the field of oceanography, including physical oceanography, there are still many issues to be addressed. Factors such as multidimensional fusion, integration with physics, complexity of features, and data imbalance are constraining the deeper application of deep learning. At the same time, the majority of existing research remains at the application level, failing to achieve a qualitative improvement in advancing oceanographic research. Given the vast unknown areas of the ocean, the future trend lies in leveraging the advantages of deep learning to promote scientific discoveries and propose new scientific questions.

This work discussed the background and necessity of deep learning applications in physical oceanography. After introducing the history of deep neural networks, this review introduced the three main classes of deep learning models and their main application scenarios in a black box format from the application perspective, avoiding the initial hindrance for oceanographers to use deep learning, that is, the difficulty in understanding the details of the models. To provide some comprehensive and cutting-edge references for all oceanographers interested in deep learning techniques, the latest applications and innovative cases of deep learning techniques in various fields of oceanography were reviewed in detail mainly by examining recent studies published in the last 3 years. Moreover, some promising directions for future applications and innovations were introduced for oceanographers from both application tasks and deep-learning perspectives. We look forward to the promotion and popularization of deep learning technology in the oceanography field and more discoveries about the ocean. From the perspective of AI researchers, we hope to obtain an increasing amount of application feedback to improve and innovate deep learning models. We believe that deep learning will help the oceanographic research field to achieve a new leap forward and embark on a more intelligent and rapid development stage within the next decade.

Author contributions

QZ: Conceptualization, Investigation, Writing – original draft. SP: Funding acquisition, Methodology, Writing – review & editing. JW: Conceptualization, Methodology, Supervision, Writing – review & editing. SL: Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing, Writing – original draft. ZH: Methodology, Validation, Writing – review & editing. GZ: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was jointly supported by the Major Projects of National Natural Science Foundation of China (U20A20105), the GuangDong Basic and Applied Basic Research Foundation (Grant no. 2022A1515240081 and 2021B1212050023), the Guangdong Special Support Program (2019BT2H594), the special fund of South China Sea Institute of Oceanology of the Chinese Academy of Sciences (Grant no. SCSIO2023QY01), National Key Research and Development Program of China (Grant No.2022YFC3105000), the National Key Research and Development Program of China under Grant No. 2018AAA0100400, HY Project under Grant No. LZY2022033004, the Natural Science Foundation of Shandong Province under Grants No. ZR2020MF131 and No. ZR2021ZD19, the Science and Technology Program of Qingdao under Grant No. 21-1-4-ny-19-nsh, and Project of Associative Training of Ocean University of China under Grant No. 202265007.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Alsharay N. M., Chen Y. Z., Dobre O. A., De Silva O. (2022). Improved sea-ice identification using semantic segmentation with raindrop removal. IEEE Access 10, 21599–21607.

Andersson T. R., Hosking J. S., Pérez-Ortiz M., Paige B., Elliott A., Russell C., et al. (2021). Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nat. Commun. 12, 5124.

Aparna S. G., D'Souza S., Arjun N. B. (2018). Prediction of daily sea surface temperature using artificial neural networks. Int. J. Remote Sens. 39, 4214–4231.

Bai G., Wang Z. F., Zhu X. Y., Feng Y. Q. (2022). Development of a 2-D deep learning regional wave field forecast model based on convolutional neural network and the application in South China Sea. Appl. Ocean Res. 118, 14, 103012. doi: 10.1016/j.apor.2021.103012
