
ORIGINAL RESEARCH article

Front. Robot. AI, 14 February 2025

Sec. Robot Learning and Evolution

Volume 11 - 2024 | https://doi.org/10.3389/frobt.2024.1491907

This article is part of the Research Topic "Advancements in Neural Learning Control for Enhanced Multi-Robot Coordination".

Adaptive formation learning control for cooperative AUVs under complete uncertainty

  • 1Department of Mechanical, Industrial and Systems Engineering, University of Rhode Island, Kingston, RI, United States
  • 2Graduate School of Oceanography, University of Rhode Island, Kingston, RI, United States
  • 3Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, Kingston, RI, United States

Introduction: This paper addresses the critical need for adaptive formation control of Autonomous Underwater Vehicles (AUVs) without requiring knowledge of the system dynamics or environmental data. Current methods often assume partial knowledge of the dynamics, such as a known mass matrix, which limits adaptability across settings.

Methods: Our proposed two-layer framework treats all system dynamics, including the mass matrix, as entirely unknown, achieving configuration-agnostic control applicable to a wide range of underwater scenarios. The first layer features a cooperative estimator for inter-agent communication that is independent of global data, while the second employs a decentralized deterministic learning (DDL) controller that uses local feedback for precise trajectory control. The framework's radial basis function neural networks (RBFNN) store learned dynamic information, eliminating the need for relearning after system restarts.

Results: This robust approach addresses uncertainties from unknown parametric values and unmodeled interactions internally, as well as external disturbances such as varying water currents and pressures, enhancing adaptability across diverse environments.

Discussion: Comprehensive and rigorous mathematical proofs are provided to confirm the stability of the proposed controller, while simulation results validate each agent’s control accuracy and signal boundedness, confirming the framework’s stability and resilience in complex scenarios.

1 Introduction

Robotics and autonomous systems have a wide range of applications, spanning from manufacturing and surgical procedures to exploration in challenging environments (Ghafoori et al., 2024; Jandaghi et al., 2023). However, controlling robots in such settings, especially in space and underwater, presents significant difficulties due to unpredictable dynamics. In the context of underwater exploration, AUVs have become essential tools, offering cost-effective, reliable, and versatile solutions for adapting to dynamic conditions. Effective use of AUVs is critical for unlocking the mysteries of marine environments, making advancements in their control and operation essential. As the demand for efficient underwater exploration increases and the complexity of tasks assigned to AUVs grows, there is a pressing need to enhance their operational capabilities. This includes developing sophisticated formation control strategies that allow multiple AUVs to operate in coordination, drawing inspiration from natural behaviors observed in fish schools and bird flocks (Zhou et al., 2023; Yang et al., 2021). By leveraging multi-agent systems, AUVs can work in coordinated groups, enhancing efficiency, stability, and coverage while navigating dynamic and complex underwater environments. These strategies are essential for ensuring precise operations in varied underwater tasks, ranging from pipeline inspections and seafloor mapping to environmental monitoring (Yan et al., 2023).

Despite challenges from intricate nonlinear dynamics, complex interactions among AUVs, and the uncertain dynamic nature of underwater environments, effective multi-AUV formation control is increasingly critical in modern ocean industries (Yan et al., 2018; Hou and Cheah, 2009). Historically, formation control research has predominantly utilized the behavioral approach (Balch and Arkin, 1998; Lawton, 2000), which divides the overall control design into subproblems, with each vehicle’s action determined by a weighted average of solutions, though selecting appropriate weighting parameters can be challenging. The leader-following approach (Cui et al., 2010; Rout and Subudhi, 2016) designates one vehicle as the leader while others follow, maintaining predefined geometric relationships, and controlling formation behavior by designing specific motions for the leader. Alternatively, the virtual structure approach (Millán et al., 2013) treats the entire formation as a single rigid body, with each vehicle tracking the motion of its assigned point in that structure.

Despite advancements in formation control and path planning for multi-AUV systems, challenges such as environmental disturbances, complex underwater dynamics, and communication limitations continue to pose difficulties (Hadi et al., 2021). To address these challenges, there is a critical need for controllers that are independent of both robot dynamics and environmental disturbances. Developing such controllers would enhance formation control by allowing for decentralized application, which increases flexibility in formation structures and improves robustness against communication constraints. Addressing these gaps is essential for advancing the capabilities and reliability of multi-AUV systems. On the other hand, communication constraints in underwater environments make decentralized control with a virtual leader-following topology ideal for AUVs, enabling coordination using local information despite communication delays or interruptions (Yan et al., 2023).

Reinforcement learning (RL) has also been extensively applied in robotic control (Christen et al., 2021; Cao et al., 2022). RL approaches, such as deep reinforcement learning (DRL), offer advantages in learning complex, nonlinear control policies directly from data. However, RL methods generally lack mathematical stability proofs and guarantees for the controller’s behavior, making it challenging to ensure safety and reliability, especially in critical applications. Moreover, while Zhang et al. (2018) developed direct neural adaptive laws whose oscillations grow with higher adaptation gains, indirect neural adaptive laws using prediction-error methods were proposed to mitigate this issue, though they could not guarantee parameter convergence. In contrast, NN-based learning control methods, such as those utilizing adaptive neural networks or deterministic learning frameworks Jandaghi et al. (2024), can incorporate stability analysis and provide rigorous mathematical proofs for parameter convergence. These methods enable researchers to establish theoretical guarantees for the stability and robustness of the controller, which is essential for deploying controllers in real-world applications where safety and reliability are critical. Most recently, Tutsoy et al. (2024) proposed an optimization-based approach for path planning in Unmanned Air Vehicles (UAVs) with actuator failures using particle swarm optimization and genetic algorithms. Their method minimizes both time and distance by optimizing predefined cost functions through heuristic methods, while incorporating system constraints such as actuator limits, kinematic and dynamic constraints, and parametric uncertainties.

Despite extensive literature in the field, to the best of our knowledge, existing research assumes homogeneous dynamics and known system parameters for all AUV agents, which is unrealistic in unpredictable underwater environments. Factors such as buoyancy, drag, and varying water viscosity significantly alter system dynamics and behavior. Additionally, AUVs may change shape during tasks like underwater sampling or when equipped with robotic arms, further complicating control. Typically, designing multi-AUV formation control involves planning desired formation paths and developing tracking controllers for each AUV. However, accurately tracking these paths is challenging due to the complex nonlinear dynamics of AUVs, especially when precise models are unavailable. Implementing a fully distributed and decentralized formation control system is also difficult, as centralized control designs become exceedingly complex with larger AUV groups. To address these challenges, previous work, such as Yuan et al. (2017) and Dong et al. (2019), developed adaptive learning controllers that relied on the assumption of a known mass matrix, which is not practical in real-world applications. Such controllers depend on known system parameters and can fail when internal forces vary with changing external environmental conditions. The solution is to develop environment-independent controllers that do not rely on any specific system dynamical parameters.

The framework’s control architecture is divided into a first-layer Cooperative Estimator Observer and a second-layer Decentralized Deterministic Learning (DDL) Controller. The first-layer observer is pivotal in enhancing inter-agent communication by sharing crucial system estimates, operating independently of any global information. Concurrently, the second-layer DDL controller utilizes local feedback to finely adjust each AUV’s trajectory, ensuring resilient operation under dynamic conditions heavily influenced by hydrodynamic forces and torques, while treating the system uncertainty as completely unknown. This dual-layer setup not only facilitates rapid adaptation to uncertain AUV dynamics but also leverages RBFNN for precise local learning and effective knowledge storage. Such capabilities enable AUVs to efficiently reapply previously learned dynamics after the system restarts. This two-layer framework achieves a significant advancement by considering all system dynamic parameters as unknown, enabling universal application across all AUVs, regardless of their operating environments. This universality is crucial for adapting to environmental variations such as water flow, which increases the AUV’s effective mass via the added-mass phenomenon and affects the vehicle’s inertia. Additionally, buoyancy forces that vary with depth, along with hydrodynamic forces and torques stemming from water flow variations, the AUV’s unique shape, its appendages, and drag forces due to water viscosity, significantly impact the damping matrix in the AUV’s dynamics. This framework not only improves operational efficiency but also significantly advances the field of autonomous underwater vehicle control by laying a robust foundation for future enhancements in distributed adaptive control systems and fostering enhanced collaborative intelligence among multi-agent networks in marine environments.
Extensive simulations have underscored the effectiveness of the framework, demonstrating its potential to elevate the adaptability and resilience of AUV systems under the most demanding conditions. In summary, the contributions of this paper are as follows:

• A universal controller that works in any environment and condition, such as varying currents or operating depths.

• Each AUV controller operates independently.

• The controller functions without needing information about the robot’s dynamic parameters, like mass, damping, or inertia. Each AUV can also have different dynamic parameters.

• The system learns the dynamics once and reuses the pre-trained weights, avoiding the need for retraining.

• The use of localized RBFNN reduces real-time computational demands.

• Rigorous stability analysis of the controller, with mathematical proofs that guarantee its reliability.

The rest of the paper is organized as follows: Section 2 provides an initial overview of graph theory, RBFNN, and the problem statement. The design of the distributed cooperative estimator and the decentralized deterministic learning controller are discussed in Section 3. The formation adaptive control and formation control using pre-learned dynamics are explored in Section 4 and Section 5, respectively. Simulation studies are presented in Section 6, and Section 7 concludes the paper.

2 Preliminaries and problem statement

2.1 Notation and graph theory

Denoting the set of real numbers as ℝ, we define ℝ^{m×n} as the set of m×n real matrices and ℝ^n as the set of n×1 real vectors. The identity matrix is symbolized as I. The vector with all elements equal to 1 in an n-dimensional space is represented as 1_n. The sets S^n and S^n_+ stand for the real symmetric n×n matrices and the positive definite ones, respectively. A block diagonal matrix with matrices X_1, X_2, …, X_p on its main diagonal is denoted by diag{X_1, X_2, …, X_p}. A ⊗ B signifies the Kronecker product of matrices A and B. For a matrix A, vec(A) is the vectorization of A obtained by stacking its columns on top of each other. For a series of column vectors x_1, …, x_n, col{x_1, …, x_n} represents the column vector formed by stacking them together. Given two integers k_1 and k_2 with k_1 < k_2, I[k_1, k_2] = {k_1, k_1+1, …, k_2}. For a vector x ∈ ℝ^n, its norm is defined as ‖x‖ ≜ (x^T x)^{1/2}. For a square matrix A, λ_i(A) denotes its i-th eigenvalue, while λ_min(A) and λ_max(A) represent its minimum and maximum eigenvalues, respectively.

A directed graph G = (V, E) comprises nodes in the set V = {1, 2, …, N} and edges in E ⊆ V × V. An edge from node i to node j is represented as (i, j), with i as the parent node and j as the child node. Node i is also termed a neighbor of node j. N_i denotes the subset of V consisting of the neighbors of node i. A sequence of edges in G, (i_1, i_2), (i_2, i_3), …, (i_k, i_{k+1}), is called a path from node i_1 to node i_{k+1}; node i_{k+1} is then reachable from node i_1. A directed tree is a graph in which each node, except for a root node, has exactly one parent; the root node can reach all other nodes. A directed graph G contains a directed spanning tree if at least one node can reach all other nodes. The weighted adjacency matrix of G is a non-negative matrix A = [a_ij] ∈ ℝ^{N×N}, where a_ii = 0 and a_ij > 0 if and only if (j, i) ∈ E. The Laplacian of G is denoted as L = [l_ij] ∈ ℝ^{N×N}, where l_ii = Σ_{j=1}^N a_ij and l_ij = −a_ij for i ≠ j. It is established that L has at least one eigenvalue at the origin, and all nonzero eigenvalues of L have positive real parts. From Ren and Beard (2005), L has exactly one zero eigenvalue, with all remaining eigenvalues having positive real parts, if and only if G has a directed spanning tree.
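These Laplacian properties can be checked numerically. The sketch below (an illustrative four-node chain with unit weights, chosen by us and not taken from the paper) builds L and verifies that, with a directed spanning tree, it has exactly one zero eigenvalue and the rest in the open right-half plane:

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian with l_ii = sum_j a_ij and l_ij = -a_ij (i != j)."""
    return np.diag(A.sum(axis=1)) - A

# Illustrative directed chain 1 -> 2 -> 3 -> 4; entry A[i, j] is the
# weight a_ij of edge (j, i), so node 1 can reach every other node.
A = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
eigs = np.linalg.eigvals(laplacian(A))
n_zero = int(np.sum(np.isclose(eigs, 0)))
# Spanning tree present: exactly one zero eigenvalue, all others with
# positive real parts (Ren and Beard, 2005).
```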

2.2 Radial basis function neural networks (RBFNN)

An RBF neural network can be described as f_nn(Z) = Σ_{i=1}^{N} w_i s_i(Z) = W^T S(Z), where Z ∈ Ω_Z ⊂ ℝ^q is the input vector and W = [w_1, …, w_N]^T ∈ ℝ^N is the weight vector (Park and Sandberg, 1991). N indicates the number of NN nodes, and S(Z) = [s_1(‖Z − μ_1‖), …, s_N(‖Z − μ_N‖)]^T, where s_i(·) is a radial basis function and μ_i (i = 1, …, N) are distinct points in the state space. The Gaussian function s_i(‖Z − μ_i‖) = exp(−(Z − μ_i)^T(Z − μ_i)/η_i²) is generally used as the radial basis function, where μ_i ∈ ℝ^q is the center and η_i is the width of the receptive field. The Gaussian function is a localized radial basis function in the sense that s_i(‖Z − μ_i‖) → 0 as ‖Z − μ_i‖ → ∞. Moreover, for any bounded trajectory Z(t) within the compact set Ω_Z, f(Z) can be approximated using a limited number of neurons located in a local region along the trajectory: f(Z) = W_ζ*^T S_ζ(Z) + ε_ζ, where ζ denotes the indices of the active RBFNN nodes, i.e., those with |s_{j_i}(Z)| > ι for the current state Z(t). Here ε_ζ is the approximation error, with ε_ζ = O(ε) = O(ε*), S_ζ(Z) = [s_{j_1}(Z), …, s_{j_ζ}(Z)]^T ∈ ℝ^{N_ζ}, W_ζ* = [w*_{j_1}, …, w*_{j_ζ}]^T ∈ ℝ^{N_ζ}, N_ζ < N, and the integers j_1, …, j_ζ are defined by |s_{j_i}(Z_p)| > ι (ι > 0 a small positive constant) for some point Z_p on Z(t). This holds as long as ‖Z(t) − ξ_{j_i}‖ < ε for t > 0. The following lemma regarding the persistent excitation (PE) condition for RBFNN is recalled from Wang and Hill (2018).
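The localized approximation property can be illustrated with a small numerical sketch (a hypothetical 1-D example; the lattice of centers, width, and target function are our choices, not the paper's). Weights are fit by least squares, and only nodes near the current input are significantly active:

```python
import numpy as np

def rbf_eval(Z, centers, eta, W):
    """f_nn(Z) = W^T S(Z) with Gaussian RBFs s_i = exp(-||Z - mu_i||^2 / eta^2)."""
    d2 = np.sum((centers - Z) ** 2, axis=1)   # squared distances ||Z - mu_i||^2
    S = np.exp(-d2 / eta ** 2)                # regressor vector S(Z)
    return W @ S, S

# Centers on a regular lattice covering [0, 1] (q = 1), width eta = 0.2.
centers = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
eta = 0.2
zs = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
Phi = np.exp(-((zs - centers.T) ** 2) / eta ** 2)       # design matrix
W, *_ = np.linalg.lstsq(Phi, np.sin(2 * np.pi * zs).ravel(), rcond=None)

f_hat, S = rbf_eval(np.array([0.25]), centers, eta, W)
# Localization: the node centered at mu = 1.0 is essentially inactive
# when evaluated at Z = 0.25.
```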

Lemma 1. Consider any continuous recurrent trajectory Z(t): [0, ∞) → ℝ^q that remains in a bounded compact set Ω_Z ⊂ ℝ^q. Then, for an RBFNN W^T S(Z) with centers placed on a regular lattice (large enough to cover the compact set Ω_Z), the regressor subvector S_ζ(Z), consisting of RBFs with centers located in a small neighborhood of Z(t), is persistently exciting.

2.3 Problem statement

A multi-agent system comprising N AUVs with heterogeneous nonlinear uncertain dynamics is considered. The dynamics of each AUV can be expressed as Fossen (1999):

η̇_i = J_i(η_i) ν_i,
M_i ν̇_i + C_i(ν_i) ν_i + D_i(ν_i) ν_i + g_i(η_i) + Δ_i(χ_i) = τ_i.  (1)

In this study, the subscript i ∈ I[1, N] identifies each AUV within the multi-agent system. For every i ∈ I[1, N], the vector η_i = [x_i, y_i, ψ_i]^T ∈ ℝ³ represents the i-th AUV’s position coordinates and heading in the global coordinate frame, while ν_i = [u_i, v_i, r_i]^T ∈ ℝ³ denotes its linear velocities and angular rate of heading relative to a body-fixed frame. The positive definite inertia matrix M_i = M_i^T ∈ S³₊, Coriolis and centripetal matrix C_i(ν_i) ∈ ℝ^{3×3}, and damping matrix D_i(ν_i) ∈ ℝ^{3×3} characterize the AUV’s dynamic response to motion. The vector g_i(η_i) ∈ ℝ³ accounts for the restoring forces and moments due to gravity and buoyancy. The term Δ_i(χ_i) ∈ ℝ³, with χ_i ≜ col{η_i, ν_i}, describes the vector of generalized deterministic unmodeled uncertain dynamics for each AUV.

The vector τiR3 represents the control inputs for each AUV. The associated rotation matrix Ji(ηi) is given by:

J_i(η_i) = [ cos ψ_i   −sin ψ_i   0
             sin ψ_i    cos ψ_i   0
               0          0       1 ],

Unlike previous work Yuan et al. (2017), which assumed known values for the AUV’s inertia and rotation matrices, this study considers all matrix coefficients, including C_i(ν_i), D_i(ν_i), g_i(η_i), and Δ_i(χ_i), as well as the inertia matrix M_i, as completely unknown. The adaptive estimation process inherently addresses the effects of external forces and disturbances on the system dynamics. This eliminates the need for explicit parameter estimation of these forces, as disturbances like water flow, varying currents, or depth variations are directly incorporated into the control input through adaptive estimation. This makes the controller universally applicable to any AUV, regardless of its design, weight, or environmental conditions, by addressing both internal and external dynamic variations at the same time.

Internally, it handles unknown parameters such as mass and damping coefficients, as well as unmodeled nonlinear interactions and couplings. Externally, it accounts for unpredictable disturbances, including fluctuating water currents, depth-dependent pressures, and changes in hydrodynamic forces.

By avoiding reliance on predefined models, the proposed approach is robust and adaptable to diverse mission scenarios and unexpected environmental changes, ensuring reliable performance even in highly uncertain conditions.
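To make the model concrete, here is a minimal numerical sketch of the horizontal-plane dynamics (Equation 1). The diagonal M_i and D_i values are illustrative assumptions, and C_i, g_i, and Δ_i are omitted for brevity; in the proposed framework every one of these matrices is unknown to the controller:

```python
import numpy as np

def J(eta):
    """Rotation matrix J_i(eta_i) for eta = [x, y, psi]."""
    c, s = np.cos(eta[2]), np.sin(eta[2])
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def auv_step(eta, nu, tau, dt, M, D):
    """One Euler step of Equation 1, keeping only the inertia M and a
    linear damping D (C, g, Delta dropped in this sketch)."""
    eta = eta + dt * (J(eta) @ nu)                    # kinematics
    nu = nu + dt * np.linalg.solve(M, tau - D @ nu)   # dynamics
    return eta, nu

M = np.diag([25.0, 25.0, 5.0])    # assumed inertia values
D = np.diag([10.0, 10.0, 2.0])    # assumed damping values
eta, nu = np.zeros(3), np.zeros(3)
for _ in range(1000):             # 10 s of constant surge thrust
    eta, nu = auv_step(eta, nu, np.array([10.0, 0.0, 0.0]), 0.01, M, D)
# Surge speed settles near tau_u / d_u = 1 m/s.
```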

In the context of leader-following formation tracking control, the following virtual leader dynamics generates the tracking reference signals:

χ̇_0 = A_0 χ_0,  (2)

with “0” marking the leader node, the leader state is χ_0 ≜ col{η_0, ν_0} with η_0 ∈ ℝ³ and ν_0 ∈ ℝ³, and A_0 ∈ ℝ^{6×6} is a constant matrix available only to the leader’s neighboring AUV agents.
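A leader matrix of this kind, with imaginary-axis eigenvalues, produces bounded, non-decaying sinusoidal references. A minimal sketch of one harmonic block of such an A_0 (the frequency is an assumed example; the full A_0 is 6×6):

```python
import numpy as np

omega = 0.5                          # assumed reference frequency (rad/s)
A0_block = np.array([[0.0, -omega],
                     [omega, 0.0]])  # one harmonic block of A0
eigs = np.linalg.eigvals(A0_block)
# Eigenvalues are +/- j*omega, purely imaginary, so chi_0(t) is a
# bounded sinusoid rather than a decaying or diverging signal.
```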

Considering the system dynamics of the multiple AUVs (Equation 1) along with the leader dynamics (Equation 2), we establish a non-negative matrix A = [a_ij], i, j ∈ I[0, N], such that for each i ∈ I[1, N], a_i0 > 0 if and only if agent i has access to the reference signals η_0 and ν_0. All remaining elements of A are arbitrary non-negative values, with a_ii = 0 for all i. Correspondingly, we establish G = (V, E) as a directed graph derived from A, where V = {0, 1, …, N} designates node 0 as the leader, and the remaining nodes correspond to the N AUV agents. We proceed under the following assumptions:

Assumption 1. All the eigenvalues of A0 in the leader’s dynamics (Equation 2) are located on the imaginary axis.

Assumption 2. The directed graph G contains a directed spanning tree with the node 0 as its root.

Assumption 1 is crucial for ensuring that the leader dynamics produce stable, meaningful reference trajectories for formation control. It ensures that all states of the leader, represented by χ_0 = col{η_0, ν_0}, remain within a compact set Ω_0 ⊂ ℝ⁶ for all t ≥ 0. The trajectory of the system starting from χ_0(0), denoted by φ_0(χ_0(0)), is periodic. This periodicity is essential for maintaining the persistent excitation (PE) condition, which is pivotal for achieving parameter convergence in distributed adaptive (DA) control systems. Modifications to the eigenvalue constraints on A_0 in Assumption 1 may be considered when focusing primarily on formation tracking control performance, as discussed later.

Additionally, Assumption 2 reveals key insights into the structure of the Laplacian matrix L of the network graph G. Let Ψ be an N×N non-negative diagonal matrix where each i-th diagonal element is ai0 for iI[1,N]. The Laplacian L is formulated as:

L = [ Σ_{j=1}^N a_0j   −[a_01, …, a_0N]
         −Ψ 1_N               H          ],

where a_0j > 0 if (j, 0) ∈ E and a_0j = 0 otherwise. This yields H 1_N = Ψ 1_N, since L 1_{N+1} = 0. As cited in Su and Huang (2011), under Assumption 2 all eigenvalues of H have positive real parts, confirming that H is nonsingular.
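This property of H is easy to verify numerically. The sketch below uses an illustrative three-node graph (leader 0 → agent 1 → agent 2, a spanning tree rooted at node 0), extracts H as the follower block of the Laplacian, and confirms its eigenvalues lie in the open right-half plane:

```python
import numpy as np

# Illustrative chain over nodes {0, 1, 2}; A[i, j] = a_ij is the weight
# of edge (j, i), so agent 1 hears the leader and agent 2 hears agent 1.
A = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
L_full = np.diag(A.sum(axis=1)) - A   # (N+1) x (N+1) Laplacian
H = L_full[1:, 1:]                    # follower-to-follower block
eigs_H = np.linalg.eigvals(H)
# Under Assumption 2 every eigenvalue of H has a positive real part,
# so H is nonsingular.
```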

Problem 1. In the context of a multi-AUV system (Equation 1) integrated with virtual leader dynamics (Equation 2) and operating within a directed network topology G, the aim is to develop a distributed NN learning control protocol that leverages only local information. The specific goals are twofold:

1) Formation Control: Each of the N AUV agents will adhere to a predetermined formation pattern relative to the leader, maintaining a specified distance from the leader’s position η0.

2) Decentralized Learning: The nonlinear uncertain dynamics of each AUV will be identified and learned autonomously during the formation control process. The insights gained from this learning process will be utilized to enhance the stability and performance of the formation control system.

Remark 1. The leader dynamics described in Equation 2 are designed as a neutrally stable LTI system. This design choice facilitates the generation of sinusoidal reference trajectories at various frequencies, which is essential for effective formation tracking control. This approach to leader dynamics is prevalent in the literature on multi-agent leader-following distributed control systems, such as Yuan (2017) and Jandaghi et al. (2024).

Remark 2. It is important to emphasize that the formulation assumes formation control is required only within the horizontal plane, suitable for AUVs operating at a constant depth, and that the vertical dynamics of the 6 degrees of freedom (DOF) AUV system, as detailed in Prestero (2001), are entirely decoupled from the horizontal dynamics.

As shown in Figure 1, a two-layer hierarchical design approach is proposed to address the aforementioned challenges. The first layer, the Cooperative Estimator, enables information exchange among neighboring agents. The second layer, known as the Decentralized Deterministic Learning (DDL) controller, processes only local data from each individual AUV. The development and formulation of the first layer are discussed in detail in Section 3.1, while the DDL control strategy, along with its corresponding controller design and analysis, is provided in Section 3.2.


Figure 1. Proposed two-layer distributed controller architecture for each AUV.

3 Two-layer distributed controller architecture

3.1 First layer: cooperative estimator

In the context of leader-following formation control, not all AUV agents may have direct access to the leader’s information, including tracking reference signals (χ0) and the system matrix (A0). This necessitates collaborative interactions among the AUV agents to estimate the leader’s information effectively. Drawing on principles from multiagent consensus and graph theories Ren and Beard (2008), we propose to develop a distributed adaptive observer for the AUV systems as:

χ̂̇_i(t) = Â_i(t) χ̂_i(t) + β_i1 Σ_{j=0}^{N} a_ij (χ̂_j(t) − χ̂_i(t)),  i ∈ I[1, N],  (3)

where χ̂_0(t) ≜ χ_0(t).

The observer state of the i-th AUV, denoted χ̂_i = col{η̂_i, ν̂_i} ∈ ℝ⁶, aims to estimate the leader’s state χ_0 = col{η_0, ν_0} ∈ ℝ⁶. As t → ∞, these estimates are expected to converge, such that η̂_i approaches η_0 and ν̂_i approaches ν_0, the leader’s position and velocity, respectively. Equation 3 accounts for the communication graph by including the adjacency matrix information through the term a_ij. Note that Â_i(t) ∈ ℝ^{6×6} represents an estimate computed by agent i of the leader’s system matrix A_0 ∈ ℝ^{6×6}, which is likewise not available to all agents. Therefore, each agent estimates this matrix using a cooperative adaptation law:

Â̇_i(t) = β_i2 Σ_{j=0}^{N} a_ij (Â_j(t) − Â_i(t)),  i ∈ I[1, N],  (4)

where Â_0(t) ≜ A_0.

which is likewise adapted from Ren and Beard (2008). The constants β_i1 and β_i2 are positive design parameters.

Remark 3. Each AUV agent in the group is equipped with an observer configured as specified in Equations 3, 4, comprising two state variables, χ̂_i and Â_i. For each i ∈ I[1, N], χ̂_i estimates the virtual leader’s state χ_0, while Â_i estimates the leader’s system matrix A_0. The real-time data necessary for operating the i-th observer include: (1) the estimated state χ̂_i and matrix Â_i, obtained from the i-th AUV itself, and (2) the estimated states χ̂_j and matrices Â_j for all j ∈ N_i, obtained from the i-th AUV’s neighbors. Note that in Equations 3, 4, if j ∉ N_i, then a_ij = 0, indicating that the i-th observer does not utilize information from the j-th AUV agent. This configuration ensures that the proposed distributed observer can be implemented on each local AUV agent using only locally estimated data from the agent itself and its immediate neighbors, without the need for global information such as the size of the AUV group or the network interconnection topology.
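A minimal sketch of the estimator (Equations 3, 4) under Euler discretization is shown below. The chain graph, gains, step size, and toy 6×6 harmonic leader matrix are all illustrative assumptions; index 0 holds the true leader, so each follower converges using only neighbor data:

```python
import numpy as np

def observer_step(chi_hat, A_hat, A_adj, beta1, beta2, dt):
    """One Euler step of the cooperative estimator (Equations 3, 4).
    Row 0 holds the true leader (chi_hat[0] = chi_0, A_hat[0] = A_0) and
    is only propagated, never corrected; rows 1..N are the followers."""
    n = chi_hat.shape[0]
    chi_new, A_new = chi_hat.copy(), A_hat.copy()
    for i in range(1, n):
        d_chi = sum(A_adj[i, j] * (chi_hat[j] - chi_hat[i]) for j in range(n))
        d_A = sum(A_adj[i, j] * (A_hat[j] - A_hat[i]) for j in range(n))
        chi_new[i] = chi_hat[i] + dt * (A_hat[i] @ chi_hat[i] + beta1 * d_chi)
        A_new[i] = A_hat[i] + dt * beta2 * d_A
    chi_new[0] = chi_hat[0] + dt * (A_hat[0] @ chi_hat[0])  # leader: Equation 2
    return chi_new, A_new

# Chain graph 0 -> 1 -> 2; toy harmonic leader matrix; assumed gains.
A_adj = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=float)
omega = 0.5
A0 = np.zeros((6, 6))
A0[0, 3], A0[3, 0] = -omega, omega
chi = np.zeros((3, 6))
chi[0, 0] = 1.0                                   # leader initial state
A_hat = np.stack([A0, np.zeros((6, 6)), np.zeros((6, 6))])
for _ in range(20000):                            # 20 s at dt = 1 ms
    chi, A_hat = observer_step(chi, A_hat, A_adj, 5.0, 5.0, 1e-3)
err_state = np.linalg.norm(chi[1] - chi[0])       # follower 1 state error
err_matrix = np.linalg.norm(A_hat[2] - A0)        # follower 2 matrix error
```

Both errors decay toward zero, consistent with the exponential convergence established in Theorem 1 below.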

To verify the convergence properties, we compute the error dynamics. Defining the estimation errors of the state and the system matrix for agent i as χ̃_i = χ̂_i − χ_0 and Ã_i = Â_i − A_0, we derive:

χ̃̇_i(t) = Â_i(t) χ̂_i(t) − A_0 χ_0(t) + β_i1 Σ_{j=0}^N a_ij (χ̂_j(t) − χ̂_i(t))
 = (A_0 + Ã_i(t))(χ_0(t) + χ̃_i(t)) − A_0 χ_0(t) + β_i1 Σ_{j=0}^N a_ij (χ̃_j(t) − χ̃_i(t))
 = A_0 χ̃_i(t) + Ã_i(t) χ̃_i(t) + Ã_i(t) χ_0(t) + β_i1 Σ_{j=0}^N a_ij (χ̃_j(t) − χ̃_i(t)),
Ã̇_i(t) = β_i2 Σ_{j=0}^N a_ij (Ã_j(t) − Ã_i(t)),  i ∈ I[1, N],
with χ̃_0 ≜ 0 and Ã_0 ≜ 0.

Define the collective error states and adaptation matrices: χ̃ = col{χ̃_1, …, χ̃_N} for the state errors, Ã = col{Ã_1, …, Ã_N} for the adaptive parameter errors, Ã_b = diag{Ã_1, …, Ã_N} for the block diagonal of adaptive parameter errors, and B_β1 = diag{β_11, …, β_N1} and B_β2 = diag{β_12, …, β_N2} for the diagonal matrices of design constants. With these definitions, the network-wide error dynamics can be expressed as:

χ̃̇(t) = (I_N ⊗ A_0 − (B_β1 H) ⊗ I_6) χ̃(t) + Ã_b(t) χ̃(t) + Ã_b(t)(1_N ⊗ χ_0(t)),
Ã̇(t) = −((B_β2 H) ⊗ I_6) Ã(t).  (5)

Theorem 1. Consider the error system (Equation 5). Under Assumptions 1 and 2, and given β_i1, β_i2 > 0, it follows that for all i ∈ I[1, N] and any initial conditions χ_0(0), χ̂_i(0), Â_i(0), the error dynamics of the adaptive parameters and the states converge to zero exponentially; specifically, lim_{t→∞} Ã_i(t) = 0 and lim_{t→∞} χ̃_i(t) = 0.

This convergence is facilitated by the independent adaptation of each agent’s parameters within their respective error dynamics, represented by the block diagonal structure of Ãb and control gains Bβ1 and Bβ2. These matrices ensure that each agent’s parameter updates are governed by local interactions and error feedback, consistent with the decentralized control framework.

Proof: We begin by examining the estimation error dynamics for Ã as presented in Equation 5, which can be rewritten in vector form:

Ã̇(t) = −((B_β2 H) ⊗ I_6) Ã(t).  (6)

Under Assumption 2, all eigenvalues of H possess positive real parts (Su and Huang, 2011). Consequently, for any β_i2 > 0, the matrix −((B_β2 H) ⊗ I_6) is Hurwitz, which implies the exponential stability of system (Equation 6). Hence, lim_{t→∞} Ã(t) = 0 exponentially, and thus lim_{t→∞} Ã_i(t) = 0 exponentially for all i ∈ I[1, N]. Next, we analyze the error dynamics for χ̃ in Equation 5. From the previous discussion, Ã_b(t) → 0 exponentially, so the term Ã_b(t)(1_N ⊗ χ_0(t)) decays to zero exponentially as well, since χ_0 is bounded under Assumption 1. Based on Cai et al. (2015), if the system defined by

χ̃̇(t) = (I_N ⊗ A_0 − (B_β1 H) ⊗ I_6) χ̃(t)  (7)

is exponentially stable, then lim_{t→∞} χ̃(t) = 0 exponentially. By Assumption 1, all eigenvalues of A_0 have zero real parts, and since H is nonsingular with all its eigenvalues in the open right-half plane, system (Equation 7) is exponentially stable for any β_i1 > 0. Consequently, lim_{t→∞} χ̃(t) = 0, i.e., lim_{t→∞} χ̃_i(t) = 0 exponentially for all i ∈ I[1, N]. ∎

Now, each individual agent can accurately estimate both the state and the system matrix of the leader through cooperative observer estimation Equations 3, 4. This information will be utilized in the DDL controller design for each agent’s second layer, which will be discussed in the following subsection.

3.2 Second layer: decentralized deterministic learning controller

To fulfill the overall formation learning control objectives, in this section we develop the DDL control law for the multi-AUV system defined in Equation 1. We use d_i* to denote the desired offset between the position of the i-th AUV agent η_i and the virtual leader’s position η_0. The formation control problem is then framed as a position tracking control task, in which each local AUV agent’s position η_i is required to track the reference signal η_d,i ≜ η_0 + d_i*. Moreover, because the leader’s state information χ_0 is inaccessible to some AUV agents, the tracking reference signal η̂_d,i ≜ η̂_0,i + d_i* is employed instead of η_d,i. As established in Theorem 1, η̂_d,i is autonomously generated by each local agent and converges exponentially to η_d,i. This ensures that the DDL controller is feasible and the formation control objectives are achievable for all i ∈ I[1, N] using η̂_d,i.

To design a DDL control law that simultaneously addresses formation tracking control and precise learning of the AUVs’ completely unknown nonlinear dynamics, we integrate the well-established backstepping adaptive control design method of Krstic et al. (1995) with deterministic learning techniques using RBFNN from Wang and Hill (2018) and Yuan et al. (2017). Specifically, for the i-th AUV agent described in system (Equation 1), we define the position tracking error as z_1,i = η_i − η̂_d,i for all i ∈ I[1, N]. Noting that J_i(η_i) J_i^T(η_i) = I for all i ∈ I[1, N], we proceed to:

ż_1,i = J_i(η_i) ν_i − η̂̇_d,i,  i ∈ I[1, N].  (8)

To frame the problem in a more tractable way, we treat ν_i as a virtual control input and design a desired virtual control α_i; implementing them in the above system gives:

z_2,i = ν_i − α_i,  α_i = J_i^T(η_i)(−K_1,i z_1,i + η̂̇_d,i),  i ∈ I[1, N].  (9)

A positive definite gain matrix K_1,i ∈ S³₊ is used for tuning the performance. Substituting ν_i = z_2,i + α_i into Equation 8 yields:

ż_1,i = J_i(η_i) z_2,i − K_1,i z_1,i,  i ∈ I[1, N].
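The backstepping step can be sketched numerically. Assuming the velocity loop tracks perfectly (ν_i = α_i, i.e., z_2,i = 0), the position error then contracts as ż_1,i = −K_1,i z_1,i; the gains and initial pose below are illustrative assumptions:

```python
import numpy as np

def J(eta):
    c, s = np.cos(eta[2]), np.sin(eta[2])
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def virtual_control(eta, eta_d_hat, eta_d_hat_dot, K1):
    """Desired virtual velocity alpha_i of Equation 9:
    alpha = J^T(eta) (-K1 z1 + d/dt eta_hat_d)."""
    z1 = eta - eta_d_hat
    return J(eta).T @ (-K1 @ z1 + eta_d_hat_dot), z1

K1 = np.diag([2.0, 2.0, 1.0])       # assumed gains
eta = np.array([1.0, -0.5, 0.3])    # assumed initial pose
for _ in range(1000):               # 10 s; reference fixed at the origin
    alpha, z1 = virtual_control(eta, np.zeros(3), np.zeros(3), K1)
    eta = eta + 0.01 * (J(eta) @ alpha)   # kinematics with nu = alpha
# Each component of z1 decays at the rate set by K1.
```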

Next, we derive the first derivatives of the velocity tracking error and of the desired virtual control input as follows:

ż_2,i = ν̇_i − α̇_i = M_i^{-1}(−C_i(ν_i) ν_i − D_i(ν_i) ν_i − g_i(η_i) − Δ_i(χ_i) + τ_i) − α̇_i,
α̇_i = J̇_i^T(η_i)(−K_1,i z_1,i + η̂̇_d,i) + J_i^T(η_i)(K_1,i η̂̇_d,i − K_1,i J_i(η_i) ν_i + η̂̈_d,i),  i ∈ I[1, N].  (10)

As previously discussed, unlike earlier research that treated only the matrix coefficients C_i(ν_i), D_i(ν_i), g_i(η_i), and Δ_i(χ_i) as unknown system nonlinearities while assuming the mass matrix M_i to be known, this work advances significantly by also considering M_i as unknown. Consequently, all system dynamic parameters are treated as completely unknown, making the controller fully independent of the robot’s configuration, such as its dimensions, mass, or any appendages, and of the uncertain environmental conditions it encounters, like depth, water flow, and viscosity. This independence is critical, as it ensures that the controller does not rely on predefined assumptions about the dynamics, aligning with the main goal of this research. To address these challenges, we define a nonlinear function F_i(Z_i) that encapsulates all the nonlinear uncertainties as follows:

F_i(Z_i) = M_i α̇_i + C_i(ν_i) ν_i + D_i(ν_i) ν_i + g_i(η_i) + Δ_i(χ_i),  (11)

where F_i(Z_i) = [f_1,i(Z_i), f_2,i(Z_i), f_3,i(Z_i)]^T and Z_i = col{η_i, ν_i} ∈ Ω_Zi ⊂ ℝ⁶, with Ω_Zi a bounded compact set. We then employ the following RBFNN to approximate the model dynamics in (Equation 11), expressed by the nonlinear functions f_k,i for all i ∈ I[1, N] and k ∈ I[1, 3]:

f_k,i(Z_i) = W*_k,i^T S_k,i(Z_i) + ε_k,i(Z_i),  (12)

where W*_k,i is the ideal constant NN weight vector and ε_k,i(Z_i) is the approximation error, satisfying |ε_k,i(Z_i)| ≤ ε*_k,i for some constant ε*_k,i > 0, for all i ∈ I[1, N] and k ∈ I[1, 3]. This error can be made arbitrarily small given a sufficient number of neurons in the network. A self-adaptation law is designed to estimate the unknown W*_k,i online. We estimate W*_k,i with Ŵ_k,i by constructing the DDL feedback control law as follows:

τi = −JiT(ηi)z1,i − K2,iz2,i + ŴiTSiF(Zi). (13)

K2,i ∈ S+3 is a feedback gain matrix that can be tuned to achieve the desired performance. To approximate the unknown nonlinear function vector Fi(Zi) in Equation 11 along the trajectory Zi within the compact set ΩZi, we use:

ŴiTSiF(Zi) = [Ŵ1,iTS1,i(Zi), Ŵ2,iTS2,i(Zi), Ŵ3,iTS3,i(Zi)]T.
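As a concrete illustration of how the stacked output ŴiTSiF(Zi) is evaluated, the following sketch builds a small Gaussian RBFNN; the grid, width, and (zero) weights are hypothetical, not the paper's trained values:

```python
import numpy as np

def gaussian_regressor(Z, centers, width):
    """S(Z): Gaussian RBF activations of input Z at the given centers."""
    d2 = np.sum((centers - Z) ** 2, axis=1)   # squared distance to each center
    return np.exp(-d2 / width ** 2)           # one activation per neuron

def rbfnn_output(Z, centers, width, W):
    """Stacked approximation [W_1^T S(Z), W_2^T S(Z), W_3^T S(Z)]^T."""
    S = gaussian_regressor(Z, centers, width)  # regressor shared by all k
    return W.T @ S                             # W has shape (n_neurons, 3)

# Hypothetical small network: 27 centers on a 3x3x3 grid over [-1, 1]^3.
axis = np.linspace(-1.0, 1.0, 3)
centers = np.array(np.meshgrid(axis, axis, axis)).reshape(3, -1).T
W = np.zeros((27, 3))                          # untrained weights -> zero output
out = rbfnn_output(np.array([0.1, -0.2, 0.3]), centers, 1.0, W)
print(out)  # [0. 0. 0.]
```

The same regressor vector S(Z) feeds all three output channels; only the weight columns differ.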

Then, from Equations 1, 13 we have:

Miν̇i + Ci(νi)νi + Di(νi)νi + gi(ηi) + Δi(χi) = τi = −JiT(ηi)z1,i − K2,iz2,i + ŴiTSiF(Zi).

By subtracting Wk,i*TSk,i(Zi)+ϵk,i(Zi) from both sides and considering Equations 9, 11, we define W̃k,iW^k,iWk,i*, leading to:

ż2,i = Mi⁻¹(−JiT(ηi)z1,i − K2,iz2,i + W̃iTSiF(Zi) − ϵi(Zi)).

For updating Ŵk,i online, a robust self-adaptation law is constructed using the σ-modification technique of Ioannou and Sun (1996) as follows:

Ŵ̇k,i = −Γk,i(Sk,i(Zi)z2k,i + σk,iŴk,i), (14)

where z2,i=[z21,i,z22,i,z23,i]T, Γk,i=Γk,iT>0, and σk,i>0 are free parameters to be designed for all iI[1,N] and kI[1,3]. Integrating Equations 9, 13, 14 yields the following closed-loop system:

ż1,i = −K1,iz1,i + Ji(ηi)z2,i,
ż2,i = Mi⁻¹(−JiT(ηi)z1,i − K2,iz2,i + W̃iTSiF(Zi) − ϵi(Zi)),
W̃̇k,i = −Γk,i(Sk,i(Zi)z2,k,i + σk,iŴk,i), (15)

where, for all iI[1,N] and kI[1,3], W̃iTSi(Zi)=[W̃1,iTS1,i(Zi),W̃2,iTS2,i(Zi),W̃3,iTS3,i(Zi)]T, and ϵi(Zi)=[ϵ1,i(Zi),ϵ2,i(Zi),ϵ3,i(Zi)]T.
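The self-adaptation law (Equation 14) can be sketched as a simple Euler integration. Note that the printed equation lost its signs; the leakage form below, Ŵ̇ = −Γ(S z + σŴ), follows the standard σ-modification of Ioannou and Sun (1996), and all gains are illustrative, not the paper's tuned values:

```python
import numpy as np

def sigma_mod_step(W_hat, S, z2k, Gamma, sigma, dt):
    """One Euler step of W_hat_dot = -Gamma (S z2k + sigma W_hat)."""
    W_dot = -Gamma * (S * z2k + sigma * W_hat)  # sigma term adds leakage/robustness
    return W_hat + dt * W_dot

# Illustrative scalar gains and a frozen regressor/error, to show the update direction.
S = np.array([0.8, 0.2, 0.05])   # Gaussian activations near / far from the input
W = np.zeros(3)
for _ in range(2000):            # 2 s of simulated adaptation at dt = 1 ms
    W = sigma_mod_step(W, S, z2k=0.5, Gamma=10.0, sigma=0.0001, dt=0.001)
print(W)                         # weights grow along -S * z2k, fastest for active neurons
```

The small σ-leakage keeps the weights bounded under persistent disturbance instead of drifting without limit, which is exactly the robustness role it plays in (Equation 14).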

Remark 4. Unlike the first-layer DA observer design, the second-layer control law is fully decentralized for each local agent. It utilizes only the local agent's information for feedback control, including χi, χ̂i, and Ŵk,i, without involving any information exchange among neighboring AUVs.

The following theorem summarizes the stability and tracking control performance results of the overall system:

Theorem 2. Consider the local closed-loop system (Equation 15). For each i ∈ I[1,N], if there exists a sufficiently large compact set ΩZi such that Zi ∈ ΩZi for all t ≥ 0, then for any bounded initial conditions, we have: 1) All signals in the closed-loop system remain uniformly ultimately bounded (UUB). 2) The position tracking error ηi − ηd,i converges exponentially to a small neighborhood around zero in finite time Ti > 0 by choosing the design parameters with sufficiently large λ̲(K1,i) > 0 and λ̲(K2,i) > 2λ̄(K1,i) > 0, and sufficiently small σk,i > 0 for all i ∈ I[1,N] and k ∈ I[1,3].

Proof: 1) Consider the following Lyapunov function candidate for the closed-loop system (Equation 15):

Vi = ½z1,iTz1,i + ½z2,iTMiz2,i + ½Σk=1³ W̃k,iTΓk,i⁻¹W̃k,i.

Evaluating the derivative of Vi along the trajectory of Equation 15 for all iI[1,N] yields:

V̇i = z1,iT(−K1,iz1,i + Ji(ηi)z2,i) + z2,iT(−JiT(ηi)z1,i − K2,iz2,i + W̃iTSiF(Zi) − ϵi(Zi)) − Σk=1³ W̃k,iT(Sk,i(Zi)z2k,i + σk,iŴk,i) = −z1,iTK1,iz1,i − z2,iTK2,iz2,i − z2,iTϵi(Zi) − Σk=1³ σk,iW̃k,iTŴk,i, ∀i ∈ I[1,N].

Choose K2,i=K1,i+K22,i such that K1,i,K22,iS3+. Using the completion of squares, we have:

−σk,iW̃k,iTŴk,i ≤ −σk,i‖W̃k,i‖²/2 + σk,i‖Wk,i*‖²/2,
−z2,iTK22,iz2,i − z2,iTϵi(Zi) ≤ ϵiT(Zi)ϵi(Zi)/(4λ̲(K22,i)) ≤ ‖ϵi*‖²/(4λ̲(K22,i)),

where ϵi*=[ϵ1,i*,ϵ2,i*,ϵ3,i*]T. Then, we obtain:

V̇i ≤ −z1,iTK1,iz1,i − z2,iTK1,iz2,i + ‖ϵi*‖²/(4λ̲(K22,i)) + Σk=1³(−σk,i‖W̃k,i‖²/2 + σk,i‖Wk,i*‖²/2).

It follows that V̇i is negative definite whenever:

‖z1,i‖ > ‖ϵi*‖/(2√(λ̲(K1,i)λ̲(K22,i))) + Σk=1³ √(σk,i/(2λ̲(K1,i))) ‖Wk,i*‖,
‖z2,i‖ > ‖ϵi*‖/(2√(λ̲(K1,i)λ̲(K22,i))) + Σk=1³ √(σk,i/(2λ̲(K1,i))) ‖Wk,i*‖,
‖W̃k,i‖ > ‖ϵi*‖/(2√(σk,iλ̲(K22,i))) + Σk=1³ ‖Wk,i*‖ ≜ W̃k,i*,

for all i ∈ I[1,N], k ∈ I[1,3]. This leads to the uniformly ultimately bounded (UUB) behavior of the signals z1,i, z2,i, and W̃k,i for all i ∈ I[1,N] and k ∈ I[1,3]. As a result, it can be easily verified that since ηd,i = η̂i + di* with η̂i bounded (according to Theorem 1 and Assumption 1), ηi = z1,i + ηd,i is bounded for all i ∈ I[1,N]. Similarly, the boundedness of νi = z2,i + αi can be confirmed by the fact that αi in Equation 9 is bounded. In addition, Ŵk,i = W̃k,i + Wk,i* is also bounded for all i ∈ I[1,N] and k ∈ I[1,3] because of the boundedness of W̃k,i and Wk,i*. Moreover, in light of Equation 10, α̇i is bounded, as all the terms on its right-hand side are bounded. This leads to the boundedness of the control signal τi in Equation 13, since the Gaussian function vector SiF(Zi) is guaranteed to be bounded for any Zi. As such, all the signals in the closed-loop system remain UUB, which completes the proof of the first part.

2) For the second part, it will be shown that ηi will converge arbitrarily close to ηdi in some finite time Ti>0 for all iI[1,N]. To this end, we consider the following Lyapunov function candidate for the dynamics of z1,i and z2,i in Equation 15:

Vz,i = ½z1,iTz1,i + ½z2,iTMiz2,i, ∀i ∈ I[1,N]. (16)

The derivative of Vz,i is:

V̇z,i = z1,iT(−K1,iz1,i + Ji(ηi)z2,i) + z2,iT(−JiT(ηi)z1,i − K2,iz2,i + W̃iTSiF(Zi) − ϵi(Zi)) = −z1,iTK1,iz1,i − z2,iTK2,iz2,i + z2,iTW̃iTSiF(Zi) − z2,iTϵi(Zi), ∀i ∈ I[1,N].

Similar to the proof of part one, we let K2,i=K1,i+2K22,i with K1,i,K22,iS3+. According to Wang and Hill (2018), the Gaussian RBFNN regressor SiF(Zi) is bounded by SiF(Zi)si* for any Zi and for all iI[1,N] with some positive number si*>0. Through completion of squares, we have:

−z2,iTK22,iz2,i + z2,iTW̃iTSiF(Zi) ≤ ‖W̃i*‖²si*²/(4λ̲(K22,i)),
−z2,iTK22,iz2,i − z2,iTϵi(Zi) ≤ ‖ϵi*‖²/(4λ̲(K22,i)).

Also W̃i*=[W̃1,i*,W̃2,i*,W̃3,i*]T. This leads to:

V̇z,i ≤ −z1,iTK1,iz1,i − z2,iTK1,iz2,i + δi ≤ −2λ̲(K1,i)·½z1,iTz1,i − (2λ̲(K1,i)/λ̄(Mi))·½z2,iTMiz2,i + δi ≤ −ρiVz,i + δi, ∀i ∈ I[1,N], (17)

where ρi=min{2λ̲(K1,i),2λ̲(K1,i)/λ̄(Mi)} and δi=(W̃i*2si*2/4λ̲(K22,i))+(ϵi*2/4λ̲(K22,i)), iI[1,N]. Solving the inequality Equation 17 yields:

0 ≤ Vz,i(t) ≤ Vz,i(0)exp(−ρit) + δi/ρi,

which together with Equation 16 implies that:

min{1, λ̲(Mi)}·½(‖z1,i‖² + ‖z2,i‖²) ≤ Vz,i(0)exp(−ρit) + δi/ρi, ∀t ≥ 0, i ∈ I[1,N],

also

‖z1,i‖² + ‖z2,i‖² ≤ (2/min{1, λ̲(Mi)})Vz,i(0)exp(−ρit) + 2δi/(ρi min{1, λ̲(Mi)}).

Consequently, it is straightforward that given δ̄i > √(2δi/(ρi min{1, λ̲(Mi)})), there exists a finite time Ti > 0 for all i ∈ I[1,N] such that for all t ≥ Ti, both z1,i and z2,i satisfy ‖z1,i(t)‖ ≤ δ̄i and ‖z2,i(t)‖ ≤ δ̄i, ∀i ∈ I[1,N], where δ̄i can be made arbitrarily small by choosing sufficiently large λ̲(K1,i) > 0 and λ̲(K2,i) > 2λ̄(K1,i) > 0 for all i ∈ I[1,N]. This ends the proof.
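The comparison argument above reduces to the scalar inequality V̇ ≤ −ρV + δ, whose bound V(t) ≤ V(0)exp(−ρt) + δ/ρ can be checked numerically on a surrogate system; the constants here are illustrative, not the paper's:

```python
import numpy as np

# Scalar surrogate saturating the bound: V_dot = -rho V + delta.
rho, delta, V0 = 2.0, 0.1, 5.0
dt, T = 1e-3, 6.0
V = V0
for _ in range(int(T / dt)):
    V += dt * (-rho * V + delta)            # forward-Euler integration
bound = V0 * np.exp(-rho * T) + delta / rho  # V(0) e^{-rho T} + delta / rho
print(V, bound)   # V approaches the ultimate bound delta / rho = 0.05
```

The transient term decays exponentially at rate ρ, leaving the ultimate bound δ/ρ, which is exactly what is shrunk by choosing larger control gains (larger ρ) or smaller σk,i (smaller δ).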

By integrating the outcomes of Theorems 1, 2, the following theorem is established, which can be presented without additional proof:

Theorem 3. Consider the multi-AUV system (Equation 1) and the virtual leader dynamics (Equation 2) with the network communication topology G, under Assumptions 1 and 2. Objective 1 of Problem 1 (i.e., ηi converges to η0 + di* exponentially for all i ∈ I[1,N]) can be achieved by using the cooperative observer Equations 3, 4 and the DDL control law Equations 13, 14, with all the design parameters satisfying the requirements in Theorems 1 and 2, respectively.

Remark 5. With the proposed two-layer formation learning control architecture, inter-agent information exchange occurs solely in the first-layer DA observation. Only the observer’s estimated information, and not the physical plant state information, needs to be shared among neighboring agents. Additionally, since no global information is required for the design of each local AUV control system, the proposed formation learning control protocol can be designed and implemented in a fully distributed manner.

Remark 6. It is important to note that the eigenvalue constraints on A0 in Assumption 1 are not needed for cooperative observer estimation (as detailed in Section 3) or for achieving formation tracking control performance (as discussed in this section). This indicates that formation tracking control can be attained for general reference trajectories, including both periodic paths and straight lines, provided they are bounded. However, these constraints will become necessary in the next section to ensure the accurate learning capability of the proposed method.

4 Accurate learning from formation control

It is necessary to demonstrate the convergence of the RBFNN weights in Equations 13, 14 to their optimal values for accurate learning and identification. The main result of this section is summarized in the following theorem.

Theorem 4. Consider the local closed-loop system (Equation 15) with Assumptions 1, 2. For each i ∈ I[1,N], if there exists a sufficiently large compact set ΩZi such that Zi ∈ ΩZi for all t ≥ 0, then for any bounded initial conditions with Ŵk,i(0) = 0, ∀i ∈ I[1,N], k ∈ I[1,3], the local estimated neural weights Ŵζ,k,i converge to small neighborhoods of their optimal values Wζ,k,i* along the periodic reference tracking orbit ϕζ,i(Zi(t))|t≥Ti (denoting the orbit of the NN input signal Zi(t) starting from time Ti). This leads to locally accurate approximations of the nonlinear uncertain dynamics fk,i(Zi), k ∈ I[1,3], in Equation 11 being obtained by Ŵk,iTSk,i(Zi), as well as by W̄k,iTSk,i(Zi), where i ∈ I[1,N], k ∈ I[1,3]. Here, ζ denotes the subset of neurons (or nodes) that are active when the system state Zi(t) is within a specific neighborhood of the state space.

W̄k,i = mean{Ŵk,i(t) | t ∈ [ta,i, tb,i]}, (18)

where [ta,i,tb,i] (tb,i>ta,i>Ti) represents a time segment after the transient process.

Proof: From Theorem 3, we have shown that for all iI[1,N], ηi will closely track the periodic signal ηd,i=η0+di* in finite time Ti. In addition, (Equation 9) implies that νi will also closely track the signal JiT(ηi)η̇0i since both z1,i and z2,i will converge to a small neighborhood around zero according to Theorem 2. Moreover, since η̇0i will converge to η̇0 according to Theorem 1, and Ji(ηi) is a bounded rotation matrix, νi will also be a periodic signal after finite time Ti, because η̇0 is periodic under Assumption 1. Consequently, since the RBFNN input Zi(t)=col{ηi,νi} becomes a periodic signal for all tTi, the PE condition of some internal closed-loop signals, i.e., the RBFNN regression subvector Sζ,k,i(Zi) (tTi), is satisfied according to Lemma 1. As mentioned in Section 2.2, ζ represents the subset of RBFNN nodes and weights that are specifically utilized along the recurrent trajectory of the system state Zi(t). This subset focuses on the active neural components required for approximating the system’s nonlinear dynamics locally, ensuring that the learning and adaptation processes are efficient and accurate within the compact region where the trajectory resides. It should be noted that the periodicity of Zi(t) leads to the PE of the regression subvector Sζ,k,i(Zi), but not necessarily the PE of the whole regression vector Sk,i(Zi). Thus, we term this as a partial PE condition, and we will show the convergence of the associated local estimated neural weights Wζ,k,iWζ,k,i*, rather than Wk,iWk,i*.

Thus, to prove accurate convergence of local neural weights Wζ,k,i associated with the regression subvector Sζ,k,i(Zi) under the satisfaction of the partial PE condition, we first rewrite the closed-loop dynamics of z1,i and z2,i along the periodic tracking orbit ϕζ,i(Zi(t))|tTi by using the localization property of the Gaussian RBFNN:

ż1,i = −K1,iz1,i + Ji(ηi)z2,i,
ż2,i = Mi⁻¹(−Wζ,i*TSζ,iF(Zi) − ϵζ,i − JiT(ηi)z1,i − K2,iz2,i + Ŵζ,iTSζ,iF(Zi) + Ŵζ̄,iTSζ̄,iF(Zi)) = Mi⁻¹(−JiT(ηi)z1,i − K2,iz2,i + W̃ζ,iTSζ,iF(Zi) − ϵ′ζ,i),

where Fi(Zi) = Wζ,i*TSζ,iF(Zi) + ϵζ,i with Wζ,i*TSζ,iF(Zi) = [Wζ,1,i*TSζ,1,i(Zi), Wζ,2,i*TSζ,2,i(Zi), Wζ,3,i*TSζ,3,i(Zi)]T and ϵζ,i = [ϵζ,1,i, ϵζ,2,i, ϵζ,3,i]T being the approximation error. Additionally, Ŵζ,iTSζ,iF(Zi) + Ŵζ̄,iTSζ̄,iF(Zi) = ŴiTSiF(Zi), with subscripts ζ and ζ̄ denoting the regions close to and far away from the periodic trajectory ϕζ,i(Zi(t))|t≥Ti, respectively. According to Wang and Hill (2018), Ŵζ̄,iTSζ̄,iF(Zi) is small, and the NN local approximation error ϵ′ζ,i = ϵζ,i − Ŵζ̄,iTSζ̄,iF(Zi), with ϵ′ζ,i = O(ϵζ,i), is also small. Thus, the overall closed-loop adaptive learning system can be described by:

[Equation 19, the closed-loop subsystem in (z1,i, z2,i, W̃ζ,k,i), is displayed as an image in the original.]

and

W̃̇ζ̄,1,i = −Γζ̄,1,i(Sζ̄,1,i(Zi)z21,i + σ1,iŴζ̄,1,i),
W̃̇ζ̄,2,i = −Γζ̄,2,i(Sζ̄,2,i(Zi)z22,i + σ2,iŴζ̄,2,i),
W̃̇ζ̄,3,i = −Γζ̄,3,i(Sζ̄,3,i(Zi)z23,i + σ3,iŴζ̄,3,i),

where

Ξi=0Mi1Sζ,1,iTZi000Sζ,2,iTZi000Sζ,3,iTZi,

for all iI[1,N]. The exponential stability property of the nominal part of subsystem (Equation 19) has been well-studied in Wang and Hill (2018), Yuan and Wang (2011), and Yuan and Wang (2012), where it is stated that PE of Sζ,k,i(Zi) will guarantee exponential convergence of (z1,i,z2,i,W̃ζ,k,i)=0 for all iI[1,N] and kI[1,3]. Based on this, since ϵζ,i=O(ϵζ,i)=O(ϵi), and σk,iΓζ,k,iW^ζ,k,i can be made small by choosing sufficiently small σk,i for all iI[1,N], kI[1,3], both the state error signals (z1,i,z2,i) and the local parameter error signals W̃ζ,k,i (iI[1,N],kI[1,3]) will converge exponentially to small neighborhoods of zero, with the sizes of the neighborhoods determined by the RBFNN ideal approximation error ϵi as in Equation 12 and σk,iΓζ,k,iW^ζ,k,i. The convergence of Wζ,k,iWζ,k,i* implies that along the periodic trajectory ϕζ,i(Zi(t))|tTi, we have

fk,iZi=Wζ,k,i*TSζ,k,iZi+ϵζ,k,i=W^ζ,k,iTSζ,k,iZiW̃ζ,k,iTSζ,k,iZi+ϵζ,k,i=W^ζ,k,iTSζ,k,iZi+ϵζ1,k,i=W̄ζ,k,iTSζ,k,iZi+ϵζ2,k,i,

where for all i ∈ I[1,N], k ∈ I[1,3], ϵζ1,k,i = ϵζ,k,i − W̃ζ,k,iTSζ,k,i(Zi) = O(ϵζ,k,i) due to the convergence of W̃ζ,k,i → 0. The last equality is obtained according to the definition of (Equation 18), with W̄ζ,k,i being the corresponding subvector of W̄k,i along the periodic trajectory ϕζ,i(Zi(t))|t≥Ti, and ϵζ2,k,i being the approximation error using W̄ζ,k,iTSζ,k,i(Zi). Clearly, after the transient process, we will have ϵζ2,k,i = O(ϵζ1,k,i), i ∈ I[1,N], k ∈ I[1,3]. Conversely, for the neurons whose centers are distant from the trajectory ϕζ,i(Zi(t))|t≥Ti, the values of Sζ̄,k,i(Zi) will be very small due to the localization property of the Gaussian RBFNN. From the adaptation law (Equation 14) with Ŵk,i(0) = 0, it can be observed that these small values of Sζ̄,k,i(Zi) will only minimally activate the adaptation of the associated neural weights Ŵζ̄,k,i. As a result, both Ŵζ̄,k,i and Ŵζ̄,k,iTSζ̄,k,i(Zi), as well as W̄ζ̄,k,i and W̄ζ̄,k,iTSζ̄,k,i(Zi), will remain very small for all i ∈ I[1,N], k ∈ I[1,3] along the periodic trajectory ϕζ,i(Zi(t))|t≥Ti. This indicates that the entire RBFNN Ŵk,iTSk,i(Zi), as well as W̄k,iTSk,i(Zi), can be used to accurately approximate the unknown function fk,i(Zi) locally along the periodic trajectory ϕζ,i(Zi(t))|t≥Ti, meaning that:

fk,iZi=W^ζ,k,iTSζ,k,iZi+ϵζ1,k,i=W^k,iTSk,iZi+ϵ1,k,i
=W̄ζ,k,iTSζ,k,iZi+ϵζ2,k,i=W̄k,iTSk,iZi+ϵ2,k,i,

with the approximation accuracy level of ϵ1,k,i=ϵζ1,k,iWζ̄,k,iTSζ̄,k,i(Zi)=O(ϵζ1,k,i)=O(ϵk,i) and ϵ2,k,i=ϵζ2,k,iW̄ζ̄,k,iTSζ̄,k,i(Zi)=O(ϵζ2,k,i)=O(ϵk,i) for all iI[1,N], kI[1,3]. This ends the proof.

Remark 7. The key idea in the proof of Theorem 4 is inspired by Wang and Hill (2018). For more detailed analysis of the learning performance, including quantitative analysis of the learning accuracy levels ϵ1,k,i and ϵ2,k,i as well as the learning speed, please refer to Yuan and Wang (2011). Furthermore, the AUV nonlinear dynamics (Equation 11) to be identified do not contain any time-varying random disturbances. This is important to ensure accurate identification/learning performance under the deterministic learning framework. To understand the effects of time-varying external disturbances on deterministic learning performance, interested readers are referred to Yuan and Wang (2012) for more details.

Remark 8. Based on Equation 18, to obtain the constant RBFNN weights W̄k,i for all iI[1,N], kI[1,3], one needs to implement the formation learning control law Equations 13, 14 first. Then, according to Theorem 4, after a finite-time transient process, the RBFNN weights Wk,i will converge to constant steady-state values. Thus, one can select a time segment [ta,i,tb,i] with tb,i>ta,i>Ti for all iI[1,N] to record and store the RBFNN weights Wk,i(t) for t[ta,i,tb,i]. Finally, based on these recorded data, W̄k,i can be calculated off-line using Equation 18.
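The off-line averaging in Remark 8 (Equation 18) is simply a mean of the recorded weight trajectory over the post-transient window; a sketch with a synthetic history (the settling value and window below are made up for illustration):

```python
import numpy as np

def averaged_weights(W_history, t, ta, tb):
    """W_bar: mean of the recorded W_hat(t) over the window [ta, tb]."""
    mask = (t >= ta) & (t <= tb)      # keep only post-transient samples
    return W_history[mask].mean(axis=0)

# Synthetic recorded history: weights settle near 1.0 with a small residual
# oscillation plus a decaying transient.
t = np.linspace(0.0, 10.0, 1001)
W_history = 1.0 + 0.01 * np.sin(5 * t) + np.exp(-t)
W_bar = averaged_weights(W_history[:, None], t, ta=6.0, tb=10.0)
print(W_bar)  # close to 1.0
```

Averaging over a full post-transient window cancels the residual oscillation of the adaptive weights, which is why W̄k,i is a better constant representative than a single snapshot Ŵk,i(tb,i).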

Remark 9. It is shown in Theorem 4 that locally accurate learning of each individual AUV’s nonlinear uncertain dynamics can be achieved using localized RBFNNs along the periodic trajectory ϕζ,i(Zi(t))|tTi. The learned knowledge can be further represented and stored in a time-invariant fashion using constant RBFNN, i.e., W̄k,iTSk,i(Zi) for all iI[1,N], kI[1,3]. In contrast to many existing techniques (e.g., Peng et al., 2017; Peng et al., 2015), this is the first time, to the authors’ best knowledge, that locally accurate identification and knowledge representation using constant RBFNN are accomplished and rigorously analyzed for multi-AUV formation control under complete uncertain dynamics.

5 Formation control with pre-learned dynamics

In this section, we will further address objective 2 of Problem 1, which involves achieving formation control without readapting to the AUV's nonlinear uncertain dynamics. To this end, consider the multiple AUV systems (Equation 1) and the virtual leader dynamics (Equation 2). We employ the estimator observer Equations 3, 4 to cooperatively estimate the leader's state information. Instead of using the DDL feedback control law (Equation 13) and the self-adaptation law (Equation 14), we introduce the following constant RBFNN controller, which does not require online adaptation of the NN weights:

τi = −JiT(ηi)z1,i − K2,iz2,i + W̄iTSiF(Zi), (20)

where W̄iTSiF(Zi)=[W̄1,iTS1,i(Zi),W̄2,iTS2,i(Zi),W̄3,iTS3,i(Zi)]T is obtained from Equation 18. The term W̄k,iTSk,i(Zi) represents the locally accurate RBFNN approximation of the nonlinear uncertain function fk,i(Zi) along the trajectory ϕζ,i(Zi(t))|tTi, and the associated constant neural weights W̄k,i are obtained from the formation learning control process as discussed in Remark 8.
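Computationally, the only change from the adaptive law (Equation 13) is that the stored constant weights W̄ replace the online estimates, so no weight update runs inside the control loop. A schematic sketch, with hypothetical gains, weights, and a trivial rotation matrix:

```python
import numpy as np

def control_prelearned(J, z1, z2, K2, W_bar, S):
    """tau = -J^T z1 - K2 z2 + W_bar^T S(Z); no online adaptation needed."""
    return -J.T @ z1 - K2 @ z2 + W_bar.T @ S

J = np.eye(3)                      # placeholder rotation J_i(eta_i)
K2 = 2.0 * np.eye(3)               # hypothetical feedback gain
z1 = np.array([0.1, 0.0, -0.1])    # position tracking error
z2 = np.array([0.0, 0.2, 0.0])     # velocity tracking error
W_bar = np.zeros((5, 3))           # hypothetical stored constant weights
S = np.ones(5)                     # regressor activations at the current Z
tau = control_prelearned(J, z1, z2, K2, W_bar, S)
print(tau)  # [-0.1 -0.4  0.1]
```

Per control step this is two matrix-vector products and one regressor evaluation; the per-step cost of the adaptation integrals in (Equation 14) disappears entirely.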

Theorem 5. Consider the multi-AUV system (Equation 1) and the virtual leader dynamics (Equation 2) with the network communication topology G. Under Assumptions 1, 2, the formation control performance (i.e., ηi converges to η0 + di* exponentially, with the same η0 and di* defined in Theorem 3, for all i ∈ I[1,N]) can be achieved by using the DA observer Equations 3, 4 and the constant RBFNN control law (Equation 20) with the constant NN weights obtained from Equation 18.

Proof: The closed-loop system for each local AUV agent can be established by integrating the controller (Equation 20) with the AUV dynamics (Equation 1).

ż1,i = −K1,iz1,i + Ji(ηi)z2,i,
ż2,i = Mi⁻¹(−JiT(ηi)z1,i − K2,iz2,i + W̄iTSiF(Zi) − Fi(Zi)) = Mi⁻¹(−JiT(ηi)z1,i − K2,iz2,i − ϵ2,i), ∀i ∈ I[1,N],

where ϵ2,i = [ϵ21,i, ϵ22,i, ϵ23,i]T. Consider the Lyapunov function candidate Vz,i = ½z1,iTz1,i + ½z2,iTMiz2,i, whose derivative along the closed-loop system described above is given by:

V̇z,i = z1,iT(−K1,iz1,i + Ji(ηi)z2,i) + z2,iT(−JiT(ηi)z1,i − K2,iz2,i − ϵ2,i) = −z1,iTK1,iz1,i − z2,iTK2,iz2,i − z2,iTϵ2,i.

Selecting K2,i=K1,i+K22,i where K1,i,K22,iS3+, we can utilize the method of completing squares to obtain:

−z2,iTK22,iz2,i − z2,iTϵ2,i ≤ ‖ϵ2,i‖²/(4λ̲(K22,i)) ≤ ‖ϵ2,i*‖²/(4λ̲(K22,i)),

which implies that:

V̇z,i ≤ −z1,iTK1,iz1,i − z2,iTK1,iz2,i + ‖ϵ2,i*‖²/(4λ̲(K22,i)) ≤ −ρiVz,i + δi, ∀i ∈ I[1,N],

where ρi = min{2λ̲(K1,i), 2λ̲(K1,i)/λ̄(Mi)} and δi = ‖ϵ2,i*‖²/(4λ̲(K22,i)). Using similar reasoning to that in the proof of Theorem 2, it is evident from the derived inequality that all signals within the closed-loop system remain bounded. Additionally, ηi − ηd,i will converge to a small neighborhood around zero within a finite period. The magnitude of this neighborhood can be minimized by appropriately choosing large values for λ̲(K1,i) > 0 and λ̲(K2,i) > λ̄(K1,i) across all i ∈ I[1,N]. In line with Theorem 1, under Assumptions 1, 2, the implementation of the DA observer Equations 3, 4 facilitates the exponential convergence of η̂i towards η0. This conjunction of factors assures that ηi rapidly aligns with ηd,i = η0 + di*, achieving the objectives set out for formation control.

Remark 10. Building on the locally accurate learning outcomes discussed in Section 4, the newly developed distributed control protocol comprising Equations 3, 4, 20 facilitates stable formation control across a repeated formation pattern. Unlike the formation learning control approach outlined in Section 3.2, which involves Equations 3, 4 coupled with Equations 13, 14, the current method eliminates the need for online RBFNN adaptation for all AUV agents. This significantly reduces the computational demands, thereby enhancing the practicality of implementing the proposed distributed RBFNN formation control protocol. This innovation marks a significant advancement over many existing techniques in the field.

6 Simulation

We consider a heterogeneous multi-AUV system composed of 5 AUVs for the simulation. The dynamics of these AUVs are described by the system model (Equation 1). The system parameters for each AUV are specified as follows:

Mi = [[m11,i, 0, 0], [0, m22,i, m23,i], [0, m23,i, m33,i]],
Ci(νi) = [[0, 0, −m22,ivi − m23,iri], [0, 0, m11,iui], [m22,ivi + m23,iri, −m11,iui, 0]],
Di(νi) = [[d11,i(νi), 0, 0], [0, d22,i(νi), d23,i(νi)], [0, d32,i(νi), d33,i(νi)]],
gi = 0, Δi = [Δ1,i(χi), Δ2,i(χi), Δ3,i(χi)]T, ∀i ∈ I[1,5],

where the mass and damping matrix components for each AUV i are defined as:

m11,i = mi − Xu̇,i, m22,i = mi − Yv̇,i, m23,i = mixg,i − Yṙ,i, m33,i = Iz,i − Nṙ,i,
d11,i = −Xu,i − Xuu,i|ui|, d22,i = −Yv,i − Yvv,i|vi| − Yrv,i|ri|, d23,i = −Yr,i − Yvr,i|vi| − Yrr,i|ri|,
d32,i = −Nv,i − Nvv,i|vi| − Nrv,i|ri|, d33,i = −Nr,i − Nvr,i|vi| − Nrr,i|ri|.

Following the notation of Prestero (2001) and Skjetne et al. (2005), the coefficients {X(·), Y(·), N(·)} are hydrodynamic parameters. The associated system parameters, borrowed from Skjetne et al. (2005) (with slight modifications across the different AUV agents) for simulation purposes, are listed in Table 1. For all i ∈ I[1,5], we set xg,i = 0.05 and Yṙ,i = Yrv,i = Yvr,i = Yrr,i = Nrv,i = Nrr,i = Nvv,i = Nvr,i = Nr,i = 0. Model uncertainties are given by:

Δ1=0,Δ2=0.2u22+0.3v20.950.33r2TΔ3=0.58+cosv30.23r330.74u32TΔ4=0.3100.38u42+v43TΔ5=sinv5cosu5+r50.65T.
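Under the matrix reconstruction above, which follows the standard 3-DOF surge/sway/yaw notation of Skjetne et al. (2005) (signs should be checked against the original article), the mass and Coriolis matrices can be assembled as a quick sanity check. The parameter values below are illustrative, not those of Table 1:

```python
import numpy as np

def mass_matrix(m, Iz, xg, Xud, Yvd, Yrd, Nrd):
    """M_i from rigid-body mass/inertia and added-mass derivatives."""
    m11 = m - Xud
    m22 = m - Yvd
    m23 = m * xg - Yrd
    m33 = Iz - Nrd
    return np.array([[m11, 0.0, 0.0],
                     [0.0, m22, m23],
                     [0.0, m23, m33]])

def coriolis_matrix(M, nu):
    """C_i(nu) built from the mass-matrix entries and velocities (u, v, r)."""
    u, v, r = nu
    c13 = -M[1, 1] * v - M[1, 2] * r
    c23 = M[0, 0] * u
    return np.array([[0.0, 0.0, c13],
                     [0.0, 0.0, c23],
                     [-c13, -c23, 0.0]])   # skew-symmetric by construction

# Illustrative parameters (hydrodynamic derivatives are typically negative).
M = mass_matrix(m=25.0, Iz=2.0, xg=0.05, Xud=-2.0, Yvd=-10.0, Yrd=0.0, Nrd=-1.0)
C = coriolis_matrix(M, nu=np.array([1.0, 0.5, 0.1]))
print(M[0, 0], C[0, 2])  # 27.0 -17.625
```

Checking that M is symmetric positive definite and C(ν) is skew-symmetric is a useful guard when entering the Table 1 values, since those two structural properties are what the Lyapunov analysis relies on.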


Table 1. Parameters of AUVs.

Figure 2 illustrates the communication topology and the spanning tree where agent 0 is the virtual leader and is considered as the root, in accordance with Assumption 2. The desired formation pattern requires each AUV, ηi, to track a periodic signal generated by the virtual leader η0. The dynamics of the leader are defined as follows:

[η̇0; ν̇0] = [[03×3, I3], [−I3, 03×3]] [η0; ν0], [η0(0); ν0(0)] = [0, 80, 0, 80, 0, 80]T. (21)


Figure 2. The communication network topology and spanning tree of multi-AUV system of the simulation with 0 as virtual leader.

The initial conditions and system matrix are structured to ensure all eigenvalues of A0 lie on the imaginary axis, thus satisfying Assumption 1. The reference trajectory for η0 is defined as [80sin(t),80cos(t),80sin(t)]T. The predefined offsets di*, which determine the relative positions of the AUVs to the leader, are specified as follows:

d1*=0,0,0T,d4*=10,10,0T,d2*=10,10,0T,d5*=10,10,0T,d3*=10,10,0T.

Each AUV tracks its respective position in the formation by adjusting its location to ηi=η0+di*.
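The reference geometry can be generated directly from the closed-form leader trajectory. The offset used below is hypothetical, since the printed signs of the di* vectors are not recoverable from the extraction:

```python
import numpy as np

def leader_position(t):
    """eta_0(t) = [80 sin t, 80 cos t, 80 sin t]^T, the periodic reference."""
    return np.array([80 * np.sin(t), 80 * np.cos(t), 80 * np.sin(t)])

def formation_target(t, d_star):
    """Each follower i tracks eta_0(t) + d_i^*."""
    return leader_position(t) + d_star

d2 = np.array([10.0, 10.0, 0.0])   # hypothetical offset for agent 2
print(formation_target(0.0, d2))   # leader starts at [0, 80, 0]
```

Note that this trajectory satisfies η̈0 = −η0, matching the block structure of A0 in (Equation 21) whose eigenvalues lie on the imaginary axis, as required by Assumption 1 for the learning results.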

6.1 DDL formation learning control simulation

The estimated virtual leader's state, derived from the cooperative estimator in the first layer (see Equations 3, 4), is used by each agent's DDL controller (second layer, Equations 13, 14) to handle its completely uncertain dynamics. The uncertain nonlinear functions Fi(Zi) for each agent are approximated using RBFNN, as described in Equation 11. Specifically, for each agent i ∈ {1,…,5}, the nonlinear uncertain functions Fi(Zi), dependent on νi, are modeled. The NN input Zi = [ui, vi, ri]T allows the construction of Gaussian RBFNN, represented by Wk,iTSk,i(Zi), utilizing 4,096 neurons arranged in a 16×16×16 grid. The centers of these neurons are evenly distributed over the state space [−100,100]×[−100,100]×[−100,100], and each has a width γk,i = 60, ensuring bounded and structured parameter optimization for all i ∈ {1,…,5} and k ∈ {1,2,3}.
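The described 16×16×16 center grid over [−100,100]³ with width γ = 60 can be constructed as follows (a sketch of the stated layout, not the authors' code):

```python
import numpy as np

# 16 evenly spaced center coordinates per axis over [-100, 100].
axis = np.linspace(-100.0, 100.0, 16)
cx, cy, cz = np.meshgrid(axis, axis, axis, indexing="ij")
centers = np.stack([cx, cy, cz], axis=-1).reshape(-1, 3)   # shape (4096, 3)

def regressor(Z, centers, width=60.0):
    """Gaussian activations S(Z), shared by each W_{k,i}^T S_{k,i}(Z_i)."""
    d2 = np.sum((centers - Z) ** 2, axis=1)
    return np.exp(-d2 / width ** 2)

S = regressor(np.zeros(3), centers)
print(centers.shape, S.max())   # (4096, 3) and a peak activation near 1
```

With width 60 against a grid spacing of about 13.3, neighboring Gaussians overlap substantially, which is what lets the localized network interpolate smoothly between centers along the recurrent trajectory.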

The observer and controller parameters are chosen as β1=β2=5, and the diagonal matrices K1,i=800diag{1.2,1,1} and K2,i=1200diag{1.2,1,1}, with Γk,i=10 and σk,i=0.0001 for all i{1,,5} and k{1,2,3}. The initial conditions for the agents are set as η1(0)=[30,60,0]T, η2(0)=[40,70,0]T, η3(0)=[50,80,0]T, η4(0)=[10,70,0]T, and η5(0)=[10,50,0]T. Zero initial conditions are assumed for all the distributed observer states (χi,0,Ai,0) and the DDL controller states Wk,i for all i{1,,5} and k{1,2,3}. Time-domain simulation is carried out using the DDL formation learning control laws as specified in Equations 13, 14, along with Equations 3, 4.

Figure 3 displays the simulation results of the cooperative estimator (first layer) for all five agents. It illustrates how each agent's estimated states η̂i converge to the leader's states η0 through Equations 3, 4. Figure 4 presents the position tracking control responses of all agents. Figures 4A–C illustrate the tracking performance of the AUVs along the x-axis, y-axis, and vehicle heading, respectively, demonstrating effective tracking of the leader's position signal. While the first AUV exactly tracks the leader's states, agents 2 through 5 successfully follow agent 1, maintaining the prescribed distances and alignment along the x and y axes and matching the same heading angle. These results underscore the robustness of the real-time tracking control system, which enforces the predefined formation pattern depicted in Figure 2. Additionally, Figure 5 highlights the real-time control performance for all agents, showcasing the effectiveness of the tracking strategy in maintaining the formation pattern.


Figure 3. Simulation results of the cooperative observer (first layer) for all three states (x-axis, y-axis, and vehicle heading) of each AUV: (A) x̂ix0(m), (B) ŷiy0(m), (C) ψ̂iψ0(deg).


Figure 4. Simulation results of position tracking control performance of all agents: (A) xix0(m), (B) yiy0(m), (C) ψiψ0(deg).


Figure 5. Real-time control performance in simulation for all agents, demonstrating the tracking strategy’s effectiveness in maintaining the formation pattern.

Figure 6 shows the sum of the absolute values of the neural network weights for the third agent. Its convergence reflects the network's ability to maintain consistent performance, as further adjustments to the weights become minimal. The updating of the neural network weights and their convergence to their optimal values throughout the learning process is depicted in Figure 7. This convergence of all neural network weights to their optimal values during training aligns with Theorem 4 and leads to accurate function approximation in the second layer. Figure 8 presents the successful function approximation results for the unknown system dynamics F3(Z3), as defined in Equation 11, for the third AUV using RBFNN. The approximations are plotted for both Wk,3TSk,3(Z3) and W̄k,3TSk,3(Z3) for all k ∈ I[1,3], as defined in Theorem 4. The results confirm that locally accurate approximations of the AUV's nonlinear dynamics were achieved. Moreover, this learned knowledge of the dynamics is effectively stored and represented using localized constant RBFNN.


Figure 6. Sum of the absolute values of neural network weights in simulation for the third agent, showing the network stabilized and learns a consistent tracking pattern.


Figure 7. Convergence of neural network weights of each state to their optimal values in simulation for the third agent: (A) Ŵ1,3, (B) Ŵ2,3, (C) Ŵ3,3. The stabilized weights demonstrate accurate learning in the second layer throughout the training process.


Figure 8. Simulation results of successful function approximation for all three states (k = 1, 2, 3) of the 3rd AUV: (A) k=1, (B) k=2, (C) k=3. Comparison of fk,3(Z3), Wk,3TSk,3(Z3), and W̄k,3TSk,3(Z3) using stored constant NN weights (W̄k,3).

6.2 Simulation for formation control with pre-learned dynamics

To evaluate the distributed control performance of the multi-AUV system, we implemented the pre-learned distributed formation control law. This strategy integrates the estimator observer Equations 3, 4, this time coupled with the constant RBFNN controller (Equation 20). We employed the virtual leader dynamics described in Equation 21 to generate consistent position tracking reference signals, as previously discussed in Section 6.1. To ensure a fair comparison, identical initial conditions, control gains, and inputs were used across all simulations. Figure 9 compares the tracking control results from Equations 13, 14 with the results using the pre-trained weights W̄ in Equation 20.


Figure 9. Simulation results of successful performance of position tracking control using pretrained weights (W̄): (A) xix0(m), (B) yiy0(m), (C) ψiψ0(deg).

The control experiments and simulation results presented demonstrate that the constant RBFNN control law (Equation 20) can achieve satisfactory tracking control performance comparable to that of the adaptive control laws (Equations 13, 14), but without the computational demand of online weight adaptation. Eliminating online recalculation or readaptation of the NN weights under this control strategy significantly reduces the computational load whenever the system restarts, since no retraining is needed. This reduction is particularly advantageous in scenarios involving extensive neural networks with a large number of neurons, thereby conserving system energy and enhancing operational efficiency in real-time applications.

Before concluding the paper, a brief summary of its contributions is provided:

• Distributed Observer Results: Simulations showed that the distributed observer effectively estimated the leader’s state, allowing for accurate formation control without needing global information.

• Tracking Control Results: The controller demonstrated reliable tracking of reference signals, maintaining performance even under varying conditions and unknown system dynamics.

• Formation Control: The proposed controller maintained accurate formation control relative to a virtual leader in simulations, even when the system dynamics were unknown with different AUVs.

• Neural Network Weight Convergence: The simulation results demonstrated that the neural network weights converged effectively, ensuring accurate function approximation and reliable performance in controlling AUVs under uncertainties.

• Adaptability and Stability: The framework ensured stable tracking performance across various environmental conditions by relying on the RBFNN’s learning capabilities, allowing the AUVs to use prelearned information and maintain formation control without needing to relearn dynamics whenever system restarts.

• Reduction in Computational Load: The use of pre-trained neural network weights significantly reduced the computational burden during real-time operation, particularly when large neural networks were employed.

7 Conclusion

In conclusion, this paper has introduced a novel two-layer control framework designed for Autonomous Underwater Vehicles (AUVs), aimed at universal applicability across various AUV configurations and environmental conditions. This framework assumes all system dynamics to be unknown, thereby enabling the controller to operate independently of specific dynamic parameters and effectively handle any environmental challenges, including hydrodynamic forces and torques. The framework consists of a first-layer distributed observer estimator that captures the leader's dynamics using information from adjacent agents, and a second-layer decentralized deterministic learning controller. Each AUV utilizes the estimated signals from the first layer to determine the desired trajectory, simultaneously learning its own dynamics using Radial Basis Function Neural Networks (RBFNN). This approach not only sustains stability and performance in dynamic and unpredictable environments but also allows AUVs to efficiently reuse previously learned dynamics after system restarts, facilitating rapid resumption of optimal operations. The robustness and versatility of this framework have been rigorously confirmed through comprehensive simulations, demonstrating its potential to significantly enhance the adaptability and resilience of AUV systems. By embracing total uncertainty in system dynamics, this framework establishes a new benchmark in autonomous underwater vehicle control and lays a solid groundwork for future developments aimed at minimizing energy use and maximizing system flexibility. We plan to expand this framework by accommodating more general leader dynamics and conducting experimental applications to validate its performance in real-world settings. Moreover, more accurate modeling of certain sources of uncertainty could further improve performance, which we will address in future research.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

EJ: Conceptualization, Writing–original draft, Writing–review and editing. MZ: Conceptualization, Funding acquisition, Supervision, Writing–review and editing. PS: Conceptualization, Supervision, Writing–review and editing. CY: Conceptualization, Formal Analysis, Funding acquisition, Supervision, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported in part by the National Science Foundation under Grants CMMI-1952862 and CMMI-2154901.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

1A recurrent trajectory represents a large set of periodic and periodic-like trajectories generated from linear/nonlinear dynamical systems. A detailed characterization of recurrent trajectories can be found in Wang and Hill (2018).

References

Balch, T., and Arkin, R. (1998). Behavior-based formation control for multirobot teams. IEEE Trans. Robotics Automation 14, 926–939. doi:10.1109/70.736776

Cai, H., Lewis, F. L., Hu, G., and Huang, J. (2015). “Cooperative output regulation of linear multi-agent systems by the adaptive distributed observer,” in 2015 54th IEEE Conference on Decision and Control (CDC) (IEEE), 5432–5437.

Cao, X., Ren, L., and Sun, C. (2022). Dynamic target tracking control of autonomous underwater vehicle based on trajectory prediction. IEEE Trans. Cybern. 53, 1968–1981. doi:10.1109/tcyb.2022.3189688

Christen, S., Jendele, L., Aksan, E., and Hilliges, O. (2021). Learning functionally decomposed hierarchies for continuous control tasks with path planning. IEEE Robotics Automation Lett. 6, 3623–3630. doi:10.1109/lra.2021.3060403

Cui, R., Ge, S. S., How, B. V. E., and Choo, Y. S. (2010). Leader–follower formation control of underactuated autonomous underwater vehicles. Ocean. Eng. 37, 1491–1502. doi:10.1016/j.oceaneng.2010.07.006

Dong, X., Yuan, C., Stegagno, P., Zeng, W., and Wang, C. (2019). Composite cooperative synchronization and decentralized learning of multi-robot manipulators with heterogeneous nonlinear uncertain dynamics. J. Frankl. Inst. 356, 5049–5072. doi:10.1016/j.jfranklin.2019.04.028

Fossen, T. I. (1999). Guidance and control of ocean vehicles. Doctoral thesis, University of Trondheim, Norway.

Ghafoori, S., Rabiee, A., Cetera, A., and Abiri, R. (2024). Bispectrum analysis of noninvasive eeg signals discriminates complex and natural grasp types. arXiv Prepr. arXiv:2402.01026, 1–5. doi:10.1109/embc53108.2024.10782163

Hadi, B., Khosravi, A., and Sarhadi, P. (2021). A review of the path planning and formation control for multiple autonomous underwater vehicles. J. Intelligent and Robotic Syst. 101, 67–26. doi:10.1007/s10846-021-01330-4

Hou, S. P., and Cheah, C. C. (2009). “Coordinated control of multiple autonomous underwater vehicles for pipeline inspection,” in Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference (IEEE), 3167–3172.

Ioannou, P. A., and Sun, J. (1996). Robust adaptive control, 1. Upper Saddle River, NJ: PTR Prentice-Hall.

Jandaghi, E., Chen, X., and Yuan, C. (2023). “Motion dynamics modeling and fault detection of a soft trunk robot,” in 2023 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM) (IEEE), 1324–1329.

Jandaghi, E., Stein, D. L., Hoburg, A., Stegagno, P., Zhou, M., and Yuan, C. (2024). “Composite distributed learning and synchronization of nonlinear multi-agent systems with complete uncertain dynamics,” in 2024 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), 1367–1372. doi:10.1109/aim55361.2024.10637197

Krstic, M., Kokotovic, P. V., and Kanellakopoulos, I. (1995). Nonlinear and adaptive control design. John Wiley and Sons, Inc.

Lawton, J. R. T. (2000). A behavior-based approach to multiple spacecraft formation flying. Ph.D. thesis, Brigham Young University, Provo, UT, United States.

Millán, P., Orihuela, L., Jurado, I., and Rubio, F. R. (2013). Formation control of autonomous underwater vehicles subject to communication delays. IEEE Trans. Control Syst. Technol. 22, 770–777. doi:10.1109/tcst.2013.2262768

Park, J., and Sandberg, I. W. (1991). Universal approximation using radial-basis-function networks. Neural Comput. 3, 246–257. doi:10.1162/neco.1991.3.2.246

Peng, Z., Wang, D., Shi, Y., Wang, H., and Wang, W. (2015). Containment control of networked autonomous underwater vehicles with model uncertainty and ocean disturbances guided by multiple leaders. Inf. Sci. 316, 163–179. doi:10.1016/j.ins.2015.04.025

Peng, Z., Wang, J., and Wang, D. (2017). Distributed maneuvering of autonomous surface vehicles based on neurodynamic optimization and fuzzy approximation. IEEE Trans. Control Syst. Technol. 26, 1083–1090. doi:10.1109/tcst.2017.2699167

Prestero, T. (2001). "Development of a six-degree of freedom simulation model for the REMUS autonomous underwater vehicle," in MTS/IEEE Oceans 2001: An Ocean Odyssey, Conference Proceedings (IEEE Cat. No.01CH37295), Honolulu, HI, USA, vol. 1, 450–455. doi:10.1109/OCEANS.2001.968766

Ren, W., and Beard, R. W. (2005). Consensus seeking in multiagent systems under dynamically changing interaction topologies. IEEE Trans. automatic control 50, 655–661. doi:10.1109/tac.2005.846556

Ren, W., and Beard, R. W. (2008). Distributed consensus in multi-vehicle cooperative control, 27. Springer.

Rout, R., and Subudhi, B. (2016). A backstepping approach for the formation control of multiple autonomous underwater vehicles using a leader–follower strategy. J. Mar. Eng. and Technol. 15, 38–46. doi:10.1080/20464177.2016.1173268

Skjetne, R., Fossen, T. I., and Kokotović, P. V. (2005). Adaptive maneuvering, with experiments, for a model ship in a marine control laboratory. Automatica 41, 289–298. doi:10.1016/j.automatica.2004.10.006

Su, Y., and Huang, J. (2011). Cooperative output regulation of linear multi-agent systems. IEEE Trans. Automatic Control 57, 1062–1066. doi:10.1109/TAC.2011.2169618

Tutsoy, O., Asadi, D., Ahmadi, K., Nabavi-Chashmi, S. Y., and Iqbal, J. (2024). Minimum distance and minimum time optimal path planning with bioinspired machine learning algorithms for faulty unmanned air vehicles. IEEE Trans. Intelligent Transp. Syst. 25, 9069–9077. doi:10.1109/tits.2024.3367769

Wang, C., and Hill, D. J. (2018). Deterministic learning theory for identification, recognition, and control. CRC Press. doi:10.1201/9781315221755

Yan, T., Xu, Z., Yang, S. X., and Gadsden, S. A. (2023). Formation control of multiple autonomous underwater vehicles: a review. Intell. and Robotics 3, 1–22. doi:10.20517/ir.2023.01

Yan, Z., Liu, X., Zhou, J., and Wu, D. (2018). Coordinated target tracking strategy for multiple unmanned underwater vehicles with time delays. IEEE Access 6, 10348–10357. doi:10.1109/access.2018.2793338

Yang, Y., Xiao, Y., and Li, T. (2021). A survey of autonomous underwater vehicle formation: performance, formation control, and communication capability. IEEE Commun. Surv. and Tutorials 23, 815–841. doi:10.1109/comst.2021.3059998

Yuan, C. (2017). Leader-following consensus of parameter-dependent networks via distributed gain-scheduling control. Int. J. Syst. Sci. 48, 2013–2022. doi:10.1080/00207721.2017.1309597

Yuan, C., Licht, S., and He, H. (2017). Formation learning control of multiple autonomous underwater vehicles with heterogeneous nonlinear uncertain dynamics. IEEE Trans. Cybern. 48, 2920–2934. doi:10.1109/tcyb.2017.2752458

Yuan, C., and Wang, C. (2011). Persistency of excitation and performance of deterministic learning. Syst. and control Lett. 60, 952–959. doi:10.1016/j.sysconle.2011.08.002

Yuan, C., and Wang, C. (2012). Performance of deterministic learning in noisy environments. Neurocomputing 78, 72–82. doi:10.1016/j.neucom.2011.05.037

Zhang, Y., Li, S., and Liu, X. (2018). Neural network-based model-free adaptive near-optimal tracking control for a class of nonlinear systems. IEEE Trans. neural Netw. Learn. Syst. 29, 6227–6241. doi:10.1109/tnnls.2018.2828114

Zhou, J., Si, Y., and Chen, Y. (2023). A review of subsea auv technology. J. Mar. Sci. Eng. 11, 1119. doi:10.3390/jmse11061119

Keywords: environment-independent controller, autonomous underwater vehicles (AUV), dynamic learning, formation learning control, multi-agent systems, neural network control, adaptive control, robotics

Citation: Jandaghi E, Zhou M, Stegagno P and Yuan C (2025) Adaptive formation learning control for cooperative AUVs under complete uncertainty. Front. Robot. AI 11:1491907. doi: 10.3389/frobt.2024.1491907

Received: 05 September 2024; Accepted: 12 December 2024;
Published: 14 February 2025.

Edited by:

Giovanni Iacca, University of Trento, Italy

Reviewed by:

Önder Tutsoy, Adana Science and Technology University, Türkiye
Di Wu, Harbin University of Science and Technology, China

Copyright © 2025 Jandaghi, Zhou, Stegagno and Yuan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chengzhi Yuan, cyuan@uri.edu
