ORIGINAL RESEARCH article
Front. Neurorobot., 29 January 2025
Volume 19 - 2025 | https://doi.org/10.3389/fnbot.2025.1549414
In this study, we developed an encrypted guaranteed-cost tracking control scheme for autonomous vehicles or robots (AVRs) using the adaptive dynamic programming technique. The AVR's motion is first analyzed to construct the tracking dynamics under unreliable communication. To mitigate information leakage and unauthorized access in vehicular network systems, an encrypted guaranteed-cost policy iteration algorithm is developed based on the tracking dynamics, incorporating encryption and decryption schemes between the vehicle and the cloud. Building on a simplified single-network framework, the Hamilton-Jacobi-Bellman equation is approximately solved, avoiding the complexity of dual-network structures and reducing the computational cost. Input constraints are handled using a non-quadratic value function. Furthermore, the approximate optimal control is verified to stabilize the tracking system. A case study involving an AVR system validates the effectiveness and practicality of the proposed algorithm.
Autonomous vehicles or robots (AVRs) have rapidly transformed from a futuristic concept to a tangible reality, driving significant advancements in automotive technology. The advancement of autonomous vehicle technology has increasingly focused on improving tracking control systems, which are crucial for effective vehicle guidance (Pan et al., 2023). However, a persistent issue is the unreliable communication between a local vehicle and a reference vehicle, leading to discrepancies in signal reception and affecting tracking precision. In addition, the emergence of connected vehicles (Li et al., 2019a; Liu et al., 2023b), which leverage cloud computing for data processing and optimization, presents both opportunities and challenges. These systems function as cyber–physical systems (He et al., 2014; Zhang et al., 2014; Mohan et al., 2020), integrating computational and physical processes to enhance real-time data exchange and improve overall traffic management (Jiang et al., 2022; Li et al., 2019b). However, during communication between the vehicle and the cloud, the network's homogeneous and civilian nature makes it particularly vulnerable to attacks. This vulnerability, especially in the absence of robust security protocols, exposes these systems to cyber threats, including eavesdropping.
To enhance the security of vehicular cyber–physical systems, researchers from various fields, such as communication, control systems, and information theory, have developed strategies to address cyberattacks across different layers (Han et al., 2024; Deng and Wen, 2021; Liu et al., 2021, 2023a). Various types of attacks, including denial-of-service (DoS) attacks, false data injection (FDI) attacks, and replay attacks, have been extensively studied (Teixeira et al., 2012; Li et al., 2024; Hu et al., 2023). These attacks share the characteristic of being active strategies designed to disrupt system functionality or manipulate transmitted data. Although defense mechanisms have made progress in countering such threats, the majority of existing methods concentrate on detecting and mitigating explicit attacks, often overlooking the fundamental challenge of ensuring communication security. In vehicular cybersecurity, one of the critical issues is the threat of eavesdropping attacks (Yang et al., 2020; Wu et al., 2022). Unlike the direct and active nature of DoS and FDI attacks, eavesdropping operates passively, enabling attackers to intercept sensitive information while remaining undetected. This makes it a significant long-term threat that can compromise communication confidentiality and can even enable more destructive attacks. Addressing this challenge requires advanced encryption and privacy-preserving techniques to ensure secure communication. Although these methods are effective, they do not ensure optimal control performance at minimal energy cost, as they do not incorporate the principles of optimal control.
Optimal tracking control has become a cornerstone of modern control theory, with adaptive dynamic programming (ADP) algorithms attracting considerable interest in recent years (Lu et al., 2020; Mu et al., 2017b). For non-linear optimal control problems, the principal challenge lies in solving the Hamilton-Jacobi-Bellman (HJB) equation, a problem that is nearly intractable through exact mathematical methods. ADP techniques have offered a promising alternative by leveraging neural networks (NNs) to approximate optimal solutions, leading to significant advancements across fields such as automatic control and artificial intelligence (Mu et al., 2017a; Guo et al., 2024). For example, El-Sousy et al. (2021) designed a three-network structure to approximate the solution of the HJB equation for permanent-magnet synchronous motor servo drives. Wang et al. (2020) proposed a dual-network structure to approximate local Q-functions and control policies, solving optimal consensus control for non-linear multiagent systems. Furthermore, ADP-based optimal tracking control has been widely investigated (Dong et al., 2022; Song et al., 2023), including efforts to address input-constrained systems (Yang et al., 2023; Zhang et al., 2018). However, conventional ADP approaches, particularly those employing actor-critic frameworks, are frequently hindered by significant approximation errors introduced during iterative processes and NN training, thereby restricting their practical applicability.
To address these challenges, researchers have proposed several single-network ADP methodologies designed to streamline system architectures and enhance computational efficiency in handling non-linear systems (Xu et al., 2023; Chen et al., 2021; Zou and Zhang, 2023). Chen et al. (2021) developed an event-triggered optimal control scheme for a macro–micro stage system, using a single critic NN to solve the modified HJB equation. In Guo et al. (2024), a distributed control strategy for attitude-constrained quadrotor unmanned aerial vehicles was proposed based on a critic network. Among the core ADP algorithms, value iteration and policy iteration (PI) have been widely employed, demonstrating robust performance in numerous applications (Zhang et al., 2020; Lin et al., 2023). However, the two-stage iterative procedures inherent in these methods frequently involve information transmission, which makes them susceptible to interception by adversaries. This vulnerability necessitates additional security measures, thereby increasing computational complexity and further constraining their applicability to complex systems. Although eliminating actor networks has reduced the computational burden, current ADP methods still inadequately address essential issues such as input saturation and reliable system performance, leaving these areas open for further research.
Unlike previous studies, this article proposes an encrypted guaranteed-cost tracking control scheme for an input-constrained tracking system with unreliable communication. The main contributions are summarized as follows:
1. This article introduces an encrypted guaranteed-cost tracking control scheme for AVRs under unreliable communication. Compared with existing works, this is the first attempt to integrate ADP with encryption techniques, addressing both control performance and information security challenges in vehicular networks.
2. The designed privacy-preserving control method introduces a strategy to address eavesdropping attacks in control systems. By applying consistent output masking and encryption mechanisms at both the vehicle side and the cloud side, sensitive data and critical control information are effectively protected from potential breaches. This integrated approach ensures secure data transmission while maintaining the integrity and privacy of the control system.
3. A single-network structure with enhanced computational efficiency is proposed to approximate the HJB equation. Compared to conventional dual-network designs, the single-network structure reduces computational complexity while maintaining theoretical guarantees on weight error convergence and system stability. Additionally, input saturation is explicitly addressed through the adoption of a nonlinear value function, further enhancing the robustness.
Consider an AVR operating in the X-Y plane. The position and orientation of the vehicle's mass center are represented by the posture vector [x(t); y(t); ϑ(t)],
where x(t) and y(t) denote the horizontal and vertical positions, respectively, and ϑ(t) denotes the heading direction measured counterclockwise from the X-axis. The vehicle's motion is governed by the following kinematic model:
Here, v(t) and w(t) represent the vehicle's linear and rotational velocities, respectively; a geometric parameter gives the distance between the vehicle's mass center and its drive axle; and a Jacobian matrix links the control inputs to the vehicle's motion. The control objectives are summarized as follows.
Control objective: For an AVR under unreliable communication, design an ADP-based robust optimal controller with secure information exchange to drive the vehicle along the target trajectory, such that the following objectives are achieved:
1) Robust tracking control objective: The AVR posture c := [x(t); y(t); ϑ(t)] must track the desired trajectory d := [xd(t); yd(t); ϑd(t)] under malicious cyberattacks on the tracking process, as shown in Figure 1. When an attack occurs, a small deviation arises between the received signal and the actual signal. This deviation is defined as a := [xa(t); ya(t); ϑa(t)]. We assume that a and its derivative are bounded.
With the minor difference a caused by unreliable communication, following the framework in Zhang et al. (2022), we derive the tracking error system as
where e := [xe(t); ye(t); ϑe(t)] denotes the tracking error posture, vd(t) and wd(t) are the desired linear and rotational velocities, vc(t) and wc(t) are the control inputs of the vehicle, and [γx(t); γy(t); γϑ(t)] captures the effect of cyberattacks on the received signals and is given by
This model describes the dynamic behavior of the tracking error in AVR control.
To facilitate system description and control implementation, let z = [xe(t); ye(t); ϑe(t)], f(z) = [cos(ϑe(t))vd(t); sin(ϑe(t))vd(t); wd(t)], g(z) = [−1, ye(t); 0, −xe(t); 0, 1], and u = [vc(t); wc(t)]. The system (Equation 2) is rewritten as
ż = f(z) + g(z)u + γ(t), (3)
where u is the control input and satisfies the input constraint set 𝔒 = {u : |u| ≤ ℏ}. To follow the conventional optimal tracking architecture, we can rewrite the reference trajectory as follows
where ud is the steady-state control input taking the following form
where , In denotes an n × n identity matrix.
Assumption 1. The unreliable communication term γ(t) is norm-bounded by a positive constant.
2) Prevent eavesdropping objective: As shown in Figure 1, the cloud handles monitoring, scheduling, optimization, and computation tasks, while the local controller is responsible for distributing control signals, albeit with limited data storage and processing capabilities. The cyberattack considered here is eavesdropping, where unauthorized interception of data during transmission allows attackers to steal sensitive system information, such as real-time control signals and operational states. To mitigate these risks, encryption and decryption mechanisms are implemented to safeguard the confidentiality and integrity of the transmitted data, ensuring secure communication and enhancing the system's overall reliability.
3) Optimal control objective: Based on the optimal control strategy, the AVR can achieve a compromise between performance and cost when running along a target, such that
where the integrand is the utility function with feedback control μ = u − ud, γ1 is a positive constant, the state weighting matrix is symmetric and positive definite, and the input penalty is a positive definite non-quadratic integrand function.
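The affine form of the tracking error dynamics can be made concrete with a small numerical sketch. The code below uses the f(z) and g(z) stated in the text and assumes the additive structure ż = f(z) + g(z)u + γ; the function names, the forward-Euler step, and the sample values are illustrative choices, not taken from the article.

```python
import math

def f(z, v_d, w_d):
    """Drift term f(z) = [cos(th_e) v_d; sin(th_e) v_d; w_d]."""
    _, _, th_e = z
    return [math.cos(th_e) * v_d, math.sin(th_e) * v_d, w_d]

def g(z):
    """Input matrix g(z) = [-1, y_e; 0, -x_e; 0, 1] (3 x 2), as given in the text."""
    x_e, y_e, _ = z
    return [[-1.0, y_e], [0.0, -x_e], [0.0, 1.0]]

def z_dot(z, u, gamma, v_d, w_d):
    """Right-hand side f(z) + g(z) u + gamma (assumed additive disturbance)."""
    fz, gz = f(z, v_d, w_d), g(z)
    return [fz[i] + gz[i][0] * u[0] + gz[i][1] * u[1] + gamma[i] for i in range(3)]

def euler_step(z, u, gamma, v_d, w_d, dt):
    """One forward-Euler step of the error dynamics (illustrative integrator)."""
    dz = z_dot(z, u, gamma, v_d, w_d)
    return [z[i] + dt * dz[i] for i in range(3)]

# Example: error state, control input, and a small disturbance (all hypothetical).
z0 = [1.0, 2.0, 0.0]
z1 = euler_step(z0, u=[1.0, 0.5], gamma=[0.1, 0.0, 0.0], v_d=0.5, w_d=0.04, dt=0.1)
```

Keeping f, g, and γ as separate pieces mirrors the structure that the later value-function and critic-update expressions rely on.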
In this section, based on the preceding analysis, the tracking problem is reformulated into a stabilization problem for the error dynamics. To address this, a cryptography-based controller is designed, which not only mitigates the impact of unreliable communication but also ensures the security of information transmission against eavesdropping.
To effectively counter eavesdropping attacks on data transmitted between the vehicle side and the cloud side, privacy-preserving rules are designed for both sides. The encryption and decryption formulas (Han et al., 2024) for each iteration are provided in the following.
1) AVR to Cloud:
Encryption process: At the vehicle side, the data z to be sent are extracted from Equation 3 and encrypted using Equation 7, resulting in the encrypted data zr. The encrypted data are then transmitted to the cloud. The encryption formula is as follows:
where a(t) and ξ(t) are encryption operators, δ1, ρ1, and ρ2 are constants, and A is the channel assignment matrix. To simplify the presentation of the method, it is assumed that s(z)(t − 1) is already stored in the cloud. The value (z) needs to be calculated on the cloud side. Its design is detailed in Section 3.2, and it serves as an essential component of the controller μ.
Decryption process: The cloud side receives the encrypted data zr and decrypts it to recover the original data z. The decryption formula is as follows:
where c(t) is the counterpart of a(t). It is observed that the design forms of the encryption operators a(t) and ξ(t), and encrypted expressions are shared between the vehicle side and the cloud. Furthermore, the parameters A, δ1, ρ1, and ρ2 are also shared.
2) Cloud to AVR:
Encryption process: After policy evaluation, the computed (z) is encrypted using Equation 9 and sent back to the vehicle.
where b(t) and ζ(t) are encryption operators, δ2, ϱ1, and ϱ2 are constants, and B is the channel assignment matrix.
Decryption process: At the vehicle side, the received encrypted data r(z) is decrypted using Equation 10 to recover (z) for policy improvement.
where d(t) is the counterpart of b(t). Similarly, the design forms of the encryption operators b(t) and ζ(t), and encrypted expressions are shared between the vehicle side and the cloud. Furthermore, the parameters B, δ2, ϱ1, and ϱ2 are also shared. At this point, a complete iteration of privacy-preserving processing has been completed.
From the above encryption and decryption processes, it can be observed that the introduced masking signals ξ(t) and ζ(t) and the encryption formula designs effectively ensure privacy during data transmission between the vehicle and the cloud. Notably, the data transmitted over the network are not the raw values z and (z) but their encrypted counterparts, zs, zr, s(z), and r(z), which effectively prevent unauthorized entities from intercepting sensitive information.
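The idea of sending only masked values over the network, with both sides able to regenerate the mask from shared parameters, can be illustrated with a deliberately simple round trip. This is not the article's Equations 7–10, which are not reproduced here: the additive mask, the shared seed, the scalar gain standing in for a channel assignment matrix, and all function names are assumptions made for illustration only.

```python
import random

def mask(shared_seed, t, n, scale=5.0):
    """Hypothetical time-varying masking signal regenerated identically on both sides.

    Python's `random` is NOT cryptographically secure; it is used here only to
    show the data flow, not as a recommended cipher.
    """
    rng = random.Random(f"{shared_seed}:{t}")
    return [rng.uniform(-scale, scale) for _ in range(n)]

def encrypt(z, t, shared_seed, gain):
    """Sender side: scale the data and add the mask for time step t."""
    xi = mask(shared_seed, t, len(z))
    return [gain * zi + m for zi, m in zip(z, xi)]

def decrypt(z_r, t, shared_seed, gain):
    """Receiver side: regenerate the same mask, subtract it, and undo the gain."""
    xi = mask(shared_seed, t, len(z_r))
    return [(ri - m) / gain for ri, m in zip(z_r, xi)]

# Vehicle -> cloud round trip for one time step (values are hypothetical).
z = [0.4, -1.2, 0.05]
z_r = encrypt(z, t=3, shared_seed=2025, gain=1.0)       # what an eavesdropper sees
z_back = decrypt(z_r, t=3, shared_seed=2025, gain=1.0)  # what the cloud recovers
```

The same pair of functions could be reused in the cloud-to-vehicle direction with a second set of shared parameters, which matches the two-way structure described above.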
The objective is to stabilize Equation 3 by constructing an encrypted iterative algorithm that minimizes the performance index function, thereby reducing control costs and enhancing system security. Recalling Equation 6, the performance index is
where
where = diag{[r1, ..., rm]} > 0, . The function h(·) is assumed to be a monotonic odd function satisfying h(0) = 0. For the purposes of this article, h(·) is specifically selected as the hyperbolic tangent, h(x) = (e^x − e^(−x))/(e^x + e^(−x)) = tanh(x).
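To see how a tanh-based penalty behaves, the sketch below evaluates a form that is common in constrained ADP designs, W(μ) = 2ℏr ∫₀^μ tanh⁻¹(s/ℏ) ds for a scalar input, and checks its closed form against numerical integration. Whether the article uses exactly this expression cannot be confirmed from the text, so the formula and the function names are assumptions.

```python
import math

def nonquadratic_cost(mu, bound, r):
    """Closed form of 2*bound*r * integral_0^mu atanh(s/bound) ds, valid for |mu| < bound."""
    s = mu / bound
    return 2.0 * bound * r * mu * math.atanh(s) + bound ** 2 * r * math.log(1.0 - s * s)

def nonquadratic_cost_numeric(mu, bound, r, steps=20000):
    """Midpoint-rule evaluation of the same integral, used only as a cross-check."""
    h = mu / steps
    total = sum(math.atanh((k + 0.5) * h / bound) for k in range(steps))
    return 2.0 * bound * r * total * h

# Penalty for an input of 1.0 under the bound 1.5 used in the case study.
w_closed = nonquadratic_cost(1.0, bound=1.5, r=1.0)
w_numeric = nonquadratic_cost_numeric(1.0, bound=1.5, r=1.0)
```

Near zero this penalty behaves like the quadratic r·μ², and it grows faster as |μ| approaches the bound, which is what discourages saturated inputs.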
According to optimal control theory, the value function in Equation 11 serves as a Lyapunov function for the system in Equation 3, and the Hamiltonian function can be derived as
with . On defining *(z) as the minimum value of Equation 11, based on Bellman's principle of optimality, we have
and the optimal control u* is obtained from :
Substituting Equation 15 into Equation 12 yields
where and . Then, the HJB equation can be derived as
As highlighted in the preceding analysis, obtaining the optimal controller in Equation 15 necessitates solving the HJB Equation 17, a task well-known for its considerable computational and analytical challenges. To overcome this challenge, an iterative algorithm based on ADP is employed to obtain an approximate solution. The details of this iterative algorithm are presented in Algorithm 1.
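With a tanh-based penalty, constrained ADP designs commonly arrive at a control of the form −ℏ·tanh((1/(2ℏ))·R⁻¹gᵀ∇V), which stays inside the input bound by construction. The sketch below assumes that form and a diagonal R; the function name and the sample gradient values are hypothetical, and the article's Equation 15 may differ in detail.

```python
import math

def constrained_control(grad_v, g_mat, r_diag, bound):
    """Saturated feedback -bound * tanh(g^T grad_v / (2 * bound * r_j)) per input channel."""
    n, m = len(grad_v), len(r_diag)
    control = []
    for j in range(m):
        gt_grad = sum(g_mat[i][j] * grad_v[i] for i in range(n))  # j-th entry of g^T grad_v
        control.append(-bound * math.tanh(gt_grad / (2.0 * bound * r_diag[j])))
    return control

# g(z) evaluated at x_e = 1, y_e = 2, following g(z) = [-1, y_e; 0, -x_e; 0, 1].
g_mat = [[-1.0, 2.0], [0.0, -1.0], [0.0, 1.0]]

# A large gradient drives both channels to the bound without exceeding it.
u_large = constrained_control([100.0, -50.0, 20.0], g_mat, [1.0, 1.0], bound=1.5)

# A small gradient gives nearly the unconstrained law -(1/2) R^-1 g^T grad_v.
u_small = constrained_control([0.01, 0.0, 0.0], g_mat, [1.0, 1.0], bound=1.5)
```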
Lemma 1. By utilizing the encrypted PI process as described in Algorithm 1, which incorporates encryption and decryption steps for secure control of the tracking error dynamics in an AVR, the resulting control uς ensures the asymptotic stability of the system dynamics. Additionally, ς(z) will converge to the optimal value function *(z) as ς → ∞, ensuring that uς converges to the optimal control u*.
Proof. Initially, before any iteration, the control u1 is assumed to be admissible. For every uς produced during the iterations, consider the Lyapunov function ς(z), which satisfies
According to HJB Equation 17, we can derive
where μς = uς − ud. Then, substituting Equation 21 into Equation 22 yields
Therefore, the iteration process ensures that the error dynamics remain asymptotically stable. Moreover, policy improvement is achieved by minimizing the associated value function, consistent with the Kleinman method, guaranteeing convergence. As the iteration count ς → ∞, the value function converges to the optimal value function and uς converges to the optimal control u*. This concludes the proof. □
Based on Lemma 1, the iterative process, enhanced with secure encryption and decryption, converges, leading to optimal control as the approximation errors diminish.
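The evaluate-then-improve structure of policy iteration is easiest to see on a scalar linear-quadratic problem, where Kleinman's iteration has a closed form at each step. The sketch below is a simplified analogue, not Algorithm 1: it is unencrypted, linear, and unconstrained, and the system ẋ = ax + bu with cost ∫(qx² + ru²)dt is chosen purely for illustration.

```python
import math

def policy_iteration(a, b, q, r, k0, iters=20):
    """Kleinman-style iteration for x' = a x + b u with u = -k x and V(x) = p x^2."""
    if a - b * k0 >= 0:
        raise ValueError("initial gain must be stabilizing (admissible)")
    k, history = k0, []
    for _ in range(iters):
        # Policy evaluation: solve 2 p (a - b k) + q + r k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: u = -(1/2) r^-1 b dV/dx = -(b p / r) x.
        k = b * p / r
        history.append(p)
    return p, k, history

def riccati_solution(a, b, q, r):
    """Positive root of 2 a p - b^2 p^2 / r + q = 0, the optimal value coefficient."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

p_final, k_final, p_history = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
p_star = riccati_solution(1.0, 1.0, 1.0, 1.0)
```

In the encrypted scheme, the evaluation step would run on the cloud and the improvement step on the vehicle, with the exchanged quantities masked in transit.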
In this section, the fundamental update equations of PI are used to design an NN, with the critic neural network (CNN) approximating the solution of the HJB Equation 17 during each iteration step. Based on the universal approximation property of NNs, there exist ideal weights * such that the ideal value function can be approximated as
where φ(z) ∈ ℝα denotes activation functions and α is the number of neurons. Utilizing Equation 23, HJB Equation 17 becomes
where
with and . Therefore, by defining residual error ϵH, Equation 24 can be rewritten as
where
with , 2i(z) ∈ [i(z), 1i(z)]. Note that if the number of hidden layer neurons α is sufficiently large, the residual error ϵH will approach zero. Based on the Lipschitz assumption of the system dynamics, this ϵH is bounded within a compact set, that is, . Therefore, based on Equation 23 the ideal optimal control is
where , ψi ∈ [i, 1i].
Since the ideal weight is unknown, the approximated value function is
where is approximated value of *. Then, we can get
Thus, approximated Hamiltonian function is
where is the residual error due to NN approximation error.
Furthermore, let us consider , and to ensure that converge toward the optimal weights *, the weight update formula (Zhang et al., 2018) is
where η is the learning rate, τ = ∇φ(z)(f(z)+g(z)û+γ), ϖ = τT τ + 1, and 1 and 2 are tuning matrices. , . Based on Lemma 2 of Zhang et al. (2018), a denotes a Lyapunov function, and if ∇a(f(z)+gû + γ) > 0, then κ = 0, else κ = 1. Defining , we obtain
with , .
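The core of a critic update of this kind is a normalized gradient step on the squared Hamiltonian residual, using the regressor τ and the normalizer ϖ = τᵀτ + 1 defined above. The sketch below shows only that core term under stated assumptions: the κ-switched stabilizing term and the tuning matrices of the article's update law are omitted, and the function name and sample values are hypothetical.

```python
def critic_step(weights, tau, utility, eta, dt):
    """One Euler step of normalized gradient descent on 0.5 * e^2, e = w^T tau + utility.

    Returns the updated weights and the residual e evaluated before the update.
    """
    residual = sum(w * t for w, t in zip(weights, tau)) + utility
    varpi = sum(t * t for t in tau) + 1.0                 # normalizer tau^T tau + 1
    scale = eta * dt * residual / (varpi * varpi)
    new_weights = [w - scale * t for w, t in zip(weights, tau)]
    return new_weights, residual

# Hypothetical critic weights, regressor, and utility value at one sample.
w0 = [0.5, -0.2, 0.1]
tau = [1.0, 2.0, -1.0]
w1, e0 = critic_step(w0, tau, utility=0.3, eta=10.0, dt=0.1)
_, e1 = critic_step(w1, tau, utility=0.3, eta=10.0, dt=0.1)  # residual after one update
```

Dividing by ϖ² keeps the step size bounded however large the regressor becomes, which is why this normalization is standard in single-critic designs.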
Theorem 1. For the optimal control policy described in Equation 30, the weight tuning law of the CNN is determined by the update formula provided in Equation 32. Under this design, the error dynamic system z and the weight errors are uniformly ultimately bounded (UUB).
Proof. Define the Lyapunov function as = 1 + 2, where
First, along Equation 33, the derivative of 2 is
where
Supposing that , , and due to is bound, , . Therefore, is
Owing to κ of 2, is divided into two parts. For κ = 0, we have
Following Rudin et al. (1964), we know , ||z|| ≤ zm, zm > 0; thus, becomes
Moreover, if
or
According to Equation 39, we can derive
For κ = 1, is
Regarding , using the Taylor series, we know
where is the higher order term and satisfies
where ||g|| ≤ ḡ, ḡ > 0 and , .
Recalling Equations 28–30, the term in Equation 41 with respect to ∇ag can be written as
Until now, Equation 41 can be rewritten as
where , . Let ℓ1 and ℓ2 satisfy 0 < ℓ1 < 1, 0 < ℓ2 < 1, and ℓ1 + ℓ2 = 1. Then, Equation 44 can be rewritten as
Therefore, if
or
Similar to Equation 40, we can derive
By considering the two cases, κ = 0 and κ = 1, and based on the derived results as expressed in Equations 38–40 and Equations 45–47, we can conclude that the function ∇a and the error weights are UUB. Furthermore, knowing that a is in polynomial form, it follows that the error z is also UUB.
Remark 1. The algorithm designed in this article is depicted in Figure 2, where Algorithm 1 is implemented using a CNN. The CNN generates the estimated value function , which is subsequently used to derive the approximated optimal control law û based on Equation 30. In contrast to the constrained optimal control designs presented by Zou and Zhang (2023) and Chen et al. (2021), this work integrates privacy-preserving mechanisms during information transmission by leveraging encryption and decryption techniques. This incorporation not only safeguards data confidentiality but also enhances the overall security and reliability of the proposed algorithm.
To analyze the tracking performance of the AVR, we conduct simulations based on a predefined tracking error dynamic model. The tracking error dynamics is modeled as
where the distance from the vehicle's center of mass to the rear axle is set to −1.2 m in this article. The desired reference trajectory is initialized with the state:
and the vehicle's trajectory is initialized with the state:
Consequently, the initial tracking error is
The reference trajectory's desired velocities are vd = 0.5 and wd = 0.04. Under the input constraints, ℏ = 1.5, meaning the constraint range is [−1.5, 1.5]. The unreliable communication γ is defined as
where σ is a random variable uniformly distributed in σ ∈ [−0.1, 0.1]. For the performance evaluation, we define the cost function using the weighting matrices
The activation function vector of the CNN includes sin(z1), sin(z2), and sin(z3). The encryption parameters are ρ1 = 1.1, ρ2 = 1.03, ϱ1 = 3.2, ϱ2 = 1.08, A = 1, and B = 1.
Using the proposed method, Figure 3A illustrates the two-dimensional trajectory of the AVR. The vehicle quickly adjusts its direction and begins tracking the reference trajectory with good accuracy. After the initial phase, the vehicle follows the desired trajectory smoothly and closely. Figures 3B–G depict the tracking performance and error, demonstrating that the position error gradually reduces to zero, while the directional error also diminishes to zero, effectively ensuring precise position tracking throughout the process.
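For readers who want to reproduce the shape of the reference path, the stated desired velocities vd = 0.5 and wd = 0.04 correspond to a circle of radius vd/wd = 12.5 when driven through a standard unicycle model. The sketch below assumes that model, a forward-Euler integrator, and a start at the origin; the article's actual initial state is not reproduced here.

```python
import math

def reference_trajectory(v_d, w_d, t_end, dt, start=(0.0, 0.0, 0.0)):
    """Integrate x' = v cos(th), y' = v sin(th), th' = w with forward Euler."""
    x, y, th = start
    trajectory = [(x, y, th)]
    for _ in range(int(round(t_end / dt))):
        x += dt * v_d * math.cos(th)
        y += dt * v_d * math.sin(th)
        th += dt * w_d
        trajectory.append((x, y, th))
    return trajectory

# One full revolution takes 2*pi / w_d seconds.
period = 2.0 * math.pi / 0.04
ref = reference_trajectory(v_d=0.5, w_d=0.04, t_end=period, dt=0.01)
radius = 0.5 / 0.04  # circle radius implied by the desired velocities
```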
Figure 3. AVR driving trajectories. (A) The X-Y plot of tracking trajectories. (B–D) Tracking trajectories. (E–G) Tracking errors.
Figure 4 displays the evolution of the designed controller during the vehicle's tracking process. The dashed lines indicate the upper and lower bounds of the input constraints, which are set to [−1.5, 1.5]. The privacy-preserving characteristics of the proposed scheme are illustrated in Figure 5. It is evident that masking the vehicle-side output z effectively safeguards its privacy from potential attackers. Meanwhile, as shown in Figure 6, masking on the cloud side further prevents the leakage of critical information related to the designed control strategy. Therefore, these results ensure robust privacy protection during data transmission.
This study develops an encrypted guaranteed-cost tracking control scheme to address the challenges of information security and computational efficiency in AVR systems using the adaptive dynamic programming technique. By leveraging ADP and integrating encryption mechanisms between the vehicle and the cloud, the proposed method ensures stable tracking performance under unreliable communication. The input constraints are successfully managed using a nonlinear value function, while the CNN facilitates an efficient solution to the HJB equation. Simulation results from a case study confirm the stability and effectiveness of the designed algorithm, demonstrating its potential for real-world applications in AVR networks. Future work will focus on ensuring the security of cloud-based computations by processing encrypted data, further enhancing the safety and reliability of cloud operations in vehicular network systems.
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
KZ: Conceptualization, Methodology, Writing – original draft. KH: Methodology, Writing – review & editing. ZH: Conceptualization, Supervision, Writing – review & editing. GT: Validation, Writing – review & editing.
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work is supported by Beijing Nova Program (20240484516), the Fundamental Research Funds for the Central Universities (KG16314701), and Beihang World TOP University Cooperation Program.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declare that no Gen AI was used in the creation of this manuscript.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Chen, X., Chen, X., Bai, W., and Guo, Z. (2021). Event-triggered optimal control for macro–micro composite stage system via single-network ADP method. IEEE Trans. Indust. Elect. 68, 4190–4198. doi: 10.1109/TIE.2020.2984462
Deng, C., and Wen, C. (2021). MAS-based distributed resilient control for a class of cyber-physical systems with communication delays under DoS attacks. IEEE Trans. Cybern. 51, 2347–2358. doi: 10.1109/TCYB.2020.2972686
Dong, H., Zhao, X., and Luo, B. (2022). Optimal tracking control for uncertain nonlinear systems with prescribed performance via critic-only ADP. IEEE Trans. Syst. Man, Cybernet.: Syst. 52, 561–573. doi: 10.1109/TSMC.2020.3003797
El-Sousy, F. F. M., Amin, M. M., and Al-Durra, A. (2021). Adaptive optimal tracking control via actor-critic-identifier based adaptive dynamic programming for permanent-magnet synchronous motor drive system. IEEE Trans. Ind. Appl. 57, 6577–6591. doi: 10.1109/TIA.2021.3110936
Guo, Z., Li, H., Ma, H., and Meng, W. (2024). Distributed optimal attitude synchronization control of multiple QUAVs via adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 35, 8053–8063. doi: 10.1109/TNNLS.2022.3224029
Han, K., Zhang, K., Wang, Z.-P., and Su, R. (2024). Resilient predictive load frequency control of multi-area interconnected power systems with privacy preserving and active detection against stealthy cyber attacks. IEEE Intern. Things J. 7, 4387–4394. doi: 10.1109/JIOT.2024.3507291
He, W., Yan, G., and Xu, L. D. (2014). Developing vehicular data cloud services in the IoT environment. IEEE Trans. Indust. Inform. 10, 1587–1595. doi: 10.1109/TII.2014.2299233
Hu, S., Ge, X., Chen, X., and Yue, D. (2023). Resilient load frequency control of islanded AC microgrids under concurrent false data injection and denial-of-service attacks. IEEE Trans. Smart Grid 14, 690–700. doi: 10.1109/TSG.2022.3190680
Jiang, M., Wu, T., Wang, Z., Gong, Y., Zhang, L., and Liu, R. P. (2022). A multi-intersection vehicular cooperative control based on end-edge-cloud computing. IEEE Trans. Vehicular Technol. 71, 2459–2471. doi: 10.1109/TVT.2022.3143828
Li, Y., Tang, C., Li, K., He, X., Peeta, S., and Wang, Y. (2019a). Consensus-based cooperative control for multi-platoon under the connected vehicles environment. IEEE Trans. Intellig. Transport. Syst. 20, 2220–2229. doi: 10.1109/TITS.2018.2865575
Li, Y., Tang, C., Peeta, S., and Wang, Y. (2019b). Nonlinear consensus-based connected vehicle platoon control incorporating car-following interactions and heterogeneous time delays. IEEE Trans. Intellig. Transport. Syst. 20, 2209–2219. doi: 10.1109/TITS.2018.2865546
Li, Z., Shi, Y., Xu, S., Xu, H., and Dong, L. (2024). Distributed model predictive consensus of MASs against false data injection attacks and denial-of-service attacks. IEEE Trans. Automat. Contr. 69, 5538–5545. doi: 10.1109/TAC.2024.3371895
Lin, Z., Ma, J., Duan, J., Li, S. E., Ma, H., Cheng, B., et al. (2023). Policy iteration based approximate dynamic programming toward autonomous driving in constrained dynamic environment. IEEE Trans. Intellig. Transp. Syst. 24, 5003–5013. doi: 10.1109/TITS.2023.3237568
Liu, K., Zhang, H., Zhang, Y., and Sun, C. (2023a). False data-injection attack detection in cyber–physical systems with unknown parameters: a deep reinforcement learning approach. IEEE Trans. Cybern. 53, 7115–7125. doi: 10.1109/TCYB.2022.3225236
Liu, R., Hao, F., and Yu, H. (2021). Optimal SINR-based DoS attack scheduling for remote state estimation via adaptive dynamic programming approach. IEEE Trans. Syst. Man, Cybernet.: Syst. 51, 7622–7632. doi: 10.1109/TSMC.2020.2981478
Liu, T., Cui, L., Pang, B., and Jiang, Z.-P. (2023b). A unified framework for data-driven optimal control of connected vehicles in mixed traffic. IEEE Trans. Intellig. Vehicl. 8, 4131–4145. doi: 10.1109/TIV.2023.3287131
Lu, J., Wei, Q., and Wang, F.-Y. (2020). Parallel control for optimal tracking via adaptive dynamic programming. IEEE/CAA J. Automat. Sinica 7, 1662–1674. doi: 10.1109/JAS.2020.1003426
Mohan, A. M., Meskin, N., and Mehrjerdi, H. (2020). A comprehensive review of the cyber-attacks and cyber-security on load frequency control of power systems. Energies 13:15. doi: 10.3390/en13153860
Mu, C., Ni, Z., Sun, C., and He, H. (2017a). Air-breathing hypersonic vehicle tracking control based on adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 28, 584–598. doi: 10.1109/TNNLS.2016.2516948
Mu, C., Ni, Z., Sun, C., and He, H. (2017b). Data-driven tracking control with adaptive dynamic programming for a class of continuous-time nonlinear systems. IEEE Trans. Cybern. 47, 1460–1470. doi: 10.1109/TCYB.2016.2548941
Pan, H., Zhang, C., and Sun, W. (2023). Fault-tolerant multiplayer tracking control for autonomous vehicle via model-free adaptive dynamic programming. IEEE Trans. Reliab. 72, 1395–1406. doi: 10.1109/TR.2022.3208467
Song, S., Gong, D., Zhu, M., Zhao, Y., and Huang, C. (2023). Data-driven optimal tracking control for discrete-time nonlinear systems with unknown dynamics using deterministic ADP. IEEE Trans. Neural Netw. Learn. Syst. 36, 1184–1198. doi: 10.1109/TNNLS.2023.3323142
Teixeira, A., Pérez, D., Sandberg, H., and Johansson, K. H. (2012). “Attack models and scenarios for networked control systems,” in Proceedings of the 1st International Conference on High Confidence Networked Systems (New York, NY: Association for Computing Machinery), 55–64.
Wang, W., Chen, X., Fu, H., and Wu, M. (2020). Model-free distributed consensus control based on actor–critic framework for discrete-time nonlinear multiagent systems. IEEE Trans. Syst. Man, Cybernet.: Syst. 50, 4123–4134. doi: 10.1109/TSMC.2018.2883801
Wu, H., Li, M., Gao, Q., Wei, Z., Zhang, N., and Tao, X. (2022). Eavesdropping and anti-eavesdropping game in UAV wiretap system: a differential game approach. IEEE Trans. Wireless Commun. 21, 9906–9920. doi: 10.1109/TWC.2022.3180395
Xu, Y., Li, T., Yang, Y., Tong, S., and Chen, C. L. P. (2023). Simplified ADP for event-triggered control of multiagent systems against FDI attacks. IEEE Trans. Syst. Man, Cybernet.: Syst. 53, 4672–4683. doi: 10.1109/TSMC.2023.3257031
Yang, W., Zheng, Z., Chen, G., Tang, Y., and Wang, X. (2020). Security analysis of a distributed networked system under eavesdropping attacks. IEEE Trans. Circuits Systems II: Express Briefs 67, 1254–1258. doi: 10.1109/TCSII.2019.2928558
Yang, X., Xu, M., and Wei, Q. (2023). Approximate dynamic programming for event-driven H∞ constrained control. IEEE Trans. Syst. Man, Cybernet.: Syst. 53, 5922–5932. doi: 10.1109/TSMC.2023.3277737
Zhang, H., Qu, Q., Xiao, G., and Cui, Y. (2018). Optimal guaranteed cost sliding mode control for constrained-input nonlinear systems with matched and unmatched disturbances. IEEE Trans. Neural Netw. Learn. Syst. 29, 2112–2126. doi: 10.1109/TNNLS.2018.2791419
Zhang, K., Liang, X., Lu, R., and Shen, X. (2014). Sybil attacks and their defenses in the internet of things. IEEE Intern. Things J. 1, 372–383. doi: 10.1109/JIOT.2014.2344013
Zhang, K., Zhang, H., Cai, Y., and Su, R. (2020). Parallel optimal tracking control schemes for mode-dependent control of coupled Markov jump systems via integral RL method. IEEE Trans. Autom. Sci. Eng. 17, 1332–1342. doi: 10.1109/TASE.2019.2948431
Zhang, K., Zhang, H., Xue, W., and Zhang, R. (2022). “A robust control scheme for autonomous vehicles path tracking under unreliable communication,” in 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS) (Chengdu: IEEE), 1413–1418. doi: 10.1109/DDCLS55054.2022.9858512
Keywords: adaptive dynamic programming, encryption and decryption, tracking control, optimal control, autonomous vehicle
Citation: Zhang K, Han K, Hu Z and Tan G (2025) Privacy-preserving ADP for secure tracking control of AVRs against unreliable communication. Front. Neurorobot. 19:1549414. doi: 10.3389/fnbot.2025.1549414
Received: 21 December 2024; Accepted: 10 January 2025;
Published: 29 January 2025.
Edited by: Ming-Feng Ge, China University of Geosciences Wuhan, China
Copyright © 2025 Zhang, Han, Hu and Tan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Zhijian Hu, emhpamlhbi5odUBsYWFzLmZy