Performance-prescribed reinforcement learning-based trajectory tracking control for an unmanned surface vehicle

Wang Ning (王宁); Wang Runxu (王润旭); Huo Yang (霍阳)

doi:10.19693/j.issn.1673-3185.04946

Abstract:
Objective: Propulsion-system saturation can cause the trajectory tracking errors of an unmanned surface vehicle (USV) to oscillate and become unstable in complex navigation environments. To address this problem, a reinforcement learning-based optimal control strategy with prescribed performance constraints is proposed.
Method: First, a smooth saturation function is introduced to handle the amplitude constraints on the control inputs. Second, an improved prescribed performance control scheme is designed in which asymmetric performance boundaries strictly constrain the convergence range of the tracking errors, thereby relaxing the strict dependence on initial error conditions. Third, a reinforcement learning optimization framework based on Actor-Critic networks is established; it approximates the optimal control policy and value function through online iterative learning, optimizing USV tracking performance under state constraints. Finally, the stability of the closed-loop tracking control system is proven using Lyapunov stability theory.
Results: Numerical simulations on the ocean-going tanker KVLCC2 show that the proposed method effectively handles USV trajectory tracking under saturation constraints, with the tracking errors remaining within the prescribed performance boundaries at all times.
Conclusion: The study provides a new solution for high-performance tracking control of constrained USVs and has practical engineering application value.
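
The abstract names two building blocks, a smooth saturation function and an asymmetric performance boundary, without giving their formulas. The sketch below illustrates both using forms that are common in the prescribed performance control literature; they are assumptions, not the authors' definitions. It uses a tanh-based saturation and an exponentially decaying envelope rho(t) = (rho0 - rho_inf) * exp(-decay * t) + rho_inf, scaled by different lower and upper factors. All function names and parameters (`smooth_saturation`, `performance_bounds`, `within_bounds`, `rho0`, `rho_inf`, `decay`, `delta_lower`, `delta_upper`) are hypothetical.

```python
import math


def smooth_saturation(u_cmd: float, u_max: float) -> float:
    """Assumed tanh-based smooth saturation.

    Near zero the output approximately equals the commanded input u_cmd;
    for large commands it levels off smoothly at +/- u_max.
    """
    return u_max * math.tanh(u_cmd / u_max)


def performance_bounds(t: float, rho0: float, rho_inf: float, decay: float,
                       delta_lower: float, delta_upper: float) -> tuple[float, float]:
    """Assumed asymmetric performance envelope at time t.

    rho(t) decays exponentially from rho0 to rho_inf. The admissible
    error interval is (-delta_lower * rho(t), delta_upper * rho(t)),
    which is asymmetric whenever delta_lower != delta_upper.
    """
    rho = (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf
    return -delta_lower * rho, delta_upper * rho


def within_bounds(error: float, t: float, **params: float) -> bool:
    """Check whether a tracking error lies strictly inside the envelope."""
    lower, upper = performance_bounds(t, **params)
    return lower < error < upper


if __name__ == "__main__":
    # Illustrative parameter values only; not taken from the paper.
    params = dict(rho0=2.0, rho_inf=0.1, decay=1.0,
                  delta_lower=0.5, delta_upper=1.0)
    print(smooth_saturation(100.0, u_max=5.0))
    print(performance_bounds(0.0, **params))
    print(within_bounds(1.5, 0.0, **params))
```

With `delta_lower` smaller than `delta_upper`, the envelope tolerates a larger positive than negative error at the same instant, which is one simple way to express an asymmetric boundary. A controller built on such an envelope would typically transform the bounded error into an unconstrained variable before applying the learning-based control law; that step depends on details the abstract does not give and is omitted here.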
