Abstract:
Objectives To develop a deep reinforcement learning-based search algorithm for unmanned surface vehicles (USVs) in submarine detection tasks.
Methods The study is set in a scenario where a submarine infiltrates a key maritime area, for which a search environment and a USV kinematic model are constructed. A sonar detection probability model incorporating the effects of distance and angle is established, with explicit criteria for judging detection success. On this basis, the search task is formulated as a Markov decision process (MDP) and solved with the deep Q-network (DQN) algorithm: the state space incorporates detection probability, and a multi-objective reward function is constructed by integrating detection probability, distance, and angle. To improve learning efficiency, a temporal difference Q-network (TS-DQN) algorithm is introduced, combining a double-dueling network architecture with prioritized experience replay. In addition, a probabilistic perception-based ε-greedy strategy allows the agent to adjust its exploration behavior dynamically according to the real-time detection state, significantly improving policy learning efficiency.
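The abstract does not give the exact formulations, but as a rough illustration, a minimal Python sketch of two of the described components follows: a multi-objective reward combining detection probability, distance, and angle, and a probability-aware ε-greedy exploration rule. All weights, ranges, and functional forms here are illustrative assumptions, not the paper's actual design.

```python
import math
import random

# Illustrative sketch only: the trade-off weights, sonar range, and
# functional forms below are assumptions, not the paper's formulation.
W_P, W_D, W_A = 1.0, 0.5, 0.3   # assumed weights for probability/distance/angle terms
MAX_RANGE = 5000.0               # assumed maximum sonar detection range (m)

def reward(p_detect: float, distance: float, bearing_err: float) -> float:
    """Multi-objective reward: detection probability, USV-target distance, angle."""
    r_p = p_detect                                 # higher detection probability is better
    r_d = 1.0 - min(distance / MAX_RANGE, 1.0)     # closer to the target is better
    r_a = math.cos(bearing_err)                    # smaller bearing error is better
    return W_P * r_p + W_D * r_d + W_A * r_a

def epsilon(p_detect: float, eps_min: float = 0.05, eps_max: float = 0.9) -> float:
    """Probability-aware epsilon: explore widely when the current detection
    probability is low, exploit the learned policy as it rises."""
    return eps_min + (eps_max - eps_min) * (1.0 - p_detect)

def select_action(q_values: list[float], p_detect: float) -> int:
    """Epsilon-greedy action selection driven by the real-time detection state."""
    if random.random() < epsilon(p_detect):
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In this sketch, exploration contracts automatically as the sonar model reports higher detection probability, which is one plausible reading of how a detection-state-driven ε-greedy strategy could focus the policy once a contact is likely.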
Results Extensive simulation experiments demonstrate that the proposed method achieves a detection success rate of 38.85%, 18 times higher than that of the second-best algorithm, Dueling DQN. The method also reduces the average path length to 334.36 steps, shortening the search trajectory by more than 9.5% compared with the competing algorithms.
Conclusions The proposed algorithm exhibits significant advantages in detection efficiency and effectiveness, providing an innovative solution for advancing autonomous USV-based search and detection technologies.