Abstract:
Objective To address the challenges of high reasoning latency, redundant logical chains, and poor real-time performance when applying chain-of-thought (CoT) reasoning with large language models (LLMs) to the collaborative autonomy of unmanned surface vehicle (USV) swarms, this paper proposes an autonomous cooperation algorithm based on the collaboration between large and small language models. The objective is to substantially reduce the inference latency of the decision-making model while maintaining output quality, thereby providing a highly effective technical framework for real-time decision-making in complex maritime environments.
Method The proposed approach is built around two core technical innovations. First, a two-stage logical compression process is implemented through sentence-level masking and word-level semantic pruning. This process removes redundant logical expressions from the original CoT reasoning sequences and produces streamlined keyword-based representations. The compressed sequences are then used to fine-tune lightweight models, enabling them to generate the essential CoT reasoning content directly from task descriptions. Second, a multi-model collaborative mechanism is introduced, in which three lightweight 8B-parameter models generate candidate decision solutions in parallel, and a 32B-parameter verifier model performs confidence-based evaluation to rapidly select the optimal solution. Together, these components form an efficient closed-loop "compression-generation-verification" pipeline.
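The closed-loop pipeline above can be illustrated with a minimal sketch. All names here (`STOPWORDS`, `sentence_mask`, `word_prune`, `select_best`) and the rule-based scoring are hypothetical stand-ins: the paper's compression and verification are model-based, whereas this sketch uses simple keyword-overlap heuristics purely to show the data flow.

```python
import re

# Hypothetical function-word list; the paper's pruning criteria are learned.
STOPWORDS = {"the", "a", "an", "is", "are", "we", "so", "then", "that",
             "of", "to", "and", "in", "it", "this", "can", "be", "should"}

def sentence_mask(cot, task_keywords, keep_ratio=0.5):
    """Stage 1 (sentence-level masking): keep the sentences that overlap
    most with the task description, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"[.!?]", cot) if s.strip()]
    ranked = sorted(sentences,
                    key=lambda s: -len(set(s.lower().split()) & task_keywords))
    kept = set(ranked[:max(1, int(len(sentences) * keep_ratio))])
    return [s for s in sentences if s in kept]

def word_prune(sentence):
    """Stage 2 (word-level semantic pruning): drop function words,
    leaving a streamlined keyword-based representation."""
    return " ".join(w for w in sentence.split() if w.lower() not in STOPWORDS)

def compress_cot(cot, task_keywords):
    """Two-stage compression: mask sentences, then prune words."""
    return "; ".join(word_prune(s) for s in sentence_mask(cot, task_keywords))

def select_best(candidates, verifier):
    """Verification step: a stand-in for the 32B verifier scoring the
    parallel 8B candidates by confidence and returning the best one."""
    return max(candidates, key=verifier)
```

In the actual system the three 8B models would each emit a candidate decision, and `verifier` would be the 32B model's confidence score rather than a lookup; the sketch only fixes the control flow of "compress, generate in parallel, verify, select".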
Results Extensive experiments were conducted on three representative collaborative tasks: multi-objective pursuit, dynamic obstacle avoidance, and resource-constrained missions. The empirical results demonstrate the effectiveness of the proposed approach. In the dynamic obstacle avoidance task, the method reduces single-step reasoning latency from 3.45 s to 1.53 s while maintaining a 98.8% task success rate, with an average logical-chain compression rate of 72.3%. In the multi-objective pursuit task, it attains a 97.6% success rate and reduces single-step reasoning latency from 3.21 s to 1.42 s. In the resource-constrained mission, it achieves a 95.1% success rate and decreases the average number of decision steps from 29.7 to 21.4, a marked improvement in planning efficiency. Ablation studies confirm the complementary effects of the logical compression and speculative decoding modules in balancing decision quality and response speed.
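As a quick sanity check on the reported figures, the relative gains follow from simple arithmetic (a trivial helper for illustration, not part of the paper's method):

```python
def reduction(before: float, after: float) -> float:
    """Relative reduction in percent, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)

# Latency, dynamic obstacle avoidance: 3.45 s -> 1.53 s  (~55.7% per step)
# Latency, multi-objective pursuit:    3.21 s -> 1.42 s  (~55.8% per step)
# Decision steps, resource-constrained: 29.7  -> 21.4    (~27.9% fewer steps)
```

Both latency reductions are thus roughly 56%, consistent with the roughly halved per-step reasoning time reported above.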
Conclusion By integrating a two-stage logical compression mechanism with a multi-model speculative decoding framework, the proposed method substantially reduces model inference latency without compromising output quality. It provides a robust and efficient technical solution for real-time collaborative decision-making in USV swarms operating in complex and dynamic maritime environments, demonstrating strong application potential and practical value.