BiCAG-Net: Marine Mammal Acoustic Signal Recognition for Vessel Environmental Perception

Abstract: [Objectives] Vessel navigation must balance environmental perception with ecological safety, so vocalizing biological targets such as marine mammals need to be recognized promptly to support collision-avoidance warning, avoidance of sensitive habitats, and eco-friendly maritime missions. [Methods] Three problems are addressed: the local spectral textures of marine mammal acoustic signals are easily masked by ship noise, a single representation can hardly capture both harmonic structure and temporal evolution, and existing multi-representation methods lack deep interaction between representations. To this end, a recognition method based on a Bidirectional Cross-Attention and Gated Fusion Network (BiCAG-Net) is proposed. The method builds two time-frequency representations, log-Mel spectrograms and the constant-Q transform (CQT), and uses a ResNet18 branch and a convolution-augmented Transformer (Conformer) branch to extract local spectral texture features and harmonic temporal features, respectively. Bidirectional cross-attention provides deep information interaction between the two branches. On this basis, a gated fusion decision module adaptively weights the branch outputs and the fused output to improve discriminative robustness against ship-noise backgrounds. [Results] On the test set at 20 dB, the proposed method achieves a macro-averaged F1 score (Macro-F1) of 96.88% and an accuracy of 97.96%; compared with the baseline methods, it obtains the highest Macro-F1 at all four signal-to-noise ratios (SNRs): 20, 10, 0, and −10 dB. [Conclusions] The proposed method can provide technical support for marine mammal acoustic signal recognition and risk early warning in the complex marine environments in which vessels operate.
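
The interaction and fusion steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the learned projections, multi-head structure, and the layer that would produce the gate scores are all omitted, and the toy features, the pairing of each branch with a representation, the residual connection, the averaged "fused" vector, and the helper names (`cross_attend`, `mean_pool`, `gated_fusion`) are assumptions made only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention: each query token attends over all
    context tokens. Learned Q/K/V projections are omitted (assumption)."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in context]
        w = softmax(scores)
        attended = [sum(wj * k[i] for wj, k in zip(w, context))
                    for i in range(d)]
        # Residual connection (assumed) keeps the branch's own features.
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def mean_pool(tokens):
    """Average a token sequence into a single feature vector."""
    n = len(tokens)
    return [sum(t[i] for t in tokens) / n for i in range(len(tokens[0]))]


def gated_fusion(branch_a, branch_b, fused, gate_scores):
    """Softmax-normalised gate over three candidate outputs."""
    g = softmax(gate_scores)
    out = [g[0] * a + g[1] * b + g[2] * f
           for a, b, f in zip(branch_a, branch_b, fused)]
    return out, g


# Toy stand-ins for the two branch outputs (hypothetical values; the
# branch-to-representation pairing is assumed from the abstract's wording).
mel_tokens = [[1.0, 0.0], [0.0, 1.0]]              # ResNet18 branch, log-Mel
cqt_tokens = [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]  # Conformer branch, CQT

# Bidirectional interaction: each branch queries the other.
mel_enh = cross_attend(mel_tokens, cqt_tokens)
cqt_enh = cross_attend(cqt_tokens, mel_tokens)

mel_vec = mean_pool(mel_enh)
cqt_vec = mean_pool(cqt_enh)
fused_vec = [(a + b) / 2 for a, b in zip(mel_vec, cqt_vec)]

# In a trained model the gate scores would come from a learned layer;
# fixed hypothetical scores are used here.
output, gate = gated_fusion(mel_vec, cqt_vec, fused_vec,
                            gate_scores=[0.2, 0.1, 0.7])
print("gate weights:", gate)
print("fused output:", output)
```

Because the gate weights are softmax-normalised, the final output is a convex combination of the two branch outputs and the fused output, which is one plausible reading of "adaptive weighting": when one representation is masked by noise, a learned gate could shift weight toward the other candidates.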

     
