A Collaborative Learning Tracking Network for Remote Sensing Videos
提出一种协同学习跟踪网络,通过并行融合模块、空间通道协同注意力和几何约束重跟踪策略,提升复杂遥感场景中小目标和相似目标干扰下的检测与跟踪精度。
With the increasing accessibility of remote sensing videos, remote sensing tracking is gradually becoming a hot issue. However, accurately detecting and tracking in complex remote sensing scenes is still a challenge. In this article, we propose a collaborative learning tracking network for remote sensing videos, including a consistent receptive field parallel fusion module (CRFPF), dual-branch spatial-channel co-attention (DSCA) module, and geometric constraint retrack strategy (GCRT). Considering the small-size objects of remote sensing scenes are difficult for general forward networks to extract effective features, we propose a CRFPF-module to establish parallel branches with consistent receptive fields to separately extract from shallow to deep features and then fuse hierarchical features adaptively. Since the objects and their background are difficult to distinguish, the proposed DSCA-module uses the spatial-channel co-attention mechanism to collaboratively learn the relevant information, which enhances the saliency of the objects and regresses to precise bounding boxes. Considering the interference of similar objects, we designed a GCRT-strategy to judge whether there is a false detection through the estimated motion trajectory and then recover the correct object by weakening the feature response of interference. The experimental results and theoretical analysis on multiple datasets demonstrate our proposed method's feasibility and effectiveness. Code and net are available at https://github.com/Dawn5786/CoCRF-TrackNet.