Wavelet-Infused Convolution-Transformer for Efficient Segmentation in Medical Images
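Proposes the WaveCoformer model, which fuses wavelet-domain and spatial-domain features: a convolution module captures textural detail, while a Transformer learns global dependencies. The model achieves higher Dice scores than existing methods on the Synapse and adrenal tumor datasets, and its computational efficiency makes it suitable for resource-constrained environments.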
Recent medical image segmentation methods extract the characteristics of anatomical structures only from the spatial domain, ignoring the distinctive patterns present in the spectral representation. This study introduces the wavelet-infused convolutional Transformer (WaveCoformer), a computationally efficient segmentation architecture that fuses information from both the spatial and spectral domains of medical images to improve segmentation outcomes. A convolution module captures fine-grained textural features from the wavelet components. A Transformer block identifies the relevant activation maps within the volumes and then applies self-attention to learn long-range dependencies, capturing the global context of the target regions. A cross-attention mechanism combines the distinctive features acquired by both modules to produce a comprehensive and robust representation of the input data. WaveCoformer outperforms related state-of-the-art networks on the publicly available Synapse and adrenal tumor segmentation datasets, with mean Dice scores of 83.86% and 79%, respectively. Its computational efficiency and improved segmentation performance make the model suitable for deployment in resource-constrained environments that require rapid medical image analysis. The code is available at: https://github.com/duttapallabi2907/WaveCoformer.
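
To make two of the ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a single-level 2D Haar wavelet (the abstract does not name the wavelet family, and the model operates on volumes rather than 2D slices) and shows how an image splits into the sub-bands a convolution module could consume, together with the Dice score used for evaluation. The function names `haar_dwt2` and `dice_score` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def haar_dwt2(image: Sequence[Sequence[float]]) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Single-level orthonormal 2D Haar transform (assumed wavelet, for illustration).

    Returns (LL, LH, HL, HH): a low-pass approximation and three detail sub-bands.
    Sub-band naming conventions differ between libraries.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    ll: Matrix = []
    lh: Matrix = []
    hl: Matrix = []
    hh: Matrix = []
    for i in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, cols, 2):
            # One 2x2 block: a b / c d
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2.0)  # local average (coarse structure)
            lh_row.append((a + b - c - d) / 2.0)  # top row minus bottom row
            hl_row.append((a - b + c - d) / 2.0)  # left column minus right column
            hh_row.append((a - b - c + d) / 2.0)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for flattened binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    ll, lh, hl, hh = haar_dwt2([[1.0, 2.0], [3.0, 4.0]])
    print(ll, lh, hl, hh)
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Each sub-band has half the height and width of the input, so a convolution applied to the stacked sub-bands sees both coarse structure and fine detail at reduced spatial cost. That is one plausible reading of how a wavelet front end can help efficiency, not a claim about the published architecture.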