USCT-UNet: Rethinking the Semantic Gap in U-Net Network From U-Shaped Skip Connections With Multichannel Fusion Transformer

Medical image segmentation is a crucial component of computer-aided clinical diagnosis, with state-of-the-art models often being variants of U-Net. Despite their success, these models’ skip connections introduce an unnecessary semantic gap between the encoder and decoder, which hinders th...

Bibliographic Details
Main Authors:	Xiaoshan Xie (Author), Min Yang (Author)
Format:	Article
Published:	IEEE, 2024.
Subjects:	article

MARC


LEADER	00000 am a22000003u 4500
001	doaj_9148946f0dda44888f82aa0a754da99b
042			\|a dc
100	1	0	\|a Xiaoshan Xie \|e author
700	1	0	\|a Min Yang \|e author
245	0	0	\|a USCT-UNet: Rethinking the Semantic Gap in U-Net Network From U-Shaped Skip Connections With Multichannel Fusion Transformer
260			\|b IEEE, \|c 2024-01-01T00:00:00Z.
500			\|a 1534-4320
500			\|a 1558-0210
500			\|a 10.1109/TNSRE.2024.3468339
520			\|a Medical image segmentation is a crucial component of computer-aided clinical diagnosis, with state-of-the-art models often being variants of U-Net. Despite their success, these models’ skip connections introduce an unnecessary semantic gap between the encoder and decoder, which hinders their ability to achieve the high precision required for clinical applications. Awareness of this semantic gap and its detrimental influences has increased over time. However, a quantitative understanding of how this semantic gap compromises accuracy and reliability remains lacking, emphasizing the need for effective mitigation strategies. In response, we present the first quantitative evaluation of the semantic gap between corresponding layers of U-Net and identify two key characteristics: 1) The direct skip connection (DSC) exhibits a semantic gap that negatively impacts models’ performance; 2) The magnitude of the semantic gap varies across different layers. Based on these findings, we re-examine this issue through the lens of skip connections. We introduce a Multichannel Fusion Transformer (MCFT) and propose a novel USCT-UNet architecture, which incorporates U-shaped skip connections (USC) to replace DSC, allocates varying numbers of MCFT blocks based on the semantic gap magnitude at different layers, and employs a spatial channel cross-attention (SCCA) module to facilitate the fusion of features between the decoder and USC. We evaluate USCT-UNet on four challenging datasets, and the results demonstrate that it effectively eliminates the semantic gap. Compared to using DSC, our USC and SCCA strategies achieve maximum improvements of 4.79% in the Dice coefficient, 5.70% in mean intersection over union (MIoU), and 3.26 in Hausdorff distance.
546			\|a EN
690			\|a Medical image segmentation
690			\|a semantic gap
690			\|a U-shaped skip connection
690			\|a multichannel fusion transformer
690			\|a spatial channel cross-attention
690			\|a Medical technology
690			\|a R855-855.5
690			\|a Therapeutics. Pharmacology
690			\|a RM1-950
655	7		\|a article \|2 local
786	0		\|n IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol 32, Pp 3782-3793 (2024)
787	0		\|n https://ieeexplore.ieee.org/document/10695464/
787	0		\|n https://doaj.org/toc/1534-4320
787	0		\|n https://doaj.org/toc/1558-0210
856	4	1	\|u https://doaj.org/article/9148946f0dda44888f82aa0a754da99b \|z Connect to this object online.
