NGP-Net: A Lightweight Growth Prediction Network for Pulmonary Nodules
Pub Date: 2026-01-20 | DOI: 10.1109/tmi.2026.3656184
Xinkai Tang, Zhiyao Luo, Feng Liu, Wencai Huang, Jiani Zou
Accurate monitoring of pulmonary nodule growth is crucial for preventing lung cancer progression and improving patient outcomes. Yet identifying high-risk nodules in computed tomography (CT) scans remains challenging due to subtle growth patterns, irregular follow-up intervals, and the limitations of current diagnostic tools. Existing methods often depend on single-timepoint analyses or assume fixed temporal intervals, constraining prediction to rigid scenarios. To address these limitations, we propose NGP-Net, a W-shaped architecture for dynamic nodule growth prediction from irregular longitudinal CT scans. NGP-Net introduces a Spatial-Temporal Encoding Module (STEM) that learns temporal dynamics directly from irregularly sampled data, and a dual-branch decoder that reconstructs high-fidelity nodule textures and shapes at arbitrary future timepoints. We further release a curated dataset of 378 chest CT scans from 103 patients with 226 pulmonary nodules, each followed across at least three timepoints spanning 2-64 months and annotated by seven radiologists. Extensive evaluation demonstrates that NGP-Net achieves state-of-the-art performance on this new dataset, obtaining the lowest mean squared error of 6.13 × 10⁻³ (overall) and 1.28 × 10⁻⁴ (nodule-specific), with substantial improvements in Dice similarity coefficient (10.55%), peak signal-to-noise ratio (0.29 dB), and structural similarity index (5.94%). NGP-Net's robust and precise predictions across varied growth scenarios highlight its potential to support radiologists in clinical decision-making. The source code and dataset are publicly available on GitHub and Kaggle.
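The abstract does not describe STEM's internals, so as a rough illustration of what "learning temporal dynamics from irregularly sampled data" can look like, the hypothetical sketch below conditions a feature volume on the elapsed time between scans. All names, shapes, and the FiLM-style modulation are assumptions for illustration, not NGP-Net's actual design.

```python
# Hypothetical sketch: conditioning volumetric features on irregular
# follow-up intervals so one encoder handles arbitrary time gaps.
# This is NOT the NGP-Net STEM, only an illustration of the idea.
import torch
import torch.nn as nn

class TimeGapConditioning(nn.Module):
    """Modulate features by an embedding of the elapsed time (months)
    between scans, FiLM-style scale-and-shift over all voxels."""
    def __init__(self, channels: int):
        super().__init__()
        self.time_mlp = nn.Sequential(
            nn.Linear(1, 64), nn.SiLU(), nn.Linear(64, 2 * channels)
        )

    def forward(self, feats: torch.Tensor, dt_months: torch.Tensor):
        # feats: (B, C, D, H, W); dt_months: (B, 1) elapsed time
        scale, shift = self.time_mlp(dt_months).chunk(2, dim=-1)
        scale = scale[..., None, None, None]  # broadcast over D, H, W
        shift = shift[..., None, None, None]
        return feats * (1 + scale) + shift

feats = torch.randn(2, 32, 16, 64, 64)       # toy nodule feature volume
dt = torch.tensor([[3.0], [27.0]])           # irregular gaps in months
print(TimeGapConditioning(32)(feats, dt).shape)  # (2, 32, 16, 64, 64)
```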
{"title":"NGP-Net: a Lightweight Growth Prediction Network for Pulmonary Nodules.","authors":"Xinkai Tang,Zhiyao Luo,Feng Liu,Wencai Huang,Jiani Zou","doi":"10.1109/tmi.2026.3656184","DOIUrl":"https://doi.org/10.1109/tmi.2026.3656184","url":null,"abstract":"Accurate monitoring of pulmonary nodules' growth is crucial for preventing lung cancer progression and improving patient outcomes. Yet, identifying high-risk nodules in computed tomography (CT) scans remains challenging due to subtle growth patterns, irregular follow-up intervals, and the limitations of current diagnostic tools. Existing methods often depend on single-timepoint analyses or assume fixed temporal intervals, constraining prediction to rigid scenarios. To address these limitations, we propose NGP-Net, a W-shaped architecture for dynamic nodule growth prediction from irregular longitudinal CT scans. NGP-Net introduces a Spatial-Temporal Encoding Module (STEM) that learns temporal dynamics directly from irregularly sampled data, and a dual-branch decoder that reconstructs high-fidelity nodule textures and shapes at arbitrary future timepoints. We further release a curated dataset of 378 chest CT scans from 103 patients with 226 pulmonary nodules, each followed across at least three timepoints spanning 2-64 months and annotated by seven radiologists. Extensive evaluation demonstrates that NGP-Net achieves state-of-the-art performance on this new dataset, obtaining the lowest mean square error of 6.13 × 10-3 (overall) and 1.28 × 10-4 (nodule-specific), with substantial improvements in Dice similarity coefficient (10.55%), peak signal-to-noise ratio (0.29 dB), and structural similarity index (5.94%). NGP-Net's robust and precise predictions across varied growth scenarios highlight its potential to support radiologists in clinical decision-making. The source code and dataset are publicly available at GitHub and Kaggle.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"6 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146005208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

ADHD Classification with GCN via Joint Feature Learning among Nodes and Edges
Pub Date: 2026-01-20 | DOI: 10.1109/tmi.2026.3656430
Xiaotong Wang, Yibin Tang, Yuan Gao, Xiaojing Meng, Ying Chen, Aimin Jiang
Brain functional connectivity networks (FCNs) derived from resting-state functional magnetic resonance imaging (rs-fMRI) data have been widely used to identify altered brain network patterns in attention-deficit/hyperactivity disorder (ADHD). Current graph neural network (GNN) approaches using FCNs predominantly emphasize node features while underutilizing edge information. Moreover, these GNN-based methods inadequately represent dynamic interdependencies among evolving node features across network layers, limiting their diagnostic performance. We present a graph convolutional network based on joint feature learning between nodes and edges (JNEL-GCN) that integrates neuroimaging features for ADHD classification and biomarker discovery. Our framework constructs dual graph representations: (1) a node graph using amplitude of low-frequency fluctuation (ALFF) measures across multiple frequency bands as nodal features, along with functional connectivity (FC) and node feature relationship matrices as edge attributes; (2) an edge graph derived through line graph theory, enabling the interchange of node and edge roles. Leveraging this dual-graph design, our model implements an alternating feature update mechanism with optimized graph convolution operations, facilitating hierarchical learning of node-edge relationships across network layers. Extensive experiments demonstrate strong performance, achieving 97.3% accuracy on the ADHD-200 dataset and 97.1% on ABIDE-I, significantly outperforming current benchmarks. Meanwhile, gradient-based biomarker analysis identifies significant regions in the bilateral limbic and default mode networks associated with ADHD, aligning with findings in the existing literature. This dual-graph approach thus advances neuroimaging-based diagnosis by comprehensively capturing dynamic network interactions while providing interpretable biomarkers for clinical neuroscience applications.

B2Q-Net: Bidirectional Branch Query Network for Surgical Phase Recognition
Pub Date: 2026-01-16 | DOI: 10.1109/tmi.2026.3654795
Wenjie Zhang, Zhiheng Li, Yue Bi, Xiao Jia, Ran Song, Yipeng Zhang, Wei Zhang
Surgical phase recognition (SPR) is essential for surgical workflow analysis and provides immediate guidance during procedures. Existing methods aggregate frame-level information into a global representation and treat the task as frame-wise classification. However, this pipeline lacks a feedback mechanism for integrating historical information into local temporal modeling. To address this limitation, we propose the Bidirectional Branch Query Network (B2Q-Net), which reformulates SPR as a bidirectional query between phase-level and frame-level features. B2Q-Net incorporates historical information when initializing phase queries, enabling bidirectional information flow during the iterative refinement of phase- and frame-level feature maps. Furthermore, we introduce a dual-scale selector (DSS) to generate high-quality phase queries for the current video clip. These phase queries retrieve historical information from the proposed state space query (SSQ) module, which uses learnable tokens as a historical state space to preserve past information. Extensive evaluations on three datasets demonstrate that B2Q-Net consistently outperforms state-of-the-art methods in recognition accuracy while achieving an inference speed of 106 fps. The B2Q-Net code is available at https://github.com/vsislab/B2Q-Net.

A High-Performance Self-Collimation SPECT for Small Animal Imaging
Pub Date: 2026-01-16 | DOI: 10.1109/tmi.2026.3654599
Debin Zhang, Zhenlei Lyu, Tianpeng Xu, Peng Fan, Zerui Yu, Qiqi Ye, Yifan Hu, Jing Wu, Qingyang Wei, Xin Zhang, Qianqian Gan, Yang Xu, Li Wang, Rutao Yao, Min-Fu Yang, Zuo-Xiang He, Yaqiang Liu, Tianyu Ma
Stemming from our novel single-photon imaging concept of detector self-collimation, which leverages the detectors themselves as collimators to overcome the inherent resolution-sensitivity trade-off of conventional SPECT, this study presents the design and evaluation of the first full-ring self-collimation SPECT (SC-SPECT) scanner for small animal imaging. The system features four concentric detector rings and two interchangeable high-aperture-ratio tungsten collimator rings optimized for high-resolution (HR) and general-purpose (GP) imaging applications. The detector rings contain 480, 720, 960, and 1,200 evenly distributed GAGG(Ce) scintillators, each measuring 0.84 mm (tangential) × 6 mm (radial) × 20 mm (axial) and separated by 0.84-mm gaps to enable effective photon collimation. The inner detector rings and the collimator ring collectively provide collimation for photons reaching the subsequent outer rings. Dual-end SiPM readouts facilitate axial depth-of-interaction measurements. Phantom and mouse studies were performed to assess the system's resolution, sensitivity, and field-of-view volume, and SC-SPECT demonstrates generally superior performance compared with state-of-the-art small-animal SPECT systems. Mouse bone images using 99mTc-MDP show CT-like resolution, clearly delineating detailed tracer uptake distributions within small structures such as mouse paws and skulls, indicating a significant technological advancement in small-animal SPECT imaging.

Learning Modality-Aware Representations: Adaptive Group-wise Interaction Network for Multimodal MRI Synthesis
Pub Date: 2026-01-15 | DOI: 10.1109/tmi.2026.3654249
Tao Song, Yicheng Wu, Minhao Hu, Xiangde Luo, Linda Wei, Guotai Wang, Yi Guo, Feng Xu, Shaoting Zhang
Multimodal MR image synthesis aims to generate missing modality images by effectively fusing and mapping from a subset of available MRI modalities. Most existing methods adopt an image-to-image translation paradigm, treating multiple modalities as input channels. However, these approaches often yield suboptimal results due to the inherent difficulty in achieving precise feature- or semantic-level alignment across modalities. To address these challenges, we propose an Adaptive Group-wise Interaction Network (AGI-Net) that explicitly models both inter-modality and intra-modality relationships for multimodal MR image synthesis. Specifically, feature channels are first partitioned into predefined groups, after which an adaptive rolling mechanism is applied to conventional convolutional kernels to better capture feature and semantic correspondences between different modalities. In parallel, a cross-group attention module is introduced to enable effective feature fusion across groups, thereby enhancing the network's representational capacity. We validate the proposed AGI-Net on the publicly available IXI and BraTS2023 datasets. Experimental results demonstrate that AGI-Net achieves state-of-the-art performance in multimodal MR image synthesis tasks, confirming the effectiveness of its modality-aware interaction design. We release the relevant code at: https://github.com/zunzhumu/Adaptive-Group-wise-Interaction-Network-for-Multimodal-MRI-Synthesis.git.
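The paper's rolling mechanism is adaptive (learned); the sketch below replaces it with fixed per-group spatial offsets purely to illustrate the channel-grouping-plus-rolling idea. Function name, offsets, and shapes are illustrative assumptions, not the AGI-Net implementation.

```python
# Illustrative channel grouping with per-group rolling, a stand-in
# for AGI-Net's learned adaptive rolling mechanism.
import torch

def grouped_roll(x: torch.Tensor, groups: int) -> torch.Tensor:
    """x: (B, C, H, W). Split channels into groups and spatially roll
    each group by a group-dependent offset, so a shared convolution
    afterwards sees shifted cross-modal alignments."""
    chunks = x.chunk(groups, dim=1)
    rolled = [torch.roll(c, shifts=(i, i), dims=(2, 3))
              for i, c in enumerate(chunks)]
    return torch.cat(rolled, dim=1)

x = torch.randn(2, 8, 32, 32)            # toy multimodal feature map
print(grouped_roll(x, groups=4).shape)   # torch.Size([2, 8, 32, 32])
```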
{"title":"Learning Modality-Aware Representations: Adaptive Group-wise Interaction Network for Multimodal MRI Synthesis.","authors":"Tao Song,Yicheng Wu,Minhao Hu,Xiangde Luo,Linda Wei,Guotai Wang,Yi Guo,Feng Xu,Shaoting Zhang","doi":"10.1109/tmi.2026.3654249","DOIUrl":"https://doi.org/10.1109/tmi.2026.3654249","url":null,"abstract":"Multimodal MR image synthesis aims to generate missing modality images by effectively fusing and mapping from a subset of available MRI modalities. Most existing methods adopt an image-to-image translation paradigm, treating multiple modalities as input channels. However, these approaches often yield sub-optimal results due to the inherent difficulty in achieving precise feature-or semantic-level alignment across modalities. To address these challenges, we propose an Adaptive Group-wise Interaction Network (AGI-Net) that explicitly models both inter-modality and intra-modality relationships for multimodal MR image synthesis. Specifically, feature channels are first partitioned into predefined groups, after which an adaptive rolling mechanism is applied to conventional convolutional kernels to better capture feature and semantic correspondences between different modalities. In parallel, a cross-group attention module is introduced to enable effective feature fusion across groups, thereby enhancing the network's representational capacity. We validate the proposed AGI-Net on the publicly available IXI and BraTS2023 datasets. Experimental results demonstrate that AGI-Net achieves state-of-the-art performance in multimodal MR image synthesis tasks, confirming the effectiveness of its modality-aware interaction design. We release the relevant code at: https://github.com/zunzhumu/Adaptive-Group-wise-Interaction-Network-for-Multimodal-MRI-Synthesis.git.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"8 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145971767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Domain Adaptive Multiple Instance Self-Training for Intraoperative Anomaly Detection
Pub Date: 2026-01-15 | DOI: 10.1109/tmi.2026.3654087
Ziang Chen, Yiming Ding, Jianchang Zhao, Bo Yi, Jianguo Wei
Intraoperative anomalies cause deviations from the ideal surgical workflow, heightening the risk of consequential errors and complications. Their reliable recognition has traditionally depended on continuous surgeon monitoring, yet automated anomaly detection systems are now indispensable for the safe advancement of assistive and autonomous surgery. However, existing approaches struggle with domain shifts across surgical platforms and unpredictable scenarios in deformable surgical environments. To address this, we propose DA-MIST, a Domain Adaptive Multiple Instance Self-Training framework for weakly supervised anomaly detection. DA-MIST adopts a two-stage training strategy that combines multiple instance learning with self-training, enhanced by a scene-decoupled memory mechanism that disentangles state-irrelevant scene variations from the memory banks, preserving only state-discriminative features for robust anomaly identification. Additionally, a state-aware dual-branch attention module integrates Gaussian dynamic attention and global self-attention for effective temporal reasoning. Evaluated on our newly compiled large-scale endoscopic video dataset encompassing seven representative anomalies, DA-MIST demonstrates strong adaptability across heterogeneous surgical domains, consistently reducing false alarms and enhancing anomaly localization accuracy. Our code and dataset will be available at: https://github.com/iamziang/DA-MIST.

Energy-Threshold Bias Calculator: A Physics-Model Based Adaptive Correction Scheme for Photon-Counting CT
Pub Date: 2026-01-15 | DOI: 10.1109/tmi.2026.3654612
Yuting Chen, Yuxiang Xing, Li Zhang, Zhi Deng, Hewei Gao
{"title":"Energy-Threshold Bias Calculator: A Physics-Model Based Adaptive Correction Scheme for Photon-Counting CT","authors":"Yuting Chen, Yuxiang Xing, Li Zhang, Zhi Deng, Hewei Gao","doi":"10.1109/tmi.2026.3654612","DOIUrl":"https://doi.org/10.1109/tmi.2026.3654612","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"84 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145972015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Medical Microwave Imaging Using Physics-Guided Deep Learning Part 2: The Inverse Solver
Pub Date: 2026-01-14 | DOI: 10.1109/tmi.2026.3653974
L. Guo, A. Bialkowski, A. Abbosh
{"title":"Medical Microwave Imaging Using Physics-Guided Deep Learning Part 2: The Inverse Solver","authors":"L. Guo, A. Bialkowski, A. Abbosh","doi":"10.1109/tmi.2026.3653974","DOIUrl":"https://doi.org/10.1109/tmi.2026.3653974","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"141 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145972159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

UltraMamba: Mamba-based Multimodal Ultrasound Image Adaptive Fusion for Breast Lesion Segmentation
Pub Date: 2026-01-13 | DOI: 10.1109/tmi.2026.3653779
Jiahui Huang, Jiaxin Huang, Mingdu Zhang, Qiong Wang, Xiao-Qing Pei, Ying Hu, Hao Chen, Yan Pang
Multimodal ultrasound imaging, combining B-mode ultrasound, shear wave velocity, and shear wave time, is crucial for diagnosing and treating breast lesions, providing insights into lesion characteristics and tissue properties. However, challenges arise from intermodal feature misalignment and from attention shifts caused by varied capture methods and an overemphasis on vibrant color data. To tackle these issues, we introduce two innovations: a novel segmentation framework and a comprehensive dataset. The UltraMamba framework uses bidirectional alignment between modalities and enhances region-specific information to improve breast lesion segmentation accuracy. Key components include the Cross-Modal Knowledge Interaction module for robust information exchange and the Region-Aware Feature Excitation module for focusing on relevant features. We also present the BreLS dataset, the first two-dimensional multimodal ultrasound breast lesion dataset, with paired images from 506 cases, serving as a valuable resource for analysis. UltraMamba shows strong performance on the BreLS dataset, achieving a Dice similarity coefficient of 72.16% and an HD95 of 42.02 mm, improvements of 2.59% in DSC and 6.78 mm in HD95 over the second-best framework, MMCA-NET. These results highlight UltraMamba's potential to enhance segmentation accuracy in clinical settings, facilitating precise treatment planning and, ultimately, improved outcomes. Code: https://github.com/deepang-ai/UltraMamba.
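For reference, the Dice similarity coefficient quoted above is the standard region-overlap metric; a minimal implementation for binary masks follows (not taken from the UltraMamba repository).

```python
# Reference Dice similarity coefficient for binary segmentation masks.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|), with eps guarding empty masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

a = np.zeros((64, 64)); a[16:48, 16:48] = 1   # toy lesion masks
b = np.zeros((64, 64)); b[20:52, 20:52] = 1
print(f"DSC = {dice(a, b):.4f}")              # ~0.7656 for this overlap
```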
{"title":"UltraMamba: Mamba-based Multimodal Ultrasound Image Adaptive Fusion for Breast Lesion Segmentation.","authors":"Jiahui Huang,Jiaxin Huang,Mingdu Zhang,Qiong Wang,Xiao-Qing Pei,Ying Hu,Hao Chen,Yan Pang","doi":"10.1109/tmi.2026.3653779","DOIUrl":"https://doi.org/10.1109/tmi.2026.3653779","url":null,"abstract":"Multimodal ultrasound imaging, combining B-mode ultrasound, shear wave velocity, and shear wave time, is crucial for diagnosing and treating breast lesions, providing insights into lesion characteristics and tissue properties. However, challenges arise from intermodal feature misalignment and attention shifts due to varied capture methods and an overemphasis on vibrant color data. To tackle these issues, we introduce two innovations: a novel segmentation framework and a comprehensive dataset. The UltraMamba framework utilizes bidirectional alignment between modalities and enhances region-specific information to improve breast lesion segmentation accuracy. Key components include the Cross-Modal Knowledge Interaction module for robust information exchange and the Region-Aware Feature Excitation module to focus on relevant features. We also present the BreLS dataset, the first two-dimensional multimodal ultrasound breast lesion dataset, with paired images from 506 cases, serving as a valuable resource for analysis. UltraMamba shows strong performance on the BreLS dataset, achieving a Dice Similarity Coefficient of 72.16% and an HD95 of 42.02 mm, reflecting improvements of 2.59% in DSC and a 6.78 mm reduction in HD95 compared to the second-best framework, MMCA-NET. These results highlight UltraMamba's potential to enhance segmentation accuracy in clinical settings, facilitating precise treatment planning and, ultimately, leading to improved outcomes. Code: https://github.com/deepang-ai/UltraMamba.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"7 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145961412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}