PISCO: Self-supervised k-space regularization for improved neural implicit k-space representations of dynamic MRI
Pub Date: 2025-11-29 | DOI: 10.1016/j.media.2025.103890
Veronika Spieker, Hannah Eichhorn, Wenqi Huang, Jonathan K. Stelter, Tabita Catalan, Rickmer F. Braren, Daniel Rueckert, Francisco Sahli Costabal, Kerstin Hammernik, Dimitrios C. Karampinos, Claudia Prieto, Julia A. Schnabel
Neural implicit k-space representations (NIK) have shown promising results for dynamic magnetic resonance imaging (MRI) at high temporal resolutions. Yet, reducing acquisition time, and thereby the available training data, results in severe performance drops due to overfitting. To address this, we introduce a novel self-supervised k-space loss function, L_PISCO, applicable for regularization of NIK-based reconstructions. The proposed loss function is based on the concept of parallel imaging-inspired self-consistency (PISCO), enforcing a consistent global k-space neighborhood relationship without requiring additional data. Quantitative and qualitative evaluations on static and dynamic MR reconstructions show that integrating PISCO significantly improves NIK representations, making it a competitive dynamic reconstruction method without constraining the temporal resolution. Particularly at high acceleration factors (R ≥ 50), NIK with PISCO avoids the temporal oversmoothing of state-of-the-art methods and achieves superior spatio-temporal reconstruction quality. Furthermore, an extensive analysis of the loss assumptions and stability shows PISCO’s potential as a versatile self-supervised k-space loss function for further applications and architectures. Code is available at: https://github.com/compai-lab/2025-pisco-spieker
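The self-consistency idea can be illustrated with a short sketch: linear interpolation weights that map a k-space neighborhood to its target point should agree regardless of which subset of points they are estimated from. The PyTorch snippet below is a minimal sketch of that idea only; the tensor shapes, subset strategy, and least-squares formulation are assumptions for illustration, not the authors' L_PISCO implementation.

```python
# Minimal sketch of a parallel-imaging-inspired self-consistency loss on
# k-space values predicted at arbitrary coordinates. Shapes, subset strategy,
# and the least-squares formulation are illustrative assumptions.
import torch

def self_consistency_loss(neighbors, targets, n_subsets=4):
    """neighbors: (N, P) complex k-space values of P neighborhood points per equation,
    targets: (N, C) complex values at the corresponding target locations.
    Splits the N equations into subsets, solves each for interpolation weights W
    with least squares, and penalizes disagreement of W across subsets."""
    weights = []
    for A, b in zip(neighbors.chunk(n_subsets, dim=0), targets.chunk(n_subsets, dim=0)):
        # W minimizes ||A @ W - b||^2 for this subset
        W = torch.linalg.lstsq(A, b).solution      # (P, C)
        weights.append(W)
    W_mean = torch.stack(weights).mean(dim=0)
    return sum(torch.mean(torch.abs(W - W_mean) ** 2) for W in weights) / n_subsets

# toy usage with random complex data
N, P, C = 256, 9, 8
A = torch.randn(N, P, dtype=torch.complex64)
b = torch.randn(N, C, dtype=torch.complex64)
loss = self_consistency_loss(A, b)
```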
{"title":"PISCO: Self-supervised k-space regularization for improved neural implicit k-space representations of dynamic MRI","authors":"Veronika Spieker , Hannah Eichhorn , Wenqi Huang , Jonathan K. Stelter , Tabita Catalan , Rickmer F. Braren , Daniel Rueckert , Francisco Sahli Costabal , Kerstin Hammernik , Dimitrios C. Karampinos , Claudia Prieto , Julia A. Schnabel","doi":"10.1016/j.media.2025.103890","DOIUrl":"10.1016/j.media.2025.103890","url":null,"abstract":"<div><div>Neural implicit k-space representations (NIK) have shown promising results for dynamic magnetic resonance imaging (MRI) at high temporal resolutions. Yet, reducing acquisition time, and thereby available training data, results in severe performance drops due to overfitting. To address this, we introduce a novel self-supervised k-space loss function <span><math><msub><mi>L</mi><mtext>PISCO</mtext></msub></math></span>, applicable for regularization of NIK-based reconstructions. The proposed loss function is based on the concept of parallel imaging-inspired self-consistency (PISCO), enforcing a consistent global k-space neighborhood relationship without requiring additional data. Quantitative and qualitative evaluations on static and dynamic MR reconstructions show that integrating PISCO significantly improves NIK representations, making it a competitive dynamic reconstruction method without constraining the temporal resolution. Particularly at high acceleration factors (R ≥ 50), NIK with PISCO can avoid temporal oversmoothing of state-of-the-art methods and achieves superior spatio-temporal reconstruction quality. Furthermore, an extensive analysis of the loss assumptions and stability shows PISCO’s potential as versatile self-supervised k-space loss function for further applications and architectures. Code is available at: <span><span>https://github.com/compai-lab/2025-pisco-spieker</span><svg><path></path></svg></span></div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"109 ","pages":"Article 103890"},"PeriodicalIF":11.8,"publicationDate":"2025-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145619566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GEM-pRF: GPU-empowered mapping of population receptive fields for large-scale fMRI analysis
Pub Date: 2025-11-28 | DOI: 10.1016/j.media.2025.103891
Siddharth Mittal, Michael Woletz, David Linhardt, Christian Windischberger
Population receptive field (pRF) mapping is a fundamental technique for understanding the retinotopic organisation of the human visual system. Since its introduction in 2008, however, its scalability has been severely hindered by the computational bottleneck of iterative parameter refinement. Current state-of-the-art implementations either sacrifice precision for speed or rely on slow iterative parameter updates, limiting their applicability to large-scale datasets. Here, we present a novel mathematical reformulation of the General Linear Model (GLM), wrapped in a GPU-Empowered Mapping of population Receptive Fields (GEM-pRF) software implementation. By orthogonalizing the design matrix, our approach enables the direct and fast computation of the objective function’s derivatives, eliminating the iterative refinement process. This dramatically accelerates pRF estimation while maintaining high accuracy. Validation using empirical and simulated data confirms GEM-pRF’s accuracy, and benchmarking against established tools demonstrates a reduction in computation time of almost two orders of magnitude. With its modular and extensible design, GEM-pRF provides a critical advancement for large-scale fMRI retinotopic mapping. Furthermore, our reformulated GLM approach, in combination with a GPU-based implementation, offers a broadly applicable solution that may extend beyond visual neuroscience, accelerating computational modelling across various domains in neuroimaging and beyond.
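As a rough illustration of why an orthogonalized design matrix removes per-voxel iteration, the NumPy sketch below evaluates a whole grid of candidate pRF timecourses in closed form: confounds are projected out once via QR, and the best candidate per voxel is the one with maximal squared correlation. Variable names, the grid search, and the goodness-of-fit criterion are illustrative assumptions, not the GEM-pRF derivative reformulation or its GPU implementation.

```python
# Closed-form GLM evaluation over a grid of candidate pRF predictions,
# illustrating grid-based fitting without per-voxel iterative refinement.
import numpy as np

def best_prf_by_grid(bold, predictions, confounds):
    """bold: (T, V) voxel time series; predictions: (T, K) candidate pRF
    timecourses; confounds: (T, Q) nuisance regressors (e.g., drift terms)."""
    # Project out confounds once using an orthonormal basis (reduced QR).
    Qc, _ = np.linalg.qr(confounds)
    resid_bold = bold - Qc @ (Qc.T @ bold)
    resid_pred = predictions - Qc @ (Qc.T @ predictions)
    # Normalize; the squared correlation then acts as the goodness-of-fit score.
    resid_pred /= np.linalg.norm(resid_pred, axis=0, keepdims=True)
    resid_bold /= np.linalg.norm(resid_bold, axis=0, keepdims=True)
    r = resid_pred.T @ resid_bold                    # (K, V) correlations
    best = np.argmax(r ** 2, axis=0)                 # best candidate index per voxel
    return best, (r ** 2)[best, np.arange(bold.shape[1])]

# toy usage with random data
rng = np.random.default_rng(0)
idx, score = best_prf_by_grid(rng.standard_normal((200, 1000)),
                              rng.standard_normal((200, 64)),
                              rng.standard_normal((200, 3)))
```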
{"title":"GEM-pRF: GPU-empowered mapping of population receptive fields for large-scale fMRI analysis","authors":"Siddharth Mittal, Michael Woletz, David Linhardt, Christian Windischberger","doi":"10.1016/j.media.2025.103891","DOIUrl":"10.1016/j.media.2025.103891","url":null,"abstract":"<div><div>Population receptive field (pRF) mapping is a fundamental technique for understanding retinotopic organisation of the human visual system. Since its introduction in 2008, however, its scalability has been severely hindered by the computational bottleneck of iterative parameter refinement. Current state-of-the-art implementations either sacrifice precision for speed or rely on slow iterative parameter updates, limiting their applicability to large-scale datasets. Here, we present a novel mathematical reformulation of the General Linear Model (GLM), wrapped in a GPU-Empowered Mapping of population Receptive Fields (GEM-pRF) software implementation. By orthogonalizing the design matrix, our approach enables the direct and fast computation of the objective function’s derivatives, which are used to eliminate the iterative refinement process. This approach dramatically accelerates pRF estimation with high accuracy. Validation using empirical and simulated data confirms GEM-pRF’s accuracy, and benchmarking against established tools demonstrates a reduction in computation time of almost two orders of magnitude. With its modular and extensible design, GEM-pRF provides a critical advancement for large-scale fMRI retinotopic mapping. Furthermore, our reformulated GLM approach in combination with GPU-based implementation offers a broadly applicable solution that may extend beyond visual neuroscience, accelerating computational modelling across various domains in neuroimaging and beyond.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"109 ","pages":"Article 103891"},"PeriodicalIF":11.8,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145611717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial transcriptomics expression prediction from histopathology based on cross-modal mask reconstruction and contrastive learning
Pub Date: 2025-11-25 | DOI: 10.1016/j.media.2025.103889
Junzhuo Liu, Markus Eckstein, Zhixiang Wang, Friedrich Feuerhake, Dorit Merhof
Spatial transcriptomics is a technology that captures gene expression at distinct spatial locations. It is widely used in tumor microenvironment analysis and molecular profiling of histopathology, providing valuable insights into spatially resolved gene expression and the clinical diagnosis of cancer. Due to the high cost of data acquisition, large-scale spatial transcriptomics data remain challenging to obtain. In this study, we develop a contrastive learning-based deep learning method to predict spatially resolved gene expression from whole-slide images (WSIs). Unlike existing end-to-end prediction frameworks, our method leverages multi-modal contrastive learning to establish a correspondence between histopathological morphology and spatial gene expression in the feature space. By computing cross-modal feature similarity, our method generates spatially resolved gene expression directly from WSIs. Furthermore, to enhance the standard contrastive learning paradigm, a cross-modal masked reconstruction is designed as a pretext task, enabling feature-level fusion between modalities. Notably, our method does not rely on large-scale pretraining datasets or abstract semantic representations from either modality, making it particularly effective for scenarios with limited spatial transcriptomics data. Evaluation across six different disease datasets demonstrates that, compared to existing studies, our method improves the Pearson Correlation Coefficient (PCC) for the prediction of highly expressed genes, highly variable genes, and marker genes by 6.27%, 6.11%, and 11.26%, respectively. Further analysis indicates that our method preserves gene-gene correlations and remains applicable to datasets with limited samples. Additionally, our method exhibits potential in cancer tissue localization based on biomarker expression. The code repository for this work is available at https://github.com/ngfufdrdh/CMRCNet.
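To make the cross-modal similarity idea concrete, the PyTorch sketch below aligns precomputed patch features and gene expression vectors with a CLIP-style contrastive loss and then predicts expression for a query patch as a similarity-weighted average over reference spots. The encoder sizes, temperature, and retrieval-style readout are illustrative assumptions and do not reproduce the CMRCNet architecture or its masked-reconstruction pretext task.

```python
# Cross-modal contrastive alignment of patch features and gene expression,
# with a simple similarity-based expression readout (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    def __init__(self, img_dim=768, gene_dim=2000, emb_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)    # patch features -> shared space
        self.gene_proj = nn.Linear(gene_dim, emb_dim)  # expression vectors -> shared space
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def encode_image(self, x):
        return F.normalize(self.img_proj(x), dim=-1)

    def encode_genes(self, g):
        return F.normalize(self.gene_proj(g), dim=-1)

def contrastive_loss(zi, zg, logit_scale):
    # symmetric InfoNCE over matched (patch, spot) pairs within a batch
    logits = logit_scale.exp() * zi @ zg.t()
    labels = torch.arange(zi.size(0), device=zi.device)
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))

@torch.no_grad()
def predict_expression(model, query_img, ref_img, ref_expr, k=50):
    # expression of a query spot = similarity-weighted average of the k nearest reference spots
    zq, zr = model.encode_image(query_img), model.encode_image(ref_img)
    top, idx = (zq @ zr.t()).topk(k, dim=-1)
    return torch.einsum('qk,qkg->qg', top.softmax(dim=-1), ref_expr[idx])

# toy usage with random features
model = DualEncoder()
loss = contrastive_loss(model.encode_image(torch.randn(32, 768)),
                        model.encode_genes(torch.randn(32, 2000)), model.logit_scale)
pred = predict_expression(model, torch.randn(4, 768), torch.randn(500, 768), torch.randn(500, 2000))
```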
{"title":"Spatial transcriptomics expression prediction from histopathology based on cross-modal mask reconstruction and contrastive learning","authors":"Junzhuo Liu , Markus Eckstein , Zhixiang Wang , Friedrich Feuerhake , Dorit Merhof","doi":"10.1016/j.media.2025.103889","DOIUrl":"10.1016/j.media.2025.103889","url":null,"abstract":"<div><div>Spatial transcriptomics is a technology that captures gene expression at different spatial locations, widely used in tumor microenvironment analysis and molecular profiling of histopathology, providing valuable insights into resolving gene expression and clinical diagnosis of cancer. Due to the high cost of data acquisition, large-scale spatial transcriptomics data remain challenging to obtain. In this study, we develop a contrastive learning-based deep learning method to predict spatially resolved gene expression from the whole-slide images (WSIs). Unlike existing end-to-end prediction frameworks, our method leverages multi-modal contrastive learning to establish a correspondence between histopathological morphology and spatial gene expression in the feature space. By computing cross-modal feature similarity, our method generates spatially resolved gene expression directly from WSIs. Furthermore, to enhance the standard contrastive learning paradigm, a cross-modal masked reconstruction is designed as a pretext task, enabling feature-level fusion between modalities. Notably, our method does not rely on large-scale pretraining datasets or abstract semantic representations from either modality, making it particularly effective for scenarios with limited spatial transcriptomics data. Evaluation across six different disease datasets demonstrates that, compared to existing studies, our method improves Pearson Correlation Coefficient (PCC) in the prediction of highly expressed genes, highly variable genes, and marker genes by 6.27 %, 6.11 %, and 11.26 % respectively. Further analysis indicates that our method preserves gene-gene correlations and applies to datasets with limited samples. Additionally, our method exhibits potential in cancer tissue localization based on biomarker expression. The code repository for this work is available at <span><span>https://github.com/ngfufdrdh/CMRCNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103889"},"PeriodicalIF":11.8,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145592833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised learning of spatially varying regularization for diffeomorphic image registration
Pub Date: 2025-11-24 | DOI: 10.1016/j.media.2025.103887
Junyu Chen, Shuwen Wei, Yihao Liu, Zhangxing Bian, Yufan He, Aaron Carass, Harrison Bai, Yong Du
Spatially varying regularization accommodates the deformation variations that may be necessary for different anatomical regions during deformable image registration. Historically, optimization-based registration models have harnessed spatially varying regularization to address anatomical subtleties. However, most modern deep learning-based models tend to gravitate towards spatially invariant regularization, wherein a homogeneous regularization strength is applied across the entire image, potentially disregarding localized variations. In this paper, we propose a hierarchical probabilistic model that integrates a prior distribution on the deformation regularization strength, enabling the end-to-end learning of a spatially varying deformation regularizer directly from the data. The proposed method is straightforward to implement and easily integrates with various registration network architectures. Additionally, automatic tuning of hyperparameters is achieved through Bayesian optimization, allowing efficient identification of optimal hyperparameters for any given registration task. Comprehensive evaluations on publicly available datasets demonstrate that the proposed method significantly improves registration performance and enhances the interpretability of deep learning-based registration, all while maintaining smooth deformations. Our code is freely available at http://bit.ly/3BrXGxz.
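A spatially varying regularizer can be sketched as a per-voxel weight map that modulates a diffusion (gradient) penalty on the displacement field, as in the PyTorch snippet below. Here the weight map is a free tensor and the small penalty that keeps it from collapsing is an ad hoc stand-in; in the paper, the regularization strength is instead governed by a learned prior within a hierarchical probabilistic model, so treat this purely as an illustrative assumption.

```python
# Spatially varying diffusion regularizer: a per-voxel weight map scales the
# squared finite-difference penalty on a 3D displacement field (sketch only).
import torch

def weighted_diffusion_loss(disp, log_weight):
    """disp: (B, 3, D, H, W) displacement field;
    log_weight: (B, 1, D, H, W) log of the local regularization strength."""
    w = log_weight.exp()
    dz = (disp[:, :, 1:] - disp[:, :, :-1]) ** 2
    dy = (disp[:, :, :, 1:] - disp[:, :, :, :-1]) ** 2
    dx = (disp[:, :, :, :, 1:] - disp[:, :, :, :, :-1]) ** 2
    loss = ((w[:, :, 1:] * dz).mean()
            + (w[:, :, :, 1:] * dy).mean()
            + (w[:, :, :, :, 1:] * dx).mean())
    # ad hoc term discouraging the weight map from collapsing to zero everywhere
    return loss - 0.01 * log_weight.mean()

# toy usage
disp = torch.randn(1, 3, 32, 32, 32, requires_grad=True)
log_w = torch.zeros(1, 1, 32, 32, 32, requires_grad=True)
weighted_diffusion_loss(disp, log_w).backward()
```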
{"title":"Unsupervised learning of spatially varying regularization for diffeomorphic image registration","authors":"Junyu Chen , Shuwen Wei , Yihao Liu , Zhangxing Bian , Yufan He , Aaron Carass , Harrison Bai , Yong Du","doi":"10.1016/j.media.2025.103887","DOIUrl":"10.1016/j.media.2025.103887","url":null,"abstract":"<div><div>Spatially varying regularization accommodates the deformation variations that may be necessary for different anatomical regions during deformable image registration. Historically, optimization-based registration models have harnessed spatially varying regularization to address anatomical subtleties. However, most modern deep learning-based models tend to gravitate towards spatially invariant regularization, wherein a homogenous regularization strength is applied across the entire image, potentially disregarding localized variations. In this paper, we propose a hierarchical probabilistic model that integrates a prior distribution on the deformation regularization strength, enabling the end-to-end learning of a spatially varying deformation regularizer directly from the data. The proposed method is straightforward to implement and easily integrates with various registration network architectures. Additionally, automatic tuning of hyperparameters is achieved through Bayesian optimization, allowing efficient identification of optimal hyperparameters for any given registration task. Comprehensive evaluations on publicly available datasets demonstrate that the proposed method significantly improves registration performance and enhances the interpretability of deep learning-based registration, all while maintaining smooth deformations. Our code is freely available at <span><span>http://bit.ly/3BrXGxz</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103887"},"PeriodicalIF":11.8,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145592834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extreme cardiac MRI analysis under respiratory motion: Results of the CMRxMotion challenge
Pub Date: 2025-11-22 | DOI: 10.1016/j.media.2025.103883
Kang Wang, Chen Qin, Zhang Shi, Haoran Wang, Xiwen Zhang, Chen Chen, Cheng Ouyang, Chengliang Dai, Yuanhan Mo, Chenchen Dai, Xutong Kuang, Ruizhe Li, Xin Chen, Xiuzheng Yue, Song Tian, Alejandro Mora-Rubio, Kumaradevan Punithakumar, Shizhan Gong, Qi Dou, Sina Amirrajab, Shuo Wang
Deep learning models have achieved state-of-the-art performance in automated Cardiac Magnetic Resonance (CMR) analysis. However, the efficacy of these models is highly dependent on the availability of high-quality, artifact-free images. In clinical practice, CMR acquisitions are frequently degraded by respiratory motion, yet the robustness of deep learning models against such artifacts remains an underexplored problem. To promote research in this domain, we organized the MICCAI CMRxMotion challenge. We curated and publicly released a dataset of 320 CMR cine series from 40 healthy volunteers who performed specific breathing protocols to induce a controlled spectrum of motion artifacts. The challenge comprised two tasks: 1) automated image quality assessment to classify images based on motion severity, and 2) robust myocardial segmentation in the presence of motion artifacts. A total of 22 algorithms were submitted and evaluated on the two designated tasks. This paper presents a comprehensive overview of the challenge design and dataset, reports the evaluation results for the top-performing methods, and further investigates the impact of motion artifacts on five clinically relevant biomarkers. All resources and code are publicly available at: https://github.com/CMRxMotion.
{"title":"Extreme cardiac MRI analysis under respiratory motion: Results of the CMRxMotion challenge","authors":"Kang Wang , Chen Qin , Zhang Shi , Haoran Wang , Xiwen Zhang , Chen Chen , Cheng Ouyang , Chengliang Dai , Yuanhan Mo , Chenchen Dai , Xutong Kuang , Ruizhe Li , Xin Chen , Xiuzheng Yue , Song Tian , Alejandro Mora-Rubio , Kumaradevan Punithakumar , Shizhan Gong , Qi Dou , Sina Amirrajab , Shuo Wang","doi":"10.1016/j.media.2025.103883","DOIUrl":"10.1016/j.media.2025.103883","url":null,"abstract":"<div><div>Deep learning models have achieved state-of-the-art performance in automated Cardiac Magnetic Resonance (CMR) analysis. However, the efficacy of these models is highly dependent on the availability of high-quality, artifact-free images. In clinical practice, CMR acquisitions are frequently degraded by respiratory motion, yet the robustness of deep learning models against such artifacts remains an underexplored problem. To promote research in this domain, we organized the <em>MICCAI CMRxMotion</em> challenge. We curated and publicly released a dataset of 320 CMR cine series from 40 healthy volunteers who performed specific breathing protocols to induce a controlled spectrum of motion artifacts. The challenge comprised two tasks: 1) automated image quality assessment to classify images based on motion severity, and 2) robust myocardial segmentation in the presence of motion artifacts. A total of 22 algorithms were submitted and evaluated on the two designated tasks. This paper presents a comprehensive overview of the challenge design and dataset, reports the evaluation results for the top-performing methods, and further investigates the impact of motion artifacts on five clinically relevant biomarkers. All resources and code are publicly available at: <span><span>https://github.com/CMRxMotion</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"109 ","pages":"Article 103883"},"PeriodicalIF":11.8,"publicationDate":"2025-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145567299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PL-Seg: Partially labeled abdominal organ segmentation via classwise orthogonal contrastive learning and progressive self-distillation
Pub Date: 2025-11-21 | DOI: 10.1016/j.media.2025.103885
He Li, Xiangde Luo, Jia Fu, Ran Gu, Wenjun Liao, Shichuan Zhang, Kang Li, Guotai Wang, Shaoting Zhang
Accurate segmentation of abdominal organs in Computed Tomography (CT) scans is crucial for effective lesion diagnosis, radiotherapy planning, and patient follow-up. Although deep learning has shown great performance under full supervision, it requires voxel-level dense annotations that are time-consuming and costly to obtain, especially for multiple organs. In this work, we propose PL-Seg, a novel framework for multi-organ segmentation in abdominal CT scans using partially labeled data, where only a subset of the target organ classes is annotated in each volume. First, we introduce a novel Hardness-Aware Decoupled Foreground Loss (HADFL), which focuses exclusively on annotated organs and dynamically adjusts class weights based on historical segmentation difficulty. Then, we employ a Classwise Orthogonal Contrastive Loss (COCL) to reduce inter-class ambiguity, which serves as a regularization for unlabeled regions. In addition, a Progressive Self-Distillation (PSD) scheme that distills knowledge from deep high-resolution layers to shallower low-resolution levels is proposed to improve feature learning under partial class annotations. Experiments conducted on a dataset with varying class-wise annotation ratios and a real clinical partially labeled dataset demonstrate that: 1) PL-Seg achieves significant performance improvements by leveraging unlabeled categories; 2) compared with six state-of-the-art methods, PL-Seg achieves superior results with a simpler pipeline and greater computational efficiency; and 3) under the same annotation cost, PL-Seg outperforms existing semi-supervised methods. Furthermore, we release a partially labeled medical image segmentation codebase and benchmark to boost research on this topic: https://github.com/HiLab-git/PLS4MIS.
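The decoupled, hardness-aware idea can be sketched as a partial-label Dice loss: per-class Dice is computed only for the classes annotated in each volume, and class weights track an exponential moving average of historical difficulty (1 − Dice). The masking convention, EMA factor, and weighting rule in the PyTorch snippet below are illustrative assumptions, not the exact HADFL formulation.

```python
# Partial-label, hardness-aware Dice loss: unannotated classes are masked out
# and class weights follow an EMA of historical difficulty (sketch only).
import torch

class HardnessAwarePartialDice:
    def __init__(self, n_classes, momentum=0.9, eps=1e-5):
        self.hardness = torch.ones(n_classes)   # historical difficulty per class
        self.momentum, self.eps = momentum, eps

    def __call__(self, probs, onehot, labeled_mask):
        """probs, onehot: (B, C, ...) softmax outputs and one-hot labels;
        labeled_mask: (B, C) bool, True where a class is annotated in that volume."""
        dims = tuple(range(2, probs.ndim))
        inter = (probs * onehot).sum(dims)
        union = probs.sum(dims) + onehot.sum(dims)
        dice = (2 * inter + self.eps) / (union + self.eps)          # (B, C)
        w = self.hardness.to(probs.device)[None]                    # (1, C)
        loss = ((1 - dice) * w * labeled_mask).sum() / (w * labeled_mask).sum().clamp(min=1)
        # update per-class difficulty from annotated classes only (no gradients)
        with torch.no_grad():
            nanfill = torch.full_like(dice, float('nan'))
            batch_hard = torch.where(labeled_mask, 1 - dice, nanfill).nanmean(0)
            upd = ~torch.isnan(batch_hard)
            self.hardness[upd.cpu()] = (self.momentum * self.hardness[upd.cpu()]
                                        + (1 - self.momentum) * batch_hard[upd].cpu())
        return loss

# toy usage: 2 volumes, 4 classes, only some classes annotated per volume
probs = torch.rand(2, 4, 16, 16, 16).softmax(dim=1)
onehot = torch.nn.functional.one_hot(torch.randint(0, 4, (2, 16, 16, 16)), 4).permute(0, 4, 1, 2, 3).float()
mask = torch.tensor([[True, True, False, False], [True, False, True, True]])
loss = HardnessAwarePartialDice(n_classes=4)(probs, onehot, mask)
```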
{"title":"PL-Seg: Partially labeled abdominal organ segmentation via classwise orthogonal contrastive learning and progressive self-distillation","authors":"He Li , Xiangde Luo , Jia Fu , Ran Gu , Wenjun Liao , Shichuan Zhang , Kang Li , Guotai Wang , Shaoting Zhang","doi":"10.1016/j.media.2025.103885","DOIUrl":"10.1016/j.media.2025.103885","url":null,"abstract":"<div><div>Accurate segmentation of abdominal organs in Computed Tomography (CT) scans is crucial for effective lesion diagnosis, radiotherapy planning, and patient follow-up. Although deep learning has shown great performance with fully supervised learning, it requires voxel-level dense annotations that are time-consuming and costly to obtain, especially for multiple organs. In this work, we propose a novel framework PL-Seg, for multi-organ segmentation in abdominal CT scans using partially labeled data, where only a subset of the target organ classes are annotated in each volume. First, we introduce a novel Hardness-Aware Decoupled Foreground Loss (HADFL), which focuses exclusively on annotated organs and dynamically adjusts class weights based on historical segmentation difficulty. Then, we employ a Classwise Orthogonal Contrastive Loss (COCL) to reduce inter-class ambiguity, which serves as a regularization for unlabeled regions. In addition, a Progressive Self-Distillation (PSD) that distills knowledge from deep high-resolution layers to shallower low-resolution levels is proposed to improve the feature learning ability under partial class annotations. Experiments conducted on a dataset with varying class-wise annotation ratios and a real clinical partially labeled dataset demonstrate that: 1) PL-Seg achieves significant performance improvements by leveraging unlabeled categories, 2) Compared with six state-of-the-art methods, PL-Seg achieves superior results with a simpler pipeline and greater computational efficiency, and 3) Under the same annotation cost, PL-Seg outperforms existing semi-supervised methods. Furthermore, we release a partially labeled medical image segmentation codebase and benchmark to boost research on this topic: <span><span>https://github.com/HiLab-git/PLS4MIS</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103885"},"PeriodicalIF":11.8,"publicationDate":"2025-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145568010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Direction-Aware convolution for airway tubular feature enhancement network
Pub Date: 2025-11-20 | DOI: 10.1016/j.media.2025.103882
Qibiao Wu, Yagang Wang, Qian Zhang
Manual annotation of airway regions in computed tomography images is a time-consuming and expertise-dependent task. Automatic airway segmentation is therefore a prerequisite for rapid bronchoscopic navigation and the clinical deployment of bronchoscopic robotic systems. Although convolutional neural network methods have gained considerable attention in airway segmentation, the unique tree-like structure of airways poses challenges for conventional and deformable convolutions, which often fail to focus on fine airway structures, leading to missed segments and discontinuities. To address this issue, this study proposes a novel tubular feature extraction network, named TfeNet. TfeNet introduces a direction-aware convolution operator that adapts the geometry of linear convolution kernels through spatial rotation transformations, enabling it to dynamically align with the tubular structures of airways and effectively enhance feature extraction. Furthermore, a tubular feature fusion module (TFFM) is designed based on asymmetric convolution and residual connection strategies, effectively capturing airway tubular features from different directions. Extensive experiments conducted on one public dataset and two datasets used in airway segmentation challenges demonstrate the effectiveness of TfeNet. Specifically, our method leads in both accuracy and continuity on the BAS dataset, attains the highest mean score of 94.95% on the ATM22 dataset by balancing accuracy and continuity, and demonstrates superior leakage control and precision on the challenging AIIB23 dataset. The code is available at https://github.com/QibiaoWu/TfeNet.
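A direction-aware line convolution can be sketched in 2D by applying a 1×K kernel along a rotated direction, sampling the input bilinearly at rotated offsets. The learnable angle and the grid_sample-based implementation below are illustrative assumptions, not the TfeNet operator (which targets 3D airway volumes).

```python
# 2D sketch of a direction-aware line convolution: a linear kernel is applied
# along a learnable orientation via bilinear sampling at rotated offsets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DirectionAwareLineConv(nn.Module):
    def __init__(self, channels, k=9):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(channels, k) * 0.1)  # per-channel line kernel
        self.theta = nn.Parameter(torch.zeros(1))                   # kernel orientation (radians)
        self.k = k

    def forward(self, x):
        B, C, H, W = x.shape
        offs = torch.arange(self.k, device=x.device) - (self.k - 1) / 2
        dx = offs * torch.cos(self.theta)        # rotated offsets in pixels
        dy = offs * torch.sin(self.theta)
        ys, xs = torch.meshgrid(torch.arange(H, device=x.device),
                                torch.arange(W, device=x.device), indexing='ij')
        out = torch.zeros_like(x)
        for i in range(self.k):
            gx = (xs + dx[i]) / (W - 1) * 2 - 1   # normalize sampling coords to [-1, 1]
            gy = (ys + dy[i]) / (H - 1) * 2 - 1
            grid = torch.stack([gx, gy], dim=-1)[None].expand(B, -1, -1, -1)
            sampled = F.grid_sample(x, grid, align_corners=True, padding_mode='border')
            out = out + sampled * self.weight[:, i].view(1, C, 1, 1)
        return out

# toy usage
y = DirectionAwareLineConv(channels=8)(torch.randn(2, 8, 64, 64))
```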
{"title":"Direction-Aware convolution for airway tubular feature enhancement network","authors":"Qibiao Wu , Yagang Wang , Qian Zhang","doi":"10.1016/j.media.2025.103882","DOIUrl":"10.1016/j.media.2025.103882","url":null,"abstract":"<div><div>Manual annotation of airway regions in computed tomography images is a time-consuming and expertise-dependent task. Automatic airway segmentation is therefore a prerequisite for enabling rapid bronchoscopic navigation and the clinical deployment of bronchoscopic robotic systems. Although convolutional neural network methods have gained considerable attention in airway segmentation, the unique tree-like structure of airways poses challenges for conventional and deformable convolutions, which often fail to focus on fine airway structures, leading to missed segments and discontinuities. To address this issue, this study proposes a novel tubular feature extraction network, named TfeNet. TfeNet introduces a novel direction-aware convolution operator that adapts the geometry of linear convolution kernels through spatial rotation transformations, enabling it to dynamically align with the tubular structures of airways and effectively enhance feature extraction. Furthermore, a tubular feature fusion module (TFFM) is designed based on asymmetric convolution and residual connection strategies, effectively capturing the features of airway tubules from different directions. Extensive experiments conducted on one public dataset and two datasets used in airway segmentation challenges demonstrate the effectiveness of TfeNet. Specifically, our method achieves a comprehensive lead in both accuracy and continuity on the BAS dataset, attains the highest mean score of 94.95 % on the ATM22 dataset by balancing accuracy and continuity, and demonstrates superior leakage control and precision on the challenging AIIB23 dataset. The code is available at <span><span>https://github.com/QibiaoWu/TfeNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103882"},"PeriodicalIF":11.8,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145559818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A systematic analysis of the impact of data variation on AI-based histopathological grading of prostate cancer
Pub Date: 2025-11-20 | DOI: 10.1016/j.media.2025.103884
Patrick Fuhlert, Fabian Westhaeusser, Esther Dietrich, Maximilian Lennartz, Robin Khatri, Nico Kaiser, Pontus Röbeck, Roman Bülow, Saskia Von Stillfried, Anja Witte, Sam Ladjevardi, Anders Drotte, Peter Severgårdh, Jan Baumbach, Victor G. Puelles, Michael Häggman, Michael Brehler, Peter Boor, Peter Walhagen, Anca Dragomir, Stefan Bonn
The histopathological evaluation of biopsies by human experts is a gold standard in clinical disease diagnosis. While recent artificial intelligence-based (AI) approaches have reached human expert-level performance, they often display shortcomings caused by variations in sample preparation, limiting clinical applicability. This study investigates the impact of data variation on AI-based histopathological grading and explores algorithmic approaches that confer prediction robustness. To evaluate the impact of data variation in histopathology, we collected a multicentric, retrospective, observational prostate cancer (PCa) trial consisting of six cohorts in three countries with 25,591 patients and 83,864 images. This includes a high-variance dataset of 8,157 patients and 28,236 images with variations in section thickness, staining protocol, and scanner. This unique training dataset enabled the development of an AI-based PCa grading framework trained on patient outcome rather than subjective grading. It was made robust through several algorithmic adaptations, including domain adversarial training and credibility-guided color adaptation. We named the final grading framework PCAI. We compare PCAI to a BASE model and human experts on three external test cohorts comprising 2,255 patients and 9,437 images. Variations in sample processing, particularly section thickness and staining time, significantly reduced the performance of AI-based PCa grading by up to 8.6 percentage points in the event-ordered concordance index (EOC-Index), highlighting serious risks for AI-based histopathological grading. Algorithmic improvements for model robustness and credibility, training on high-variance data, and outcome-based severity prediction give rise to robust models whose grading performance surpasses that of experienced pathologists. We demonstrate how our algorithmic enhancements for greater robustness lead to significantly better performance, surpassing expert grading on the EOC-Index and 5-year AUROC by up to 21.2 percentage points.
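Of the robustness ingredients named above, domain adversarial training is the most standard and can be sketched with a gradient reversal layer: a small domain classifier is trained on backbone features while reversed gradients push those features to become domain-invariant. The feature size, domain count, and loss weighting in the PyTorch snippet below are illustrative assumptions, not the PCAI configuration.

```python
# Domain-adversarial training via a gradient reversal layer (illustrative sketch).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        # reverse (and scale) the gradient flowing back into the feature extractor
        return -ctx.lam * grad, None

class DomainAdversarialHead(nn.Module):
    def __init__(self, feat_dim=512, n_domains=6, lam=1.0):
        super().__init__()
        self.lam = lam
        self.clf = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_domains))
    def forward(self, feats):
        return self.clf(GradReverse.apply(feats, self.lam))

# usage with hypothetical backbone features and cohort (domain) labels
feats = torch.randn(16, 512, requires_grad=True)
domain_labels = torch.randint(0, 6, (16,))
loss = nn.CrossEntropyLoss()(DomainAdversarialHead()(feats), domain_labels)
loss.backward()   # gradients into the backbone are reversed, encouraging domain-invariant features
```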
{"title":"A systematic analysis of the impact of data variation on AI-based histopathological grading of prostate cancer","authors":"Patrick Fuhlert , Fabian Westhaeusser , Esther Dietrich , Maximilian Lennartz , Robin Khatri , Nico Kaiser , Pontus Röbeck , Roman Bülow , Saskia Von Stillfried , Anja Witte , Sam Ladjevardi , Anders Drotte , Peter Severgårdh , Jan Baumbach , Victor G. Puelles , Michael Häggman , Michael Brehler , Peter Boor , Peter Walhagen , Anca Dragomir , Stefan Bonn","doi":"10.1016/j.media.2025.103884","DOIUrl":"10.1016/j.media.2025.103884","url":null,"abstract":"<div><div>The histopathological evaluation of biopsies by human experts is a gold standard in clinical disease diagnosis. While recent artificial intelligence-based (AI) approaches have reached human expert-level performance, they often display shortcomings caused by variations in sample preparation, limiting clinical applicability. This study investigates the impact of data variation on AI-based histopathological grading and explores algorithmic approaches that confer prediction robustness. To evaluate the impact of data variation in histopathology, we collected a multicentric, retrospective, observational prostate cancer (PCa) trial consisting of six cohorts in 3 countries with 25,591 patients, 83,864 images. This includes a high-variance dataset of 8,157 patients and 28,236 images with variations in section thickness, staining protocol, and scanner. This unique training dataset enabled the development of an AI-based PCa grading framework by training on patient outcome, not subjective grading. It was made robust through several algorithmic adaptations, including domain adversarial training and credibility-guided color adaptation. We named the final grading framework PCAI. We compare PCAI to a BASE model and human experts on three external test cohorts, comprising 2,255 patients and 9,437 images. Variations in sample processing, particularly section thickness and staining time, significantly reduced the performance of AI-based PCa grading by up to 8.6 percentage points in the event-ordered concordance index (EOC-Index) thus highlighting serious risks for AI-based histopathological grading. Algorithmic improvements for model robustness, credibility, and training on high-variance data as well as outcome-based severity prediction give rise to robust models with grading performance surpassing experienced pathologists. We demonstrate how our algorithmic enhancements for greater robustness lead to significantly better performance, surpassing expert grading on EOC-Index and 5-year AUROC by up to 21.2 percentage points.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103884"},"PeriodicalIF":11.8,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145559819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Openness-aware multi-prototype learning for open set medical diagnosis
Pub Date: 2025-11-19 | DOI: 10.1016/j.media.2025.103863
Mingyuan Liu, Lu Xu, Yuzhuo Gu, Jicong Zhang, Shuo Li
Unlike the prevalent image classification paradigm that assumes all samples belong to pre-defined classes, open set recognition (OSR) acknowledges that new classes unobserved during training can appear at test time. It requires a model to not only categorize known classes but also recognize unknowns. Existing prototype-based solutions model each class using a single prototype and recognize samples that are distant from these prototypes as unknowns. However, single-prototype modeling overlooks intra-class variance, leading to large open space risk. Additionally, open space regularization is ignored, allowing unknown samples to remain in their initial positions that overlap with the known space, thus impeding unknown discrimination. To address these limitations, we propose Openness-Aware Multi-Prototype Learning (OAMPL) with two novel designs: (1) Adaptive Open Multi-Prototype Formulation (AOMF) extends single-prototype modeling to a novel multi-prototype formulation. It reduces open space risk by simultaneously avoiding class underrepresentation and anticipating unknown occurrences. Additionally, AOMF incorporates a balancing term, a marginal factor, and a learnable scalar to flexibly fit intricate open environments. (2) Difficulty-Aware Openness Simulator (DAOS) dynamically synthesizes fake features at varying difficulties to represent open classes. By penalizing the proximity between fake and known features, known-unknown discrimination is enhanced. DAOS is distinguished by its joint optimization with AOMF, allowing it to cooperate with the classifier to produce samples with appropriate difficulties for effective learning. As OSR is a nascent topic in medical fields, we contribute three benchmark datasets. Compared with state-of-the-art models, our OAMPL maintains closed set accuracy and improves OSR by about 1.5% and 1.2% measured by AUROC and OSCR, respectively. Extensive ablation experiments demonstrate the effectiveness of each design.
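Multi-prototype open set scoring can be sketched as follows: each known class keeps several learnable prototypes, a sample is assigned to the class of its nearest prototype, and it is rejected as unknown when that distance exceeds a threshold. The prototype count and rejection rule in the PyTorch snippet below are illustrative assumptions and omit the AOMF balancing terms and the DAOS simulator.

```python
# Multi-prototype classification with distance-based unknown rejection (sketch).
import torch
import torch.nn as nn

class MultiPrototypeHead(nn.Module):
    def __init__(self, feat_dim=256, n_classes=5, n_protos=4):
        super().__init__()
        self.protos = nn.Parameter(torch.randn(n_classes, n_protos, feat_dim))

    def class_distances(self, feats):
        # distance of each sample to its nearest prototype of every class
        d = torch.cdist(feats, self.protos.flatten(0, 1))     # (B, n_classes * n_protos)
        d = d.view(feats.size(0), *self.protos.shape[:2])     # (B, n_classes, n_protos)
        return d.min(dim=-1).values                           # (B, n_classes)

    def predict(self, feats, tau=5.0):
        d = self.class_distances(feats)
        cls = d.argmin(dim=-1)
        cls[d.min(dim=-1).values > tau] = -1                  # -1 flags the open (unknown) class
        return cls

# toy usage: cross-entropy on negative distances keeps known classes separable
head = MultiPrototypeHead()
feats = torch.randn(8, 256)
loss = nn.functional.cross_entropy(-head.class_distances(feats), torch.randint(0, 5, (8,)))
preds = head.predict(feats)
```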
{"title":"Openness-aware multi-prototype learning for open set medical diagnosis","authors":"Mingyuan Liu , Lu Xu , Yuzhuo Gu , Jicong Zhang , Shuo Li","doi":"10.1016/j.media.2025.103863","DOIUrl":"10.1016/j.media.2025.103863","url":null,"abstract":"<div><div>Unlike the prevalent image classification paradigm that assumes all samples belong to pre-defined classes, Open set recognition (OSR) indicates that new classes unobserved during training could appear in testing. It mandates a model to not only categorize known classes but also recognize unknowns. Existing prototype-based solutions model each class using a single prototype and recognize samples that are distant from these prototypes as unknowns. However, single-prototype modeling overlooks intra-class variance, leading to large open space risk. Additionally, open space regularization is ignored, allowing unknown samples to remain in their initial positions that overlap with the known space, thus impeding unknown discrimination. To address these limitations, we propose Openness-Aware Multi-Prototype Learning (OAMPL) with two novel designs: (1) Adaptive Open Multi-Prototype Formulation (AOMF) extends single-prototype modeling to a novel multi-prototype formulation. It reduces open space risk by simultaneously avoiding class underrepresentation and anticipating unknown occurrences. Additionally, AOMF incorporates a balancing term, a marginal factor, and a learnable scalar to flexibly fit intricate open environments. (2) Difficulty Aware Openness Simulator (DAOS) dynamically synthesizes fake features at varying difficulties to represent open classes. By punishing the adjacency between the fake and the known, the known-unknown discrimination could be enhanced. DAOS is distinguished by its joint optimization with AOMF, allowing it to cooperate with the classifier to produce samples with appropriate difficulties for effective learning. As OSR is a nascent topic in medical fields, we contribute three benchmark datasets. Compared with state-of-the-art models, our OAMPL maintains closed set accuracy and achieves improvements in OSR at about 1.5 % and 1.2 % measured by AUROC and OSCR, respectively. Extensive ablation experiments demonstrate the effectiveness of each design.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"108 ","pages":"Article 103863"},"PeriodicalIF":11.8,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145553607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Multi-instance Learning Network with Prototype-instance Adversarial Contrastive for Cervix Pathology Grading
Pub Date: 2025-11-17 | DOI: 10.1016/j.media.2025.103880
Mingrui Ma, Furong Luo, Binlin Ma, Shuxian Liu, Xiaoyi Lv, Pan Huang
The pathological grading of cervical squamous cell carcinoma (CSCC) is a fundamental index in tumor diagnosis. Pathologists tend to focus on single differentiation areas during the grading process. Existing multi-instance learning (MIL) methods divide pathology images into regions, generating multiple differentiated instances (MDIs) that often exhibit ambiguous grading patterns. These ambiguities reduce the model’s ability to accurately represent CSCC pathological grading patterns. Motivated by these issues, we propose an end-to-end multi-instance learning network with prototype-instance adversarial contrastive learning, termed PacMIL, which incorporates three key ideas. First, we introduce an end-to-end multi-instance nonequilibrium learning algorithm that addresses the mismatch between MIL feature representations and CSCC pathological grading and enables nonequilibrium representation. Second, we design a prototype-instance adversarial contrastive (PAC) approach that integrates a priori prototype instances and a probability distribution attention mechanism, enhancing the model’s ability to learn representations for single differentiated instances (SDIs). Third, we incorporate an adversarial contrastive learning strategy into the PAC method to overcome the limitation that fixed metrics rarely capture the variability of MDIs and SDIs. In addition, we embed the correct metric distances of the MDIs and SDIs into the optimization objective function to further guide representation learning. Extensive experiments demonstrate that our PacMIL model achieves 93.09% and 0.9802 in mAcc and AUC, respectively, outperforming other SOTA models. Moreover, the representation ability of PacMIL is superior to that of existing SOTA approaches. Overall, our model offers enhanced practicality in CSCC pathological grading. Our code and dataset will be publicly available at https://github.com/Baron-Huang/PacMIL.
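Attention-based MIL with a prototype-instance contrastive term can be sketched as below: instance features are pooled with learned attention for bag-level grading, while a contrastive loss, weighted by attention, pulls instances toward their grade prototype and away from the others. Dimensions, temperature, and the one-prototype-per-grade setup are illustrative assumptions and do not include PacMIL’s adversarial component or nonequilibrium learning.

```python
# Attention-based MIL pooling plus a prototype-instance contrastive term (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnMIL(nn.Module):
    def __init__(self, feat_dim=512, n_grades=3):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.clf = nn.Linear(feat_dim, n_grades)
        self.protos = nn.Parameter(torch.randn(n_grades, feat_dim))   # one prototype per grade

    def forward(self, instances):                    # instances: (N, feat_dim) for one bag
        a = self.attn(instances).softmax(dim=0)      # (N, 1) attention over instances
        bag = (a * instances).sum(dim=0)             # attention-pooled bag representation
        return self.clf(bag), a

    def proto_contrastive(self, instances, a, grade, tau=0.1):
        z = F.normalize(instances, dim=-1)
        p = F.normalize(self.protos, dim=-1)
        logits = z @ p.t() / tau                     # (N, n_grades) instance-prototype similarities
        target = torch.full((z.size(0),), grade, device=z.device)
        # weight each instance's term by its attention (high-attention instances dominate)
        return (a.squeeze(1) * F.cross_entropy(logits, target, reduction='none')).sum()

# toy usage for one bag with 200 instances of grade 2
model = AttnMIL()
inst = torch.randn(200, 512)
logits, a = model(inst)
loss = F.cross_entropy(logits[None], torch.tensor([2])) + model.proto_contrastive(inst, a, grade=2)
```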
{"title":"A Multi-instance Learning Network with Prototype-instance Adversarial Contrastive for Cervix Pathology Grading","authors":"Mingrui Ma , Furong Luo , Binlin Ma , Shuxian Liu , Xiaoyi Lv , Pan Huang","doi":"10.1016/j.media.2025.103880","DOIUrl":"10.1016/j.media.2025.103880","url":null,"abstract":"<div><div>The pathological grading of cervical squamous cell carcinoma (CSCC) is a fundamental and important index in tumor diagnosis. Pathologists tend to focus on single differentiation areas during the grading process. Existing multi-instance learning (MIL) methods divide pathology images into regions, generating multiple differentiated instances (MDIs) that often exhibit ambiguous grading patterns. These ambiguities reduce the model’s ability to accurately represent CSCC pathological grading patterns. Motivated by these issues, we propose an end-to-end multi-instance learning network with prototype-instance adversarial contrastive learning, termed PacMIL, which incorporates three key ideas. First, we introduce an end-to-end multi-instance nonequilibrium learning algorithm that addresses the mismatch between MIL feature representations and CSCC pathological grading, and enables nonequilibrium representation. Second, we design a prototype-instance adversarial contrastive (PAC) approach that integrates a priori prototype instances and a probability distribution attention mechanism. This enhances the model’s ability to learn representations for single differentiated instances (SDIs). Third, we incorporate an adversarial contrastive learning strategy into the PAC method to overcome the limitation that fixed metrics rarely capture the variability of MDIs and SDIs. In addition, we embed the correct metric distances of the MDIs and SDIs into the optimization objective function to further guide representation learning. Extensive experiments demonstrate that our PacMIL model achieves 93.09% and 0.9802 for the mAcc and AUC metrics, respectively, outperforming other SOTA models. Moreover, the representation ability of PacMIL is superior to that of existing SOTA approaches. Overall, our model offers enhanced practicality in CSCC pathological grading. Our code and dataset will be publicly available at <span><span><em>https://github.com/Baron-Huang/PacMIL</em></span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"109 ","pages":"Article 103880"},"PeriodicalIF":11.8,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145535684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}