Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.
{"title":"Development of a No-Reference CT Image Quality Assessment Method Using RadImageNet Pre-trained Deep Learning Models.","authors":"Kohei Ohashi, Yukihiro Nagatani, Asumi Yamazaki, Makoto Yoshigoe, Kyohei Iwai, Ryo Uemura, Masayuki Shimomura, Kenta Tanimura, Takayuki Ishida","doi":"10.1007/s10278-025-01542-2","DOIUrl":"10.1007/s10278-025-01542-2","url":null,"abstract":"<p><p>Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"46-58"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921086/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144164187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-04-01; DOI: 10.1007/s10278-025-01492-9
Jinglan Guo, Jue Liao, Yuanlian Chen, Lisha Wen, Song Cheng
Microarray technology has become a vital tool in cardiovascular research, enabling the simultaneous analysis of thousands of gene expressions. This capability provides a robust foundation for heart disease classification and biomarker discovery. However, the high dimensionality, noise, and sparsity of microarray data present significant challenges for effective analysis. Gene selection, which aims to identify the most relevant subset of genes, is a crucial preprocessing step for improving classification accuracy, reducing computational complexity, and enhancing biological interpretability. Traditional gene selection methods often fall short in capturing complex, nonlinear interactions among genes, limiting their effectiveness in heart disease classification tasks. In this study, we propose a novel framework that leverages deep neural networks (DNNs) for optimizing gene selection and heart disease classification using microarray data. DNNs, known for their ability to model complex, nonlinear patterns, are integrated with feature selection techniques to address the challenges of high-dimensional data. The proposed method, DeepGeneNet (DGN), combines gene selection and DNN-based classification into a unified framework, ensuring robust performance and meaningful insights into the underlying biological mechanisms. Additionally, the framework incorporates hyperparameter optimization and innovative U-Net segmentation techniques to further enhance computational performance and classification accuracy. These optimizations enable DGN to deliver robust and scalable results, outperforming traditional methods in both predictive accuracy and interpretability. Experimental results demonstrate that the proposed approach significantly improves heart disease classification accuracy compared to other methods. By focusing on the interplay between gene selection and deep learning, this work advances the field of cardiovascular genomics, providing a scalable and interpretable framework for future applications.
{"title":"New Machine Learning Method for Medical Image and Microarray Data Analysis for Heart Disease Classification.","authors":"Jinglan Guo, Jue Liao, Yuanlian Chen, Lisha Wen, Song Cheng","doi":"10.1007/s10278-025-01492-9","DOIUrl":"10.1007/s10278-025-01492-9","url":null,"abstract":"<p><p>Microarray technology has become a vital tool in cardiovascular research, enabling the simultaneous analysis of thousands of gene expressions. This capability provides a robust foundation for heart disease classification and biomarker discovery. However, the high dimensionality, noise, and sparsity of microarray data present significant challenges for effective analysis. Gene selection, which aims to identify the most relevant subset of genes, is a crucial preprocessing step for improving classification accuracy, reducing computational complexity, and enhancing biological interpretability. Traditional gene selection methods often fall short in capturing complex, nonlinear interactions among genes, limiting their effectiveness in heart disease classification tasks. In this study, we propose a novel framework that leverages deep neural networks (DNNs) for optimizing gene selection and heart disease classification using microarray data. DNNs, known for their ability to model complex, nonlinear patterns, are integrated with feature selection techniques to address the challenges of high-dimensional data. The proposed method, DeepGeneNet (DGN), combines gene selection and DNN-based classification into a unified framework, ensuring robust performance and meaningful insights into the underlying biological mechanisms. Additionally, the framework incorporates hyperparameter optimization and innovative U-Net segmentation techniques to further enhance computational performance and classification accuracy. These optimizations enable DGN to deliver robust and scalable results, outperforming traditional methods in both predictive accuracy and interpretability. Experimental results demonstrate that the proposed approach significantly improves heart disease classification accuracy compared to other methods. By focusing on the interplay between gene selection and deep learning, this work advances the field of cardiovascular genomics, providing a scalable and interpretable framework for future applications.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"884-907"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143766305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-04-08; DOI: 10.1007/s10278-025-01493-8
Jamie Chow, Ryan Lee, Honghan Wu
Artificial intelligence (AI) in radiology is becoming increasingly prevalent; however, there is not a clear picture of how AI is being monitored today or how monitoring should practically be done, given the inherent risk that AI model performance degrades over time. This research investigates current practices and the difficulties radiologists face in monitoring AI. Semi-structured virtual interviews were conducted with 6 USA-based and 10 Europe-based radiologists. The interviews were automatically transcribed and underwent thematic analysis. The findings suggest that AI monitoring in radiology is still relatively nascent, as most AI projects had not yet progressed to a fully live clinical deployment. The most common method of monitoring involved a manual process of retrospectively comparing the AI results against the radiology report. Automated and statistical methods of monitoring were much less common. The biggest challenges are a lack of resources to support AI monitoring and uncertainty about how to create a robust and scalable process for monitoring the breadth and variety of radiology AI applications available. There is currently a lack of practical guidelines on how to monitor AI, which has led to a variety of approaches being proposed by both healthcare providers and vendors. An ensemble of mixed methods is recommended to monitor AI across multiple domains and metrics. This will be enabled by appropriate allocation of resources and the formation of robust and diverse multidisciplinary AI governance groups.
{"title":"How Do Radiologists Currently Monitor AI in Radiology and What Challenges Do They Face? An Interview Study and Qualitative Analysis.","authors":"Jamie Chow, Ryan Lee, Honghan Wu","doi":"10.1007/s10278-025-01493-8","DOIUrl":"10.1007/s10278-025-01493-8","url":null,"abstract":"<p><p>Artificial intelligence (AI) in radiology is becoming increasingly prevalent; however, there is not a clear picture of how AI is being monitored today and how this should practically be done given the inherent risk of AI model performance degradation over time. This research investigates current practices and what difficulties radiologists face in monitoring AI. Semi-structured virtual interviews were conducted with 6 USA and 10 Europe-based radiologists. The interviews were automatically transcribed and underwent thematic analysis. The findings suggest that AI monitoring in radiology is still relatively nascent as most of the AI projects had not yet progressed into a fully live clinical deployment. The most common method of monitoring involved a manual process of retrospectively comparing the AI results against the radiology report. Automated and statistical methods of monitoring were much less common. The biggest challenges are a lack of resources to support AI monitoring and uncertainty about how to create a robust and scalable process of monitoring the breadth and variety of radiology AI applications available. There is currently a lack of practical guidelines on how to monitor AI which has led to a variety of approaches being proposed from both healthcare providers and vendors. An ensemble of mixed methods is recommended to monitor AI across multiple domains and metrics. This will be enabled by appropriate allocation of resources and the formation of robust and diverse multidisciplinary AI governance groups.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"6-19"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920929/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compared with non-functional pituitary neuroendocrine tumors (NF-PitNETs), posterior pituitary tumors (PPTs) require more intraoperative protection of the pituitary stalk and hypothalamus, and their perioperative management is more complex. However, the two entities are difficult to distinguish on preoperative magnetic resonance imaging (MRI). Based on clinical features and a radiomic signature extracted from MRI, this study aims to establish a model for distinguishing NF-PitNETs from PPTs. Preoperative MRI of 110 patients with NF-PitNETs and 55 patients with PPTs were retrospectively obtained. Patients were randomly assigned to the training (n = 110) and validation (n = 55) cohorts in a 2:1 ratio. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to develop a radiomic signature. Afterwards, an individualized predictive model (nomogram) incorporating the radiomic signature and predictive clinical features was developed. The nomogram's performance was evaluated by calibration and decision curve analyses. Five features derived from contrast-enhanced images were selected using the LASSO algorithm, and a radiomic score formula was derived from the selected features and their coefficients. The constructed nomogram incorporating the radiomic signature and predictive clinical features showed good calibration and outperformed the clinical features alone for predicting NF-PitNETs and PPTs (area under the curve [AUC]: 0.937 vs. 0.595 in the training cohort [p < 0.001]; 0.907 vs. 0.782 in the validation cohort [p = 0.03]). The decision curve shows that the individualized predictive model adds more benefit than the clinical features when the threshold probability ranges from 10 to 100%. The individualized predictive model provides a novel noninvasive imaging biomarker and could be conveniently used to distinguish NF-PitNETs from PPTs, providing a significant reference for preoperative preparation and intraoperative decision-making.
{"title":"Preoperative Prediction of Non-functional Pituitary Neuroendocrine Tumors and Posterior Pituitary Tumors Based on MRI Radiomic Features.","authors":"Shucheng Jin, Qin Xu, Chen Sun, Yuan Zhang, Yangyang Wang, Xi Wang, Xiudong Guan, Deling Li, Yiming Li, Chuanbao Zhang, Wang Jia","doi":"10.1007/s10278-025-01400-1","DOIUrl":"10.1007/s10278-025-01400-1","url":null,"abstract":"<p><p>Compared to non-functional pituitary neuroendocrine tumors (NF-PitNETs), posterior pituitary tumors (PPTs) require more intraoperative protection of the pituitary stalk and hypothalamus, and their perioperative management is more complex than NF-PitNETs. However, they are difficult to be distinguished via magnetic resonance images (MRI) before operation. Based on clinical features and radiological signature extracted from MRI, this study aims to establish a model for distinguishing NF-PitNETs and PPTs. Preoperative MRI of 110 patients with NF-PitNETs and 55 patients with PPTs were retrospectively obtained. Patients were randomly assigned to the training (n = 110) and validation (n = 55) cohorts in a 2:1 ratio. The lest absolute shrinkage and selection operator (LASSO) algorithm was applied to develop a radiomic signature. Afterwards, an individualized predictive model (nomogram) incorporating radiomic signatures and predictive clinical features was developed. The nomogram's performance was evaluated by calibration and decision curve analyses. Five features derived from contrast-enhanced images were selected using the LASSO algorithm. Based on the mentioned methods, the calculation formula of radiomic score was obtained. The constructed nomogram incorporating radiomic signature and predictive clinical features showed a good calibration and outperformed the clinical features for predicting NF-PitNETs and PPTs (area under the curve [AUC]: 0.937 vs. 0.595 in training cohort [p < 0.001]; 0.907 vs. 0.782 in validation cohort [p = 0.03]). The decision curve shows that the individualized predictive model adds more benefit than clinical feature when the threshold probability ranges from 10 to 100%. Individualized predictive model provides a novel noninvasive imaging biomarker and could be conveniently used to distinguish NF-PitNETs and PPTs, which provides a significant reference for preoperative preparation and intraoperative decision-making.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"115-126"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920986/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144056627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-05-08; DOI: 10.1007/s10278-025-01523-5
Mana Moassefi, Sina Houshmand, Shahriar Faghani, Peter D Chang, Shawn H Sun, Bardia Khosravi, Aakash G Triphati, Ghulam Rasool, Neil K Bhatia, Les Folio, Katherine P Andriole, Judy W Gichoya, Bradley J Erickson
The rapid evolution of large language models (LLMs) offers promising opportunities for radiology report annotation, aiding in determining the presence of specific findings. This study evaluates the effectiveness of a human-optimized prompt in labeling radiology reports across multiple institutions using LLMs. Six distinct institutions collected 500 radiology reports: 100 in each of 5 categories. A standardized Python script was distributed to participating sites, allowing the use of one common locally executed LLM with a standard human-optimized prompt. The script executed the LLM's analysis for each report and compared predictions to reference labels provided by local investigators. Model performance was measured as accuracy, and results were aggregated centrally. The human-optimized prompt demonstrated high consistency across sites and pathologies. Preliminary analysis indicates significant agreement between the LLM's outputs and the investigator-provided reference labels across multiple institutions. At one site, eight LLMs were systematically compared, with Llama 3.1 70b achieving the highest performance in accurately identifying the specified findings. Comparable performance with Llama 3.1 70b was observed at two additional centers, demonstrating the model's robust adaptability to variations in report structures and institutional practices. Our findings illustrate the potential of optimized prompt engineering in leveraging LLMs for cross-institutional radiology report labeling. This approach is straightforward while maintaining high accuracy and adaptability. Future work will explore model robustness to diverse report structures and further refine prompts to improve generalizability.
{"title":"Cross-Institutional Evaluation of Large Language Models for Radiology Diagnosis Extraction: A Prompt-Engineering Perspective.","authors":"Mana Moassefi, Sina Houshmand, Shahriar Faghani, Peter D Chang, Shawn H Sun, Bardia Khosravi, Aakash G Triphati, Ghulam Rasool, Neil K Bhatia, Les Folio, Katherine P Andriole, Judy W Gichoya, Bradley J Erickson","doi":"10.1007/s10278-025-01523-5","DOIUrl":"10.1007/s10278-025-01523-5","url":null,"abstract":"<p><p>The rapid evolution of large language models (LLMs) offers promising opportunities for radiology report annotation, aiding in determining the presence of specific findings. This study evaluates the effectiveness of a human-optimized prompt in labeling radiology reports across multiple institutions using LLMs. Six distinct institutions collected 500 radiology reports: 100 in each of 5 categories. A standardized Python script was distributed to participating sites, allowing the use of one common locally executed LLM with a standard human-optimized prompt. The script executed the LLM's analysis for each report and compared predictions to reference labels provided by local investigators. Models' performance using accuracy was calculated, and results were aggregated centrally. The human-optimized prompt demonstrated high consistency across sites and pathologies. Preliminary analysis indicates significant agreement between the LLM's outputs and investigator-provided reference across multiple institutions. At one site, eight LLMs were systematically compared, with Llama 3.1 70b achieving the highest performance in accurately identifying the specified findings. Comparable performance with Llama 3.1 70b was observed at two additional centers, demonstrating the model's robust adaptability to variations in report structures and institutional practices. Our findings illustrate the potential of optimized prompt engineering in leveraging LLMs for cross-institutional radiology report labeling. This approach is straightforward while maintaining high accuracy and adaptability. Future work will explore model robustness to diverse report structures and further refine prompts to improve generalizability.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"989-994"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920939/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144059653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-05-08; DOI: 10.1007/s10278-025-01535-1
Oona Rainio, Riku Klén
The Sørensen-Dice similarity coefficient (DSC) is the most common evaluation metric used for image segmentation, but it is not always ideal. Namely, DSC values depend only on the number of misplaced elements, not on their location with respect to the correct segments. Because of this, the DSC is ill-suited for tasks where the correct location of an object's borders is difficult to define objectively, as is the case in tumor segmentation in positron emission tomography (PET) images. To avoid this issue, we introduce two different modifications of the DSC, one with weights and one with an additional loss term, which also evaluate the distance between the real and the predicted segments. We computed the values of the DSC and our new coefficient from 191 predicted tumor segmentation masks created using PET images of 89 head and neck squamous cell carcinoma patients. We compared the values of all three coefficients with the scores given to these masks by human evaluators. According to our results, the weighted modification of the DSC had a higher correlation with the scores given by the human evaluators than the original DSC, and it also produced significantly less variation within the two highest score classes (p-value ≤ 0.018). The new weighted coefficient introduced here has much potential in the evaluation of segmentation results from medical imaging.
{"title":"Modified Dice Coefficients for Evaluation of Tumor Segmentation from PET Images: A Proof-of-Concept Study.","authors":"Oona Rainio, Riku Klén","doi":"10.1007/s10278-025-01535-1","DOIUrl":"10.1007/s10278-025-01535-1","url":null,"abstract":"<p><p>The Sørensen-Dice similarity coefficient (DSC) is the most common evaluation metric used for image segmentation but it is not always ideal. Namely, the DSC values only depend on the number of misplaced elements instead of their location with respect to the correct segments. Because of this, the DSC is ill-suited for such tasks where the correct location of the borders of an object is difficult to define in an objective way, as is the case in tumor segmentation in positron emission tomography (PET) images. To avoid this issue, we introduce two different modifications of the DSC, one with weights and one with an additional loss term, which also evaluate the distance between the real and the predicted segments. We computed the values of DSC and our new coefficient from 191 predicted tumor segmentation masks created by using PET images of 89 head and neck squamous cell carcinoma patients. We compared the values of all three coefficients with the scores given to these masks by human evaluators. According to our results, the weighted modification of DSC had a higher correlation with the scores given by the human evaluators than the original DSC, and it also produced significantly less variation within the two highest score classes (p-value <math><mo>≤</mo></math> 0.018). The new weighted coefficient introduced here has much potential in the evaluation of segmentation results from medical imaging.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"785-793"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920985/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144061588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-02-12; DOI: 10.1007/s10278-025-01431-8
Francesca Pia Villani, Maria Chiara Fiorentino, Lorenzo Federici, Cesare Piazza, Emanuele Frontoni, Alberto Paderno, Sara Moccia
Accurate vocal fold (VF) pose estimation is crucial for diagnosing larynx diseases that can eventually lead to VF paralysis. Videoendoscopic examination is used to assess VF motility, usually by estimating the change in the anterior glottic angle (AGA). This is a subjective and time-consuming procedure requiring extensive expertise. This research proposes a deep learning framework to estimate VF pose from laryngoscopy frames acquired in actual clinical practice. The framework performs heatmap regression on three anatomically relevant keypoints, which serve as a prior for AGA computation; the AGA is estimated from the coordinates of the predicted points. The proposed framework is assessed using a newly collected dataset of 471 laryngoscopy frames from 124 patients, 28 of whom had cancer. The framework was tested in various configurations and compared with other state-of-the-art approaches (direct keypoint regression and glottal segmentation) for both pose estimation and AGA evaluation. The proposed framework obtained the lowest root mean square error (RMSE) on all the keypoints (5.09, 6.56, and 6.40 pixels, respectively) among all the models tested for VF pose estimation. Also for the AGA evaluation, heatmap regression reached the lowest mean average error (MAE) (5.87°). The results show that relying on keypoint heatmap regression allows VF pose estimation with a small error, overcoming drawbacks of state-of-the-art algorithms, especially in challenging images involving pathologic subjects, noise, and occlusion.
{"title":"A Deep-Learning Approach for Vocal Fold Pose Estimation in Videoendoscopy.","authors":"Francesca Pia Villani, Maria Chiara Fiorentino, Lorenzo Federici, Cesare Piazza, Emanuele Frontoni, Alberto Paderno, Sara Moccia","doi":"10.1007/s10278-025-01431-8","DOIUrl":"10.1007/s10278-025-01431-8","url":null,"abstract":"<p><p>Accurate vocal fold (VF) pose estimation is crucial for diagnosing larynx diseases that can eventually lead to VF paralysis. The videoendoscopic examination is used to assess VF motility, usually estimating the change in the anterior glottic angle (AGA). This is a subjective and time-consuming procedure requiring extensive expertise. This research proposes a deep learning framework to estimate VF pose from laryngoscopy frames acquired in the actual clinical practice. The framework performs heatmap regression relying on three anatomically relevant keypoints as a prior for AGA computation, which is estimated from the coordinates of the predicted points. The assessment of the proposed framework is performed using a newly collected dataset of 471 laryngoscopy frames from 124 patients, 28 of whom with cancer. The framework was tested in various configurations and compared with other state-of-the-art approaches (direct keypoints regression and glottal segmentation) for both pose estimation, and AGA evaluation. The proposed framework obtained the lowest root mean square error (RMSE) computed on all the keypoints (5.09, 6.56, and 6.40 pixels, respectively) among all the models tested for VF pose estimation. Also for the AGA evaluation, heatmap regression reached the lowest mean average error (MAE) ( <math><mrow><mn>5</mn> <mo>.</mo> <msup><mn>87</mn> <mo>∘</mo></msup> </mrow> </math> ). Results show that relying on keypoints heatmap regression allows to perform VF pose estimation with a small error, overcoming drawbacks of state-of-the-art algorithms, especially in challenging images such as pathologic subjects, presence of noise, and occlusion.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"842-852"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920861/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143412210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-04-03; DOI: 10.1007/s10278-025-01467-w
Yuwen Ning, Jiaxin Li, Shuyi Sun
Diabetic retinopathy (DR) is a significant vision-threatening condition, necessitating accurate and efficient automated screening methods. Traditional deep learning (DL) models struggle to detect subtle lesions and also suffer from high computational complexity. Existing models primarily mimic the primary visual cortex (V1) of the human visual system, neglecting other higher-order processing regions. To overcome these limitations, this research introduces the vision core-adapted network-based crossover osprey algorithm (VCANet-COP) for subtle lesion recognition with better computational efficiency. The model integrates sparse autoencoders (SAEs) to extract vascular structures and lesion-specific features at a pixel level for improved abnormality detection. The front-end network in the VCANet emulates the V1, V2, V4, and inferotemporal (IT) regions to derive subtle lesions effectively and improve lesion detection accuracy. Additionally, the COP algorithm, which leverages the osprey optimization algorithm (OOA) with a crossover strategy, optimizes hyperparameters and network configurations to ensure better computational efficiency, faster convergence, and enhanced performance in lesion recognition. The experimental assessment of the VCANet-COP model on multiple DR datasets, namely Diabetic_Retinopathy_Data (DR-Data), the Structured Analysis of the Retina (STARE) dataset, the Indian Diabetic Retinopathy Image Dataset (IDRiD), the Digital Retinal Images for Vessel Extraction (DRIVE) dataset, and the Retinal fundus multi-disease image dataset (RFMID), demonstrates superior performance over baseline works (EDLDR, FFU_Net, LSTM_MFORG, fundus-DeepNet, and CNN_SVD), achieving average outcomes of 98.14% accuracy, 97.9% sensitivity, 98.08% specificity, 98.4% precision, 98.1% F1-score, 96.2% kappa coefficient, 2.0% false positive rate (FPR), 2.1% false negative rate (FNR), and a 1.5-s execution time. By addressing critical limitations, VCANet-COP provides a scalable and robust solution for real-world DR screening and clinical decision support.
{"title":"Advancing Visual Perception Through VCANet-Crossover Osprey Algorithm: Integrating Visual Technologies.","authors":"Yuwen Ning, Jiaxin Li, Shuyi Sun","doi":"10.1007/s10278-025-01467-w","DOIUrl":"10.1007/s10278-025-01467-w","url":null,"abstract":"<p><p>Diabetic retinopathy (DR) is a significant vision-threatening condition, necessitating accurate and efficient automated screening methods. Traditional deep learning (DL) models struggle to detect subtle lesions and also suffer from high computational complexity. Existing models primarily mimic the primary visual cortex (V1) of the human visual system, neglecting other higher-order processing regions. To overcome these limitations, this research introduces the vision core-adapted network-based crossover osprey algorithm (VCANet-COP) for subtle lesion recognition with better computational efficiency. The model integrates sparse autoencoders (SAEs) to extract vascular structures and lesion-specific features at a pixel level for improved abnormality detection. The front-end network in the VCANet emulates the V1, V2, V4, and inferotemporal (IT) regions to derive subtle lesions effectively and improve lesion detection accuracy. Additionally, the COP algorithm leveraging the osprey optimization algorithm (OOA) with a crossover strategy optimizes hyperparameters and network configurations to ensure better computational efficiency, faster convergence, and enhanced performance in lesion recognition. The experimental assessment of the VCANet-COP model on multiple DR datasets namely Diabetic_Retinopathy_Data (DR-Data), Structured Analysis of the Retina (STARE) dataset, Indian Diabetic Retinopathy Image Dataset (IDRiD), Digital Retinal Images for Vessel Extraction (DRIVE) dataset, and Retinal fundus multi-disease image dataset (RFMID) demonstrates superior performance over baseline works, namely EDLDR, FFU_Net, LSTM_MFORG, fundus-DeepNet, and CNN_SVD by achieving average outcomes of 98.14% accuracy, 97.9% sensitivity, 98.08% specificity, 98.4% precision, 98.1% F1-score, 96.2% kappa coefficient, 2.0% false positive rate (FPR), 2.1% false negative rate (FNR), and 1.5-s execution time. By addressing critical limitations, VCANet-COP provides a scalable and robust solution for real-world DR screening and clinical decision support.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"669-698"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920876/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143782301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compared to desktop fundus cameras, handheld ones offer portability and affordability, although they often produce lower-quality images. This paper primarily addresses the reduced image quality commonly associated with images captured by handheld fundus cameras. We first collected 538 fundus images obtained from handheld devices to form a dataset called Mule. A unified framework consisting of three main modules is then proposed to enhance the quality of fundus images. The Light Balance Module is employed first to suppress overexposure and underexposure. This is followed by the Super Resolution Module to enhance vascular details. Finally, the Vessel Enhancement Module is applied to improve image contrast. A dedicated preservation strategy is additionally applied to retain macular features in the final fundus image. Objective evaluations demonstrate that the proposed framework yields the most promising results. Further experiments also suggest that it improves accuracy in downstream tasks, such as vessel segmentation, optic disc/optic cup detection, macula detection, and fundus image quality assessment. Our code is available at: https://github.com/Alen880/UFELQ.
{"title":"Unified Framework for Enhancement of Low-Quality Fundus Images.","authors":"Lihua Ding, Chengyi Zhang, Xingzheng Lyu, Deji Cheng, Shuchang Xu","doi":"10.1007/s10278-025-01509-3","DOIUrl":"10.1007/s10278-025-01509-3","url":null,"abstract":"<p><p>Compared to desktop fundus cameras, handheld ones offer portability and affordability, although they often produce lower-quality images. This paper primarily addresses the issue of reduced image quality commonly associated with images captured by handheld fundus cameras. We first collected 538 fundus images obtained from handheld devices to form a dataset called Mule. A unified framework that consists of three main modules is then proposed to enhance the quality of fundus images. The Light Balance Module is employed first to suppress overexposure and underexposure. This is followed by the Super Resolution Module to enhance vascular details. Finally, the Vessel Enhancement Module is applied to improve image contrast. And a special preservation strategy is additionally applied to retain mocular features in the final fundus image. Objective evaluations demonstrate that the proposed framework yields the most promising results. Further experiments also suggest that it improves accuracy in downstream tasks, such as vessel segmentation, optic disc/optic cup detection, macula detection, and fundus image quality assessment. Our code is available at: https://github.com/Alen880/UFELQ.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"699-713"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921125/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01; Epub Date: 2025-05-13; DOI: 10.1007/s10278-025-01524-4
Sarra Kharbech, Nabil Sherif Mahmood, Ma'mon Qasem, Julien Abinahed, Amal Alobadli, Mohamed Abunada, Omar Aboumarzouk, Abdulla Al Ansari, Shidin Balakrishnan, Nikhil Navkar, Adham Darweesh
Perianal fistula is a complex condition in which surgeons operate based on the mental map they build from the information in the radiology report. If not properly treated, a fistula can recur. To reduce the chance of recurrence, a patient-specific, visual, and accurate depiction of the internal tracts in relation to the pelvic floor is required. A three-dimensional (3D) parametric model generation software was previously developed and evaluated successfully with radiologists. In this paper, the software output is evaluated with two colorectal surgeons for 10 fistula cases. The paper compares three different reporting modes: (1) 3D models only, (2) conventional radiology report and picture archiving and communication system (PACS) magnetic resonance (MR) images, and (3) 3D models + standardized radiology report. The percentage of agreement between surgeons across cases and cognitive load are the primary evaluation metrics. Mode 3 superseded both modes 1 and 2, meaning that surgeons prefer to see a 3D model along with a standardized report when planning a case's surgical intervention. Mode 1 superseded mode 2, which also shows surgeons' preference for inspecting a 3D model rather than reviewing cases the conventional way. Surgeons' agreement across cases was 85% in mode 3, whereas it was 18% and 5% in modes 1 and 2, respectively, showing that information was conveyed more consistently across surgeons in mode 3. NASA TLX tests show that surgeons had the lowest cognitive load while working with mode 3, followed by mode 1 and then mode 2. Overall, the findings indicate that 3D models, even without radiologists' written input, outperform the current standard practice of delivering unstructured radiology reports alongside raw PACS images.
{"title":"Evaluation of Reporting Methods for Assessment and Surgical Planning of Perianal Fistulas.","authors":"Sarra Kharbech, Nabil Sherif Mahmood, Ma'mon Qasem, Julien Abinahed, Amal Alobadli, Mohamed Abunada, Omar Aboumarzouk, Abdulla Al Ansari, Shidin Balakrishnan, Nikhil Navkar, Adham Darweesh","doi":"10.1007/s10278-025-01524-4","DOIUrl":"10.1007/s10278-025-01524-4","url":null,"abstract":"<p><p>Perianal fistula is a complex condition where surgeons conduct surgeries based on the mentally mapped images they created from the information found in the radiology report. If not properly treated, a fistula could reoccur. To reduce the chances of reoccurrence, a patient-specific, visual, and accurate depiction of the internal tracts in relation to the pelvic floor is required. A three-dimensional (3D) parametric model generation software was previously developed and evaluated successfully with radiologists. In this paper, the software output is evaluated with two colorectal surgeons for 10 fistula cases. The paper compares three reporting different modes: (1) 3D models only, (2) conventional radiology report and picture archiving and communication system (PACS) magnetic resonance (MR) images, and (3) 3D models + standardized radiology report. The percentage of agreement between surgeons across cases and cognitive load are the primary metrics used for evaluation. Mode 3 superseded both modes 1 and 2, meaning that surgeons prefer to see a 3D model along with a standardized report to plan a case's surgical intervention. Mode 1 superseded mode 2, which also shows surgeons preference to inspect a 3D model rather than inspecting cases the conventional way. Surgeons' agreement in opinions across cases in mode 3 was 85%, whereas it was 18% and 5% in mode 1 and mode 2, respectively. This shows that information was conveyed more consistently across surgeons in mode 3. NASA TLX tests show that surgeons had the least cognitive load while working with mode 3, followed by mode 1 and then mode 2. Overall, the findings indicate that 3D models, even without radiologists' written input, outperform the current standard practice of delivering unstructured radiology reports alongside raw PACS images.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"20-33"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921070/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}