Chronic kidney disease (CKD) remains a major public health concern, requiring better predictive models for early intervention. This study evaluates a deep learning model (DLM) that utilizes raw chest X-ray (CXR) data to predict moderate to severe kidney function decline. We analyzed data from 79,219 patients with an estimated glomerular filtration rate (eGFR) between 65 and 120 mL/min/1.73 m², segmented into development (n = 37,983), tuning (n = 15,346), internal validation (n = 14,113), and external validation (n = 11,777) sets. Our DLM, pretrained on CXR-report pairs, was fine-tuned with the development set. We retrospectively examined data spanning April 2011 to February 2022, with a 5-year maximum follow-up. Primary and secondary endpoints included CKD stage 3b progression, ESRD/dialysis, and mortality. The overall concordance index (C-index) values for the internal and external validation sets were 0.903 (95% CI, 0.885-0.922) and 0.851 (95% CI, 0.819-0.883), respectively. In these sets, the incidences of progression to CKD stage 3b at 5 years were 19.2% and 13.4% in the high-risk group, significantly higher than those in the median-risk (5.9% and 5.1%) and low-risk groups (0.9% and 0.9%), respectively. The sex-, age-, and eGFR-adjusted hazard ratios (HRs) for the high-risk group compared to the low-risk group were 16.88 (95% CI, 10.84-26.28) and 7.77 (95% CI, 4.77-12.64), respectively. The high-risk group also exhibited higher probabilities of progressing to ESRD/dialysis or experiencing mortality compared to the low-risk group. Further analysis revealed that the high-risk group had a higher prevalence of complications and abnormal blood/urine markers than the low/median-risk groups. Our findings demonstrate that a DLM utilizing CXR can effectively predict CKD stage 3b progression, offering a potential tool for early intervention in high-risk populations.
{"title":"Prediction of Future Risk of Moderate to Severe Kidney Function Loss Using a Deep Learning Model-Enabled Chest Radiography.","authors":"Kai-Chieh Chen, Shang-Yang Lee, Dung-Jang Tsai, Kai-Hsiung Ko, Yi-Chih Hsu, Wei-Chou Chang, Wen-Hui Fang, Chin Lin, Yu-Juei Hsu","doi":"10.1007/s10278-025-01489-4","DOIUrl":"10.1007/s10278-025-01489-4","url":null,"abstract":"<p><p>Chronic kidney disease (CKD) remains a major public health concern, requiring better predictive models for early intervention. This study evaluates a deep learning model (DLM) that utilizes raw chest X-ray (CXR) data to predict moderate to severe kidney function decline. We analyzed data from 79,219 patients with an estimated Glomerular Filtration Rate (eGFR) between 65 and 120, segmented into development (n = 37,983), tuning (n = 15,346), internal validation (n = 14,113), and external validation (n = 11,777) sets. Our DLM, pretrained on CXR-report pairs, was fine-tuned with the development set. We retrospectively examined data spanning April 2011 to February 2022, with a 5-year maximum follow-up. Primary and secondary endpoints included CKD stage 3b progression, ESRD/dialysis, and mortality. The overall concordance index (C-index) values for the internal and external validation sets were 0.903 (95% CI, 0.885-0.922) and 0.851 (95% CI, 0.819-0.883), respectively. In these sets, the incidences of progression to CKD stage 3b at 5 years were 19.2% and 13.4% in the high-risk group, significantly higher than those in the median-risk (5.9% and 5.1%) and low-risk groups (0.9% and 0.9%), respectively. The sex, age, and eGFR-adjusted hazard ratios (HR) for the high-risk group compared to the low-risk group were 16.88 (95% CI, 10.84-26.28) and 7.77 (95% CI, 4.77-12.64), respectively. The high-risk group also exhibited higher probabilities of progressing to ESRD/dialysis or experiencing mortality compared to the low-risk group. 
Further analysis revealed that the high-risk group compared to the low/median-risk group had a higher prevalence of complications and abnormal blood/urine markers. Our findings demonstrate that a DLM utilizing CXR can effectively predict CKD stage 3b progression, offering a potential tool for early intervention in high-risk populations.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"454-467"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920974/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143775244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
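The abstract above uses Harrell's concordance index (C-index) as its headline discrimination metric. As a minimal illustration (not the authors' code), a hypothetical `c_index` helper can be sketched in pure Python: among comparable pairs under right censoring, it counts how often the patient with the higher risk score experiences the event earlier.

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance index.

    times: observed follow-up times; events: 1 = event occurred, 0 = censored.
    A pair (i, j) is comparable only when i had the event and i's time is
    shorter than j's observed time; ties in score count as half-concordant.
    """
    concordant, tied, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable

# toy cohort (illustrative values only): higher score = earlier progression
times = [2, 5, 4, 8]           # years to event or censoring
events = [1, 0, 1, 1]          # 0 = censored at that time
scores = [0.9, 0.1, 0.7, 0.2]  # model risk scores
```

With these toy values every comparable pair is correctly ordered, so the C-index is 1.0; real-world values such as the 0.903 reported above arise from partially discordant cohorts.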
Pub Date: 2026-02-01. Epub Date: 2025-04-08. DOI: 10.1007/s10278-025-01495-6
Hanyue Mo, Ziwen Kuang, Haoxuan Wang, Xinyi Cai, Kun Cheng
Accurate classification of burn severity is crucial for effective clinical treatment; however, existing methods often fail to balance precision and real-time performance. To address this challenge, we propose a deep learning-based approach utilizing an enhanced ResNet18 architecture with integrated attention mechanisms to improve classification accuracy. The system consists of data preprocessing, classification, optimization, and post-processing modules. The optimization strategy employs an adaptive learning rate combining cosine annealing and class-specific gradient adaptation, alongside targeted adjustments for class imbalance, while an improved Adam optimizer enhances convergence stability. Post-processing incorporates confidence filtering (threshold 0.3) and selective evaluation, with weighted aggregation that integrates dynamic accuracy calculation and a moving average to refine predictions and enhance diagnostic reliability. Experimental results on a burn skin test dataset demonstrate that the proposed model achieves a classification accuracy of 99.19% ± 0.12 and a mean average precision (mAP) of 98.72% ± 0.10, highlighting its potential for real-time clinical burn assessment.
{"title":"Enhancing Burn Diagnosis through SE-ResNet18 and Confidence Filtering.","authors":"Hanyue Mo, Ziwen Kuang, Haoxuan Wang, Xinyi Cai, Kun Cheng","doi":"10.1007/s10278-025-01495-6","DOIUrl":"10.1007/s10278-025-01495-6","url":null,"abstract":"<p><p>Accurate classification of burn severity is crucial for effective clinical treatment; however, existing methods often fail to balance precision and real-time performance. To address this challenge, we propose a deep learning-based approach utilizing an enhanced ResNet18 architecture with integrated attention mechanisms to improve classification accuracy. The system consists of data preprocessing, classification, optimization, and post-processing modules. The optimization strategy employs an adaptive learning rate combining cosine annealing and class-specific gradient adaptation, alongside targeted adjustments for class imbalance, while an improved Adam optimizer enhances convergence stability. Post-processing incorporates confidence filtering (threshold 0.3) and selective evaluation, with weighted aggregation-integrating dynamic accuracy calculation and moving average to refine predictions and enhance diagnostic reliability. 
Experimental results on a burn skin test dataset demonstrate that the proposed model achieves a classification accuracy of 99.19% ± 0.12 and a mean average precision (mAP) of 98.72% ± 0.10, highlighting its potential for real-time clinical burn assessment.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"639-654"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920881/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
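Two ingredients named in the abstract above, cosine annealing of the learning rate and confidence filtering at threshold 0.3, follow standard formulations. The sketch below is illustrative only; the function names and hyperparameter values are assumptions, not the paper's implementation, and the class-specific gradient adaptation is not reproduced.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Standard cosine annealing: decay smoothly from lr_max to lr_min."""
    cos = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cos)

def confidence_filter(top_probs, threshold=0.3):
    """Keep indices of predictions whose top-class probability clears the threshold."""
    return [i for i, p in enumerate(top_probs) if p >= threshold]
```

At step 0 the schedule returns `lr_max`, at the final step `lr_min`, and halfway through it returns their midpoint; low-confidence predictions (here, below 0.3) are held back for selective evaluation rather than reported directly.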
Pub Date: 2026-02-01. Epub Date: 2025-04-29. DOI: 10.1007/s10278-025-01515-5
Osman Güler
Breast ultrasound is a useful and rapid diagnostic tool for the early detection of breast cancer. Artificial intelligence-supported computer-aided decision systems, which assist expert radiologists and clinicians, provide reliable and rapid results. Deep learning methods and techniques are widely used in healthcare for early diagnosis, abnormality detection, and disease diagnosis. Therefore, in this study, a deep ensemble learning model based on the Dirichlet distribution, using pre-trained transfer learning models, is proposed for breast cancer classification from ultrasound images. Experiments were conducted using the Breast Ultrasound Images Dataset (BUSI). The dataset, which had an imbalanced class structure, was balanced using data augmentation techniques. DenseNet201, InceptionV3, VGG16, and ResNet152 models were used for transfer learning with fivefold cross-validation. Statistical analyses, including the ANOVA test and Tukey HSD test, were applied to evaluate the model's performance and ensure the reliability of the results. Additionally, Grad-CAM (Gradient-weighted Class Activation Mapping) was used for explainable AI (XAI), providing visual explanations of the deep learning model's decision-making process. The spaced repetition method, commonly used in educational science to improve learner retention, was adapted to artificial intelligence in this study: the results of training with transfer learning models were used as input for further training, applying spaced repetition to previously learned information. The use of the spaced repetition method improved model accuracy and reduced training time. The weights obtained from the trained models were input into an ensemble learning system based on the Dirichlet distribution with different variations. The proposed model achieved 99.60% validation accuracy on the dataset, demonstrating its effectiveness in breast cancer classification.
{"title":"A Dirichlet Distribution-Based Complex Ensemble Approach for Breast Cancer Classification from Ultrasound Images with Transfer Learning and Multiphase Spaced Repetition Method.","authors":"Osman Güler","doi":"10.1007/s10278-025-01515-5","DOIUrl":"10.1007/s10278-025-01515-5","url":null,"abstract":"<p><p>Breast ultrasound is a useful and rapid diagnostic tool for the early detection of breast cancer. Artificial intelligence-supported computer-aided decision systems, which assist expert radiologists and clinicians, provide reliable and rapid results. Deep learning methods and techniques are widely used in the field of health for early diagnosis, abnormality detection, and disease diagnosis. Therefore, in this study, a deep ensemble learning model based on Dirichlet distribution using pre-trained transfer learning models for breast cancer classification from ultrasound images is proposed. In the study, experiments were conducted using the Breast Ultrasound Images Dataset (BUSI). The dataset, which had an imbalanced class structure, was balanced using data augmentation techniques. DenseNet201, InceptionV3, VGG16, and ResNet152 models were used for transfer learning with fivefold cross-validation. Statistical analyses, including the ANOVA test and Tukey HSD test, were applied to evaluate the model's performance and ensure the reliability of the results. Additionally, Grad-CAM (Gradient-weighted Class Activation Mapping) was used for explainable AI (XAI), providing visual explanations of the deep learning model's decision-making process. The spaced repetition method, commonly used to improve the success of learners in educational sciences, was adapted to artificial intelligence in this study. The results of training with transfer learning models were used as input for further training, and spaced repetition was applied using previously learned information. The use of the spaced repetition method led to increased model success and reduced learning times. 
The weights obtained from the trained models were input into an ensemble learning system based on Dirichlet distribution with different variations. The proposed model achieved 99.60% validation accuracy on the dataset, demonstrating its effectiveness in breast cancer classification.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"202-228"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920884/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144057056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
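The core idea of a Dirichlet-based ensemble, as described above, can be sketched minimally: draw ensemble weights from a Dirichlet distribution and use them to average the per-model class probabilities. This is a generic illustration under assumed shapes, not the paper's specific variation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_ensemble(prob_matrices, alpha):
    """Average per-model class probabilities with Dirichlet-sampled weights.

    prob_matrices: array of shape (n_models, n_samples, n_classes).
    Returns the weighted combination (n_samples, n_classes) and the weights.
    """
    w = rng.dirichlet(alpha)  # non-negative weights that sum to 1
    return np.tensordot(w, prob_matrices, axes=1), w

# two toy models, one sample, three classes (e.g. benign/malignant/normal)
p = np.array([[[0.7, 0.2, 0.1]],
              [[0.5, 0.3, 0.2]]])
combined, w = dirichlet_ensemble(p, alpha=[1.0, 1.0])
```

Because the Dirichlet weights sum to 1 and each model's row is a probability distribution, the combined row is itself a valid probability distribution; sampling many weight vectors lets one explore different ensemble variations, as the abstract suggests.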
Pub Date: 2026-02-01. DOI: 10.1007/s10278-025-01476-9
C F Del Cerro, R C Gimenez, J Garcia-Blas, K Sosenko, J M Ortega, M Desco, M Abella
{"title":"Correction: Deep Learning-Based Estimation of Radiographic Position to Automatically Set Up the X-Ray Prime Factors.","authors":"C F Del Cerro, R C Gimenez, J Garcia-Blas, K Sosenko, J M Ortega, M Desco, M Abella","doi":"10.1007/s10278-025-01476-9","DOIUrl":"10.1007/s10278-025-01476-9","url":null,"abstract":"","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1040"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920969/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144060070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. Epub Date: 2025-04-11. DOI: 10.1007/s10278-025-01478-7
Tejas Sudharshan Mathai, Boah Kim, Oana M Stroie, Ronald M Summers
In current radiology practice, radiologists identify a finding in the current imaging exam, manually match it against the description from the prior exam report, and assess interval changes. Large Language Models (LLMs) can identify report findings, but their ability to track interval changes has not been tested. The goal of this study was to determine the utility of a privacy-preserving LLM for matching findings between two reports (prior and follow-up) and tracking interval changes in size. In this retrospective study, body MRI reports from NIH (internal) were collected. A two-stage framework was employed for matching findings and tracking interval changes. In Stage 1, the LLM took a sentence from the follow-up report and discovered a matched finding in the prior report. In Stage 2, the LLM predicted the interval change status (increase, decrease, or stable) of the matched findings. Seven LLMs were locally evaluated, and the best LLM was validated on an external non-contrast chest CT dataset. Agreement with the reference (radiologist) was measured using Cohen's kappa (κ). The internal body MRI dataset had 240 studies (120 patients; mean age, 47 ± 16 years; 65 men), and the external non-contrast chest CT dataset contained 134 studies (67 patients; mean age, 58 ± 18 years; 44 men). On the internal dataset, the TenyxChat-7B LLM performed best for matching findings, with an F1-score of 85.4% (95% CI: 80.8, 89.9), outperforming the other LLMs (p < 0.05). For interval change detection, the same LLM achieved a 62.7% F1-score and showed moderate agreement (κ = 0.46, 95% CI: 0.37, 0.55). On the external dataset, the same LLM attained F1-scores of 81.8% (95% CI: 74.4, 89.1) for matching findings and 77.4% for interval change detection, respectively, with substantial agreement (κ = 0.64, 95% CI: 0.49, 0.80). The TenyxChat-7B LLM used for matching longitudinal report findings and tracking interval changes showed moderate to substantial agreement with the reference standard. For structured reporting, the LLM can pre-fill the "Findings" section of the next follow-up exam report with a summary of longitudinal changes to important findings. It can also enhance communication between the referring physician and radiologist.
{"title":"Privacy-Preserving Large Language Model for Matching Findings and Tracking Interval Changes in Longitudinal Radiology Reports.","authors":"Tejas Sudharshan Mathai, Boah Kim, Oana M Stroie, Ronald M Summers","doi":"10.1007/s10278-025-01478-7","DOIUrl":"10.1007/s10278-025-01478-7","url":null,"abstract":"<p><p>In current radiology practice, radiologists identify a finding in the current imaging exam, manually match it against the description from the prior exam report and assess interval changes. Large Language Models (LLMs) can identify report findings, but their ability to track interval changes has not been tested. The goal of this study was to determine the utility of a privacy-preserving LLM for matching findings between two reports (prior and follow-up) and tracking interval changes in size. In this retrospective study, body MRI reports from NIH (internal) were collected. A two-stage framework was employed for matching findings and tracking interval changes. In Stage 1, the LLM took a sentence from the follow-up report and discovered a matched finding in the prior report. In Stage 2, the LLM predicted the interval change status (increase, decrease, or stable) of the matched findings. Seven LLMs were locally evaluated and the best LLM was validated on an external non-contrast chest CT dataset. Agreement with the reference (radiologist) was measured using Cohen's Kappa (κ). The internal body MRI dataset had 240 studies (120 patients, mean age, 47 ± 16 years; 65 men) and the external non-contrast chest CT dataset contained 134 studies (67 patients, mean age, 58 ± 18 years; 44 men). On the internal dataset, TenyxChat-7B LLM fared the best for matching findings with an F1-score of 85.4% (95% CI: 80.8, 89.9) over the other LLMs (p < 0.05). For interval change detection, the same LLM achieved a 62.7% F1-score and showed a moderate agreement (κ = 0.46, 95% CI: 0.37, 0.55). 
For the external dataset, the same LLM attained F1-scores of 81.8% (95% CI: 74.4, 89.1) for matching findings and 77.4% for interval change detection respectively, with a substantial agreement (κ = 0.64, 95% CI: 0.49, 0.80). The TenyxChat-7B LLM used for matching longitudinal report findings and tracking interval changes showed moderate to substantial agreement with the reference standard. For structured reporting, the LLM can pre-fill the \"Findings\" section of the next follow-up exam report with a summary of longitudinal changes to important findings. It can also enhance the communication between the referring physician and radiologist.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"1017-1030"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920831/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144061042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
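The agreement statistic reported above, Cohen's kappa, corrects raw agreement for agreement expected by chance. A minimal pure-Python sketch (illustrative, not the study's evaluation code) applied to toy interval-change labels:

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    n = len(ratings_a)
    labels = set(ratings_a) | set(ratings_b)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # chance agreement from each rater's marginal label frequencies
    expected = sum(
        (ratings_a.count(l) / n) * (ratings_b.count(l) / n) for l in labels
    )
    return (observed - expected) / (1 - expected)

# toy interval-change labels: hypothetical LLM output vs. radiologist reference
llm = ["increase", "stable", "stable", "decrease"]
ref = ["increase", "stable", "decrease", "decrease"]
```

By common convention, κ between 0.41 and 0.60 is read as moderate agreement and 0.61 to 0.80 as substantial, which is how the abstract characterizes its κ = 0.46 and κ = 0.64 results.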
Pub Date: 2026-02-01. Epub Date: 2025-05-16. DOI: 10.1007/s10278-025-01530-6
Mohammed M Kanani, Arezu Monawer, Lauryn Brown, William E King, Zachary D Miller, Nitin Venugopal, Patrick J Heagerty, Jeffrey G Jarvik, Trevor Cohen, Nathan M Cross
Extracting information from radiology reports can provide critical data to empower many radiology workflows. For spinal compression fractures, these data can facilitate evidence-based care for at-risk populations. Manual extraction from free-text reports is laborious and error-prone. Large language models (LLMs) have shown promise; however, fine-tuning strategies to optimize performance in specific tasks can be resource-intensive. A variety of prompting strategies have achieved similar results with fewer demands. Our study pioneers the use of Meta's Llama 3.1, together with prompt-based strategies, for automated extraction of compression fractures from free-text radiology reports, outputting structured data without model training. We tested performance on a time-based sample of CT exams covering the spine from 2/20/2024 to 2/22/2024 acquired across our healthcare enterprise (637 anonymized reports; age 18-102; 47% female). Ground truth annotations were manually generated and compared against the performance of three models (Llama 3.1 70B, Llama 3.1 8B, and Vicuna 13B) with nine different prompting configurations, for a total of 27 model/prompt experiments. The highest F1 score (0.91) was achieved by the 70B Llama 3.1 model when provided with a radiologist-written background, with similar results when the background was written by a separate LLM (0.86). The addition of few-shot examples to these prompts had variable impact on F1 measurements (0.89 and 0.84, respectively). Comparable ROC-AUC and PR-AUC performance was observed. Our work demonstrated that an open-weights LLM excelled at extracting compression fracture findings from free-text radiology reports using prompt-based techniques, without requiring extensive manually labeled examples for model training.
{"title":"High-Performance Prompting for LLM Extraction of Compression Fracture Findings from Radiology Reports.","authors":"Mohammed M Kanani, Arezu Monawer, Lauryn Brown, William E King, Zachary D Miller, Nitin Venugopal, Patrick J Heagerty, Jeffrey G Jarvik, Trevor Cohen, Nathan M Cross","doi":"10.1007/s10278-025-01530-6","DOIUrl":"10.1007/s10278-025-01530-6","url":null,"abstract":"<p><p>Extracting information from radiology reports can provide critical data to empower many radiology workflows. For spinal compression fractures, these data can facilitate evidence-based care for at-risk populations. Manual extraction from free-text reports is laborious, and error-prone. Large language models (LLMs) have shown promise; however, fine-tuning strategies to optimize performance in specific tasks can be resource intensive. A variety of prompting strategies have achieved similar results with fewer demands. Our study pioneers the use of Meta's Llama 3.1, together with prompt-based strategies, for automated extraction of compression fractures from free-text radiology reports, outputting structured data without model training. We tested performance on a time-based sample of CT exams covering the spine from 2/20/2024 to 2/22/2024 acquired across our healthcare enterprise (637 anonymized reports, age 18-102, 47% Female). Ground truth annotations were manually generated and compared against the performance of three models (Llama 3.1 70B, Llama 3.1 8B, and Vicuna 13B) with nine different prompting configurations for a total of 27 model/prompt experiments. The highest F1 score (0.91) was achieved by the 70B Llama 3.1 model when provided with a radiologist-written background, with similar results when the background was written by a separate LLM (0.86). The addition of few-shot examples to these prompts had variable impact on F1 measurements (0.89, 0.84 respectively). Comparable ROC-AUC and PR-AUC performance was observed. 
Our work demonstrated that an open-weights LLM excelled at extracting compression fractures findings from free-text radiology reports using prompt-based techniques without requiring extensive manually labeled examples for model training.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"973-988"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921121/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144087436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
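The F1 scores quoted above summarize extraction quality as the harmonic mean of precision and recall. A minimal set-based sketch (the finding strings and helper name are hypothetical, not from the study):

```python
def extraction_f1(true_findings, predicted_findings):
    """Set-based precision, recall, and F1 for extracted findings."""
    tp = len(true_findings & predicted_findings)   # correctly extracted
    fp = len(predicted_findings - true_findings)   # spurious extractions
    fn = len(true_findings - predicted_findings)   # missed findings
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    return 2 * precision * recall / (precision + recall), precision, recall

# toy report: one finding matched, one missed, one hallucinated
truth = {"L1 compression fracture", "T12 compression fracture"}
pred = {"L1 compression fracture", "L2 compression fracture"}
f1, p, r = extraction_f1(truth, pred)
```

Here one true positive, one false positive, and one false negative give precision, recall, and F1 all equal to 0.5, illustrating how both spurious and missed findings pull the score down.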
Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.
{"title":"Development of a No-Reference CT Image Quality Assessment Method Using RadImageNet Pre-trained Deep Learning Models.","authors":"Kohei Ohashi, Yukihiro Nagatani, Asumi Yamazaki, Makoto Yoshigoe, Kyohei Iwai, Ryo Uemura, Masayuki Shimomura, Kenta Tanimura, Takayuki Ishida","doi":"10.1007/s10278-025-01542-2","DOIUrl":"10.1007/s10278-025-01542-2","url":null,"abstract":"<p><p>Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. 
These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"46-58"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921086/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144164187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
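The correlation coefficients reported above (0.910 and 0.831) measure the linear relationship between subjective and predicted quality scores. A minimal pure-Python Pearson correlation sketch with illustrative toy scores (not the study's data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# toy paired scores: observer ratings vs. hypothetical model predictions
subjective = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.1, 4.2, 4.8]
```

A coefficient near 1 indicates the model's ranking of image quality closely tracks the observers'; values around 0.83 to 0.91, as in the abstract, indicate strong but imperfect agreement.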
Pub Date: 2026-02-01. Epub Date: 2025-04-01. DOI: 10.1007/s10278-025-01492-9
Jinglan Guo, Jue Liao, Yuanlian Chen, Lisha Wen, Song Cheng
Microarray technology has become a vital tool in cardiovascular research, enabling the simultaneous analysis of thousands of gene expressions. This capability provides a robust foundation for heart disease classification and biomarker discovery. However, the high dimensionality, noise, and sparsity of microarray data present significant challenges for effective analysis. Gene selection, which aims to identify the most relevant subset of genes, is a crucial preprocessing step for improving classification accuracy, reducing computational complexity, and enhancing biological interpretability. Traditional gene selection methods often fall short in capturing complex, nonlinear interactions among genes, limiting their effectiveness in heart disease classification tasks. In this study, we propose a novel framework that leverages deep neural networks (DNNs) for optimizing gene selection and heart disease classification using microarray data. DNNs, known for their ability to model complex, nonlinear patterns, are integrated with feature selection techniques to address the challenges of high-dimensional data. The proposed method, DeepGeneNet (DGN), combines gene selection and DNN-based classification into a unified framework, ensuring robust performance and meaningful insights into the underlying biological mechanisms. Additionally, the framework incorporates hyperparameter optimization and innovative U-Net segmentation techniques to further enhance computational performance and classification accuracy. These optimizations enable DGN to deliver robust and scalable results, outperforming traditional methods in both predictive accuracy and interpretability. Experimental results demonstrate that the proposed approach significantly improves heart disease classification accuracy compared to other methods. By focusing on the interplay between gene selection and deep learning, this work advances the field of cardiovascular genomics, providing a scalable and interpretable framework for future applications.
{"title":"New Machine Learning Method for Medical Image and Microarray Data Analysis for Heart Disease Classification.","authors":"Jinglan Guo, Jue Liao, Yuanlian Chen, Lisha Wen, Song Cheng","doi":"10.1007/s10278-025-01492-9","DOIUrl":"10.1007/s10278-025-01492-9","url":null,"abstract":"<p><p>Microarray technology has become a vital tool in cardiovascular research, enabling the simultaneous analysis of thousands of gene expressions. This capability provides a robust foundation for heart disease classification and biomarker discovery. However, the high dimensionality, noise, and sparsity of microarray data present significant challenges for effective analysis. Gene selection, which aims to identify the most relevant subset of genes, is a crucial preprocessing step for improving classification accuracy, reducing computational complexity, and enhancing biological interpretability. Traditional gene selection methods often fall short in capturing complex, nonlinear interactions among genes, limiting their effectiveness in heart disease classification tasks. In this study, we propose a novel framework that leverages deep neural networks (DNNs) for optimizing gene selection and heart disease classification using microarray data. DNNs, known for their ability to model complex, nonlinear patterns, are integrated with feature selection techniques to address the challenges of high-dimensional data. The proposed method, DeepGeneNet (DGN), combines gene selection and DNN-based classification into a unified framework, ensuring robust performance and meaningful insights into the underlying biological mechanisms. Additionally, the framework incorporates hyperparameter optimization and innovative U-Net segmentation techniques to further enhance computational performance and classification accuracy. These optimizations enable DGN to deliver robust and scalable results, outperforming traditional methods in both predictive accuracy and interpretability. 
Experimental results demonstrate that the proposed approach significantly improves heart disease classification accuracy compared to other methods. By focusing on the interplay between gene selection and deep learning, this work advances the field of cardiovascular genomics, providing a scalable and interpretable framework for future applications.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"884-907"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12921063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143766305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
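The two-stage design the DGN abstract describes (gene selection, then a neural-network classifier) can be sketched generically. DeepGeneNet itself is not public, so the selector (mutual information, top-k genes), the small MLP, and the synthetic "microarray" data below are illustrative stand-ins, not the paper's method:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_patients, n_genes = 200, 2000          # high-dimensional, few samples
X = rng.normal(size=(n_patients, n_genes))
y = rng.integers(0, 2, size=n_patients)  # 0 = healthy, 1 = heart disease
X[y == 1, :20] += 1.0                    # plant signal in 20 "relevant" genes

# Stage 1: reduce 2000 candidate genes to the 50 most informative ones.
selector = SelectKBest(mutual_info_classif, k=50)
X_sel = selector.fit_transform(X, y)

# Stage 2: train a small fully connected network on the selected genes.
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"selected genes: {X_sel.shape[1]}, test accuracy: {acc:.2f}")
```

The point of the split is the one the abstract makes: selecting genes first shrinks a 2000-dimensional, noisy input to a subset a network can fit without overfitting the small cohort.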
Pub Date: 2026-02-01 | Epub Date: 2025-04-08 | DOI: 10.1007/s10278-025-01493-8
Jamie Chow, Ryan Lee, Honghan Wu
Artificial intelligence (AI) in radiology is becoming increasingly prevalent; however, there is no clear picture of how AI is monitored today, or of how monitoring should practically be done given the inherent risk of AI model performance degradation over time. This research investigates current practices and the difficulties radiologists face in monitoring AI. Semi-structured virtual interviews were conducted with 6 US-based and 10 Europe-based radiologists. The interviews were automatically transcribed and underwent thematic analysis. The findings suggest that AI monitoring in radiology is still relatively nascent, as most of the AI projects had not yet progressed into a fully live clinical deployment. The most common method of monitoring involved a manual process of retrospectively comparing the AI results against the radiology report. Automated and statistical methods of monitoring were much less common. The biggest challenges are a lack of resources to support AI monitoring and uncertainty about how to create a robust and scalable process for monitoring the breadth and variety of radiology AI applications available. There is currently a lack of practical guidelines on how to monitor AI, which has led to a variety of approaches being proposed by both healthcare providers and vendors. An ensemble of mixed methods is recommended to monitor AI across multiple domains and metrics. This will be enabled by appropriate allocation of resources and the formation of robust and diverse multidisciplinary AI governance groups.
{"title":"How Do Radiologists Currently Monitor AI in Radiology and What Challenges Do They Face? An Interview Study and Qualitative Analysis.","authors":"Jamie Chow, Ryan Lee, Honghan Wu","doi":"10.1007/s10278-025-01493-8","DOIUrl":"10.1007/s10278-025-01493-8","url":null,"abstract":"<p><p>Artificial intelligence (AI) in radiology is becoming increasingly prevalent; however, there is not a clear picture of how AI is being monitored today and how this should practically be done given the inherent risk of AI model performance degradation over time. This research investigates current practices and what difficulties radiologists face in monitoring AI. Semi-structured virtual interviews were conducted with 6 USA and 10 Europe-based radiologists. The interviews were automatically transcribed and underwent thematic analysis. The findings suggest that AI monitoring in radiology is still relatively nascent as most of the AI projects had not yet progressed into a fully live clinical deployment. The most common method of monitoring involved a manual process of retrospectively comparing the AI results against the radiology report. Automated and statistical methods of monitoring were much less common. The biggest challenges are a lack of resources to support AI monitoring and uncertainty about how to create a robust and scalable process of monitoring the breadth and variety of radiology AI applications available. There is currently a lack of practical guidelines on how to monitor AI which has led to a variety of approaches being proposed from both healthcare providers and vendors. An ensemble of mixed methods is recommended to monitor AI across multiple domains and metrics. 
This will be enabled by appropriate allocation of resources and the formation of robust and diverse multidisciplinary AI governance groups.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"6-19"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920929/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
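The "manual" monitoring the interviewees describe, retrospectively comparing AI output against the final radiology report, can be partly automated as a rolling agreement rate with an alert threshold. This is a generic sketch, not a tool from the study; the field names and the 0.85 threshold are illustrative assumptions:

```python
from collections import deque

def rolling_agreement(cases, window=50, alert_below=0.85):
    """Yield (case_index, agreement_rate, alert) over a sliding window.

    Each case is (ai_finding, report_finding); agreement is exact match.
    """
    recent = deque(maxlen=window)
    for i, (ai, report) in enumerate(cases):
        recent.append(ai == report)
        rate = sum(recent) / len(recent)
        yield i, rate, rate < alert_below

# Simulated feed: the model agrees on the first 45 cases, then degrades.
cases = [("nodule", "nodule")] * 45 + [("nodule", "normal")] * 15
last = list(rolling_agreement(cases))[-1]
print(last)  # agreement over the last 50 cases, with a drift alert flag
```

A windowed statistic like this catches gradual performance degradation that case-by-case review misses, while still grounding the metric in the report comparison radiologists already perform.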
Compared to non-functional pituitary neuroendocrine tumors (NF-PitNETs), posterior pituitary tumors (PPTs) require more intraoperative protection of the pituitary stalk and hypothalamus, and their perioperative management is more complex than that of NF-PitNETs. However, the two are difficult to distinguish on magnetic resonance imaging (MRI) before surgery. Based on clinical features and radiomic signatures extracted from MRI, this study aims to establish a model for distinguishing NF-PitNETs and PPTs. Preoperative MRI of 110 patients with NF-PitNETs and 55 patients with PPTs were retrospectively obtained. Patients were randomly assigned to the training (n = 110) and validation (n = 55) cohorts in a 2:1 ratio. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to develop a radiomic signature. Afterwards, an individualized predictive model (nomogram) incorporating the radiomic signature and predictive clinical features was developed. The nomogram's performance was evaluated by calibration and decision curve analyses. Five features derived from contrast-enhanced images were selected using the LASSO algorithm. From the selected features, a formula for the radiomic score was derived. The constructed nomogram incorporating the radiomic signature and predictive clinical features showed good calibration and outperformed the clinical features for predicting NF-PitNETs and PPTs (area under the curve [AUC]: 0.937 vs. 0.595 in training cohort [p < 0.001]; 0.907 vs. 0.782 in validation cohort [p = 0.03]). The decision curve shows that the individualized predictive model adds more benefit than clinical features alone when the threshold probability ranges from 10 to 100%. The individualized predictive model provides a novel noninvasive imaging biomarker and can be conveniently used to distinguish NF-PitNETs and PPTs, offering a useful reference for preoperative planning and intraoperative decision-making.
{"title":"Preoperative Prediction of Non-functional Pituitary Neuroendocrine Tumors and Posterior Pituitary Tumors Based on MRI Radiomic Features.","authors":"Shucheng Jin, Qin Xu, Chen Sun, Yuan Zhang, Yangyang Wang, Xi Wang, Xiudong Guan, Deling Li, Yiming Li, Chuanbao Zhang, Wang Jia","doi":"10.1007/s10278-025-01400-1","DOIUrl":"10.1007/s10278-025-01400-1","url":null,"abstract":"<p><p>Compared to non-functional pituitary neuroendocrine tumors (NF-PitNETs), posterior pituitary tumors (PPTs) require more intraoperative protection of the pituitary stalk and hypothalamus, and their perioperative management is more complex than that of NF-PitNETs. However, the two are difficult to distinguish on magnetic resonance imaging (MRI) before surgery. Based on clinical features and radiomic signatures extracted from MRI, this study aims to establish a model for distinguishing NF-PitNETs and PPTs. Preoperative MRI of 110 patients with NF-PitNETs and 55 patients with PPTs were retrospectively obtained. Patients were randomly assigned to the training (n = 110) and validation (n = 55) cohorts in a 2:1 ratio. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to develop a radiomic signature. Afterwards, an individualized predictive model (nomogram) incorporating the radiomic signature and predictive clinical features was developed. The nomogram's performance was evaluated by calibration and decision curve analyses. Five features derived from contrast-enhanced images were selected using the LASSO algorithm. From the selected features, a formula for the radiomic score was derived. The constructed nomogram incorporating the radiomic signature and predictive clinical features showed good calibration and outperformed the clinical features for predicting NF-PitNETs and PPTs (area under the curve [AUC]: 0.937 vs. 0.595 in training cohort [p < 0.001]; 0.907 vs. 0.782 in validation cohort [p = 0.03]). 
The decision curve shows that the individualized predictive model adds more benefit than clinical features alone when the threshold probability ranges from 10 to 100%. The individualized predictive model provides a novel noninvasive imaging biomarker and can be conveniently used to distinguish NF-PitNETs and PPTs, offering a useful reference for preoperative planning and intraoperative decision-making.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":"115-126"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12920986/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144056627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
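The LASSO step in the abstract (an L1-penalised fit on standardised radiomic features, keeping only features with non-zero coefficients, whose linear combination becomes the radiomic score) can be sketched generically. The cohort, the 100 candidate features, and the regularisation strength below are synthetic placeholders; the paper's actual features and coefficients are not reproduced here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n, p = 165, 100                      # 165 patients, 100 candidate features
X = rng.normal(size=(n, p))
# Synthetic labels driven by two "true" features (0 = NF-PitNET, 1 = PPT).
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# C is the inverse regularisation strength; smaller C keeps fewer features.
lasso = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
lasso.fit(X, y)
coefs = lasso.named_steps["logisticregression"].coef_.ravel()
kept = np.flatnonzero(coefs)         # features surviving the L1 penalty

# Radiomic score = linear combination of the selected, standardised features.
scores = lasso.decision_function(X)
print(f"features kept: {len(kept)}; example score: {scores[0]:.3f}")
```

The resulting per-patient score is the quantity a nomogram then combines with clinical features; the L1 penalty is what collapses a large candidate set down to a handful of features, as the paper reports with its five contrast-enhanced features.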