Pedro Teixeira Castro, Ana Paula Pinho Matos, Gerson Ribeiro, Marcio Silva, Jorge Lopes, Edward Araujo Júnior, Heron Werner
{"title":"Time-of-Flight MRI Transition From 2D to 3D Fused Sequences: Noninvasive Technique for Angiographically Evaluating Pelvic Arteries in Placenta Accreta Spectrum Cases.","authors":"Pedro Teixeira Castro, Ana Paula Pinho Matos, Gerson Ribeiro, Marcio Silva, Jorge Lopes, Edward Araujo Júnior, Heron Werner","doi":"10.3348/kjr.2025.0327","DOIUrl":"https://doi.org/10.3348/kjr.2025.0327","url":null,"abstract":"","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"893-895"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394818/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cardiac sarcoidosis (CS) poses significant diagnostic and therapeutic challenges due to its heterogeneous clinical manifestations and the limitations of conventional diagnostic approaches. Advances in imaging modalities, particularly cardiac magnetic resonance imaging (CMR) and ¹⁸F-fluorodeoxyglucose positron emission tomography (FDG-PET), have revolutionized the evaluation and management of this complex condition. CMR, with its superior spatial resolution and advanced techniques such as late gadolinium enhancement, T1/T2 mapping, and extracellular volume quantification, offers unparalleled insights into myocardial structure and fibrosis. These techniques not only enhance diagnostic accuracy but also provide critical information on disease activity and treatment response. Among these, T2 mapping has emerged as a valuable marker for active inflammation, with high values reliably indicating acute disease states. FDG-PET serves as a complementary modality by detecting active granulomatous inflammation and guiding immunosuppressive therapy. The synergistic integration of CMR and FDG-PET provides a comprehensive approach to diagnosing and monitoring CS, enabling the identification of subclinical disease and the optimization of therapeutic strategies. Furthermore, the incorporation of quantitative biomarkers, such as strain metrics and T2 values, promises to refine disease assessment and management. These advancements have the potential to transform the paradigm of CS care, ultimately improving patient outcomes.
{"title":"Updates on Cardiac MRI and PET Imaging for the Diagnosis and Monitoring of Cardiac Sarcoidosis.","authors":"Noriko Oyama-Manabe, Osamu Manabe, Tadao Aikawa, Yoshitaka Sobue, Ryosuke Asakura","doi":"10.3348/kjr.2025.0148","DOIUrl":"https://doi.org/10.3348/kjr.2025.0148","url":null,"abstract":"<p><p>Cardiac sarcoidosis (CS) poses significant diagnostic and therapeutic challenges due to its heterogeneous clinical manifestations and the limitations of conventional diagnostic approaches. Advances in imaging modalities, particularly cardiac magnetic resonance imaging (CMR) and ¹⁸F-fluorodeoxyglucose positron emission tomography (FDG-PET), have revolutionized the evaluation and management of this complex condition. CMR, with its superior spatial resolution and advanced techniques such as late gadolinium enhancement, T1/T2 mapping, and extracellular volume quantification, offers unparalleled insights into myocardial structure and fibrosis. These techniques not only enhance diagnostic accuracy but also provide critical information on disease activity and treatment response. Among these, T2 mapping has emerged as a valuable marker for active inflammation, with high values reliably indicating acute disease states. FDG-PET serves as a complementary modality by detecting active granulomatous inflammation and guiding immunosuppressive therapy. The synergistic integration of CMR and FDG-PET provides a comprehensive approach to diagnosing and monitoring CS, enabling the identification of subclinical disease and the optimization of therapeutic strategies. Furthermore, the incorporation of quantitative biomarkers, such as strain metrics and T2 values, promises to refine disease assessment and management. 
These advancements have the potential to transform the paradigm of CS care, ultimately improving patient outcomes.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"804-816"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394825/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dabin Min, Kwang Nam Jin, SangHeum Bang, Moon Young Kim, Hack-Lyoung Kim, Won Gi Jeong, Hye-Jeong Lee, Kyongmin Sarah Beck, Sung Ho Hwang, Eun Young Kim, Chang Min Park
Objective: To evaluate the accuracy of large language models (LLMs) in extracting Coronary Artery Disease-Reporting and Data System (CAD-RADS) 2.0 components from coronary CT angiography (CCTA) reports, and assess the impact of prompting strategies.
Materials and methods: In this multi-institutional study, we collected 319 synthetic, semi-structured CCTA reports from six institutions to protect patient privacy while maintaining clinical relevance. The dataset included 150 reports from a primary institution (100 for instruction development and 50 for internal testing) and 169 reports from five external institutions for external testing. Board-certified radiologists established reference standards following the CAD-RADS 2.0 guidelines for all three components: stenosis severity, plaque burden, and modifiers. Six LLMs (GPT-4, GPT-4o, Claude-3.5-Sonnet, o1-mini, Gemini-1.5-Pro, and DeepSeek-R1-Distill-Qwen-14B) were evaluated using an optimized instruction with prompting strategies, including zero-shot or few-shot with or without chain-of-thought (CoT) prompting. The accuracy was assessed and compared using McNemar's test.
Results: LLMs demonstrated robust accuracy across all CAD-RADS 2.0 components. Peak stenosis severity accuracies reached 0.980 (48/49, Claude-3.5-Sonnet and o1-mini) in internal testing and 0.946 (158/167, GPT-4o and o1-mini) in external testing. Plaque burden extraction showed exceptional accuracy, with multiple models achieving perfect accuracy (43/43) in internal testing and 0.993 (137/138, GPT-4o and o1-mini) in external testing. Modifier detection demonstrated consistently high accuracy (≥0.990) across most models. One open-source model, DeepSeek-R1-Distill-Qwen-14B, showed a relatively low accuracy for stenosis severity: 0.898 (44/49, internal) and 0.820 (137/167, external). CoT prompting significantly enhanced the accuracy of several models, with GPT-4 showing the most substantial improvements: stenosis severity accuracy increased by 0.192 (P < 0.001) and plaque burden accuracy by 0.152 (P < 0.001) in external testing.
Conclusion: LLMs demonstrated high accuracy in automated extraction of CAD-RADS 2.0 components from semi-structured CCTA reports, particularly when used with CoT prompting.
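The abstract states that accuracies were compared with McNemar's test but does not say which variant was used. As a minimal sketch, assuming the exact (binomial) form of the test and entirely hypothetical discordant-pair counts, a paired comparison of two prompting strategies on the same reports might look like this:

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: reports that strategy A got right and strategy B got wrong
    c: reports that strategy A got wrong and strategy B got right
    Concordant pairs (both right or both wrong) do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of k or fewer "successes" under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, NOT taken from the study:
# 2 reports correct only without CoT, 12 correct only with CoT.
p_value = mcnemar_exact(2, 12)
print(f"exact McNemar p = {p_value:.4f}")
```

Only the discordant pairs matter here, which is why a paired test suits a design where every model or prompt is scored on the same set of reports.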
{"title":"Large Language Models for CAD-RADS 2.0 Extraction From Semi-Structured Coronary CT Angiography Reports: A Multi-Institutional Study.","authors":"Dabin Min, Kwang Nam Jin, SangHeum Bang, Moon Young Kim, Hack-Lyoung Kim, Won Gi Jeong, Hye-Jeong Lee, Kyongmin Sarah Beck, Sung Ho Hwang, Eun Young Kim, Chang Min Park","doi":"10.3348/kjr.2025.0293","DOIUrl":"10.3348/kjr.2025.0293","url":null,"abstract":"<p><strong>Objective: </strong>To evaluate the accuracy of large language models (LLMs) in extracting Coronary Artery Disease-Reporting and Data System (CAD-RADS) 2.0 components from coronary CT angiography (CCTA) reports, and assess the impact of prompting strategies.</p><p><strong>Materials and methods: </strong>In this multi-institutional study, we collected 319 synthetic, semi-structured CCTA reports from six institutions to protect patient privacy while maintaining clinical relevance. The dataset included 150 reports from a primary institution (100 for instruction development and 50 for internal testing) and 169 reports from five external institutions for external testing. Board-certified radiologists established reference standards following the CAD-RADS 2.0 guidelines for all three components: stenosis severity, plaque burden, and modifiers. Six LLMs (GPT-4, GPT-4o, Claude-3.5-Sonnet, o1-mini, Gemini-1.5-Pro, and DeepSeek-R1-Distill-Qwen-14B) were evaluated using an optimized instruction with prompting strategies, including zero-shot or few-shot with or without chain-of-thought (CoT) prompting. The accuracy was assessed and compared using McNemar's test.</p><p><strong>Results: </strong>LLMs demonstrated robust accuracy across all CAD-RADS 2.0 components. Peak stenosis severity accuracies reached 0.980 (48/49, Claude-3.5-Sonnet and o1-mini) in internal testing and 0.946 (158/167, GPT-4o and o1-mini) in external testing. 
Plaque burden extraction showed exceptional accuracy, with multiple models achieving perfect accuracy (43/43) in internal testing and 0.993 (137/138, GPT-4o, and o1-mini) in external testing. Modifier detection demonstrated consistently high accuracy (≥0.990) across most models. One open-source model, DeepSeek-R1-Distill-Qwen-14B, showed a relatively low accuracy for stenosis severity: 0.898 (44/49, internal) and 0.820 (137/167, external). CoT prompting significantly enhanced the accuracy of several models, with GPT-4 showing the most substantial improvements: stenosis severity accuracy increased by 0.192 (<i>P</i> < 0.001) and plaque burden accuracy by 0.152 (<i>P</i> < 0.001) in external testing.</p><p><strong>Conclusion: </strong>LLMs demonstrated high accuracy in automated extraction of CAD-RADS 2.0 components from semi-structured CCTA reports, particularly when used with CoT prompting.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"817-831"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394816/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chanyoung Rhee, Ki Jeong Hong, Ki Hong Kim, Jin Mo Goo, Eui Jin Hwang
Objective: In this study, we investigated whether artificial intelligence (AI) analysis of chest radiographs (CXRs) can predict major adverse clinical events in patients visiting the emergency department (ED) with acute cardiopulmonary symptoms.
Materials and methods: This secondary analysis of a previous clinical trial included patients who visited the ED with symptoms suggestive of acute cardiopulmonary disease and underwent chest radiography between June 2020 and December 2021. All patients underwent triage upon arrival at the ED according to the Korean Triage and Acuity Scale (KTAS). The CXRs were retrospectively analyzed using a commercial AI (Lunit INSIGHT CXR, version 3.1.4.1) capable of detecting seven abnormalities on a single frontal CXR. The predictive performance of the AI analysis for major adverse cardiopulmonary events (any of hospitalization, ED revisit, or death in the ED due to acute cardiopulmonary disease) was compared with that of the KTAS using the area under the receiver operating characteristic curve (AUC). Multivariable (the AI analysis result and KTAS level) logistic regression analysis was conducted to investigate whether the AI analysis result was an independent predictor of the events and whether the combination of the AI analysis and KTAS had additional merit.
Results: Among 3576 patients (1966 males; mean age, 64 years), 1148 (32.1%) experienced major adverse cardiopulmonary events. AI analysis of CXRs outperformed the KTAS in predicting these events (AUC, 0.795 vs. 0.610; P < 0.001). The AI analysis result was an independent predictor of these events after adjusting for the KTAS level (adjusted odds ratios of 1.032 and 6.913 for every 1% increase and ≥15%, respectively, in the AI probability score; P < 0.001). The combination of the AI analysis and KTAS showed an AUC that was higher than that of the KTAS alone (0.799; P < 0.001) and on par with that of the AI analysis only (P = 0.187).
Conclusion: AI analysis of CXRs showed greater accuracy than the KTAS did in predicting major adverse cardiopulmonary events in patients visiting the ED with acute cardiopulmonary symptoms. AI analysis may enhance the efficacy of patient triage in the ED.
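The comparison above rests on the AUC, which can be read as the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The sketch below computes it by direct pair counting on invented toy data. The scores, triage levels, and outcomes are hypothetical, and negating the KTAS level so that a more urgent level yields a higher score is an assumption of this example, not a detail given in the abstract.

```python
def auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for eight patients (1 = adverse event occurred).
events = [1, 1, 1, 0, 0, 0, 0, 1]
ai_probability = [0.9, 0.6, 0.4, 0.3, 0.5, 0.1, 0.2, 0.8]
ktas_level = [2, 3, 3, 3, 4, 4, 2, 3]  # lower level = more urgent

auc_ai = auc(ai_probability, events)
# Assumption: negate the level so that higher means more urgent.
auc_ktas = auc([-level for level in ktas_level], events)
print(f"AI AUC = {auc_ai:.4f}, KTAS AUC = {auc_ktas:.4f}")
```

A coarse five-level scale such as a triage level produces many ties, which is one reason its AUC can trail that of a continuous probability score.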
{"title":"Artificial Intelligence Analysis of Chest Radiographs for Predicting Major Adverse Events in Patients Visiting the Emergency Department With Acute Cardiopulmonary Symptoms.","authors":"Chanyoung Rhee, Ki Jeong Hong, Ki Hong Kim, Jin Mo Goo, Eui Jin Hwang","doi":"10.3348/kjr.2025.0237","DOIUrl":"https://doi.org/10.3348/kjr.2025.0237","url":null,"abstract":"<p><strong>Objective: </strong>In this study, we investigated whether artificial intelligence (AI) analysis of chest radiographs (CXRs) can predict major adverse clinical events in patients visiting the emergency department (ED) with acute cardiopulmonary symptoms.</p><p><strong>Materials and methods: </strong>This secondary analysis of a previous clinical trial included patients who visited the ED with symptoms suggestive of acute cardiopulmonary disease and underwent chest radiography between June 2020 and December 2021. All patients underwent triage upon arrival at ED according to the Korean Triage and Acuity Scale (KTAS). The CXRs were retrospectively analyzed using a commercial AI (Lunit INSIGHT CXR, version 3.1.4.1) capable of detecting seven abnormalities on a single frontal CXR. The predictive performance of the AI analysis for major adverse cardiopulmonary events (any among hospitalization, ED revisits, and death in the ED due to acute cardiopulmonary disease) was compared with that of the KTAS using the area under the receiver operating characteristic curve (AUC). Multivariable (the AI analysis result and KTAS level) logistic regression analysis was conducted to investigate whether the AI analysis result was an independent predictor of the events and whether the combination of the AI analysis and KTAS has additional merit.</p><p><strong>Results: </strong>Among 3576 patients (1966 males; mean age, 64 years), 1148 (32.1%) experienced major adverse cardiopulmonary events. AI analysis of CXRs outperformed the KTAS in predicting these events (AUC, 0.795 vs. 0.610; <i>P</i> < 0.001). 
The AI analysis result was an independent predictor of these events after adjusting for the KTAS level (adjusted odd ratios of 1.032 and 6.913 for every 1% increase and ≥15%, respectively, in the AI probability score; <i>P</i> < 0.001). The combination of the AI analysis and KTAS showed an AUC that was higher than that of the KTAS alone (0.799; <i>P</i> < 0.001) and in-par with that of the AI analysis only (<i>P</i> = 0.187).</p><p><strong>Conclusion: </strong>AI analysis of CXRs showed greater accuracy than the KTAS did in predicting major adverse cardiopulmonary events in patients visiting the ED with acute cardiopulmonary symptoms. AI analysis may enhance the efficacy of patient triage in the ED.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"877-887"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394822/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yeo Eun Han, Deuk Jae Sung, Hyun Yee Cho, Kyung Sook Yang, Jae Wook Park, Ki Choon Sim, Na Yeon Han, Beom Jin Park, Min Ju Kim
Objective: Plasmacytoid urothelial carcinoma (PUC) is a rare aggressive bladder cancer subtype with limited imaging data owing to its low incidence. This study aimed to report the characteristic features of PUC on multiparametric MRI (mpMRI).
Materials and methods: We retrospectively analyzed 13 patients with histologically confirmed PUC who underwent preoperative mpMRI between January 2019 and August 2024. Two blinded radiologists independently assessed tumor size, morphology, signal intensity, apparent diffusion coefficient (ADC) values, dynamic contrast enhancement patterns, contrast enhancement features, and invasive characteristics. Vesical imaging-reporting and data system (VI-RADS) scores were recorded. Interobserver agreement was evaluated using the kappa statistic.
Results: PUC predominantly exhibited diffuse (6/13, 46.2%) or localized (5/13, 38.5%) bladder wall thickening. Diffuse thickening was often associated with a linitis plastica-like appearance. On high b-value diffusion-weighted imaging (DWI), eight cases for one reader and seven for the other (61.5% and 53.8%, respectively) showed mild hyperintensity or isointensity, with a mean ADC value of 1.1 × 10⁻³ mm²/s. Dynamic contrast-enhanced MRI revealed progressive and prolonged enhancement in 10 cases (76.9%). VI-RADS scores ≥ 4 were observed in 11 cases (84.6%). Histopathological analysis showed that tumors with progressive and prolonged enhancement contained myxoid stroma and some fibrous tissue. Interobserver agreement was excellent for most imaging features, except for good agreement on DWI signal intensity.
Conclusion: PUC demonstrates notable mpMRI features, including localized or diffuse wall thickening (often with a linitis plastica-like appearance), muscle-invasive and advanced disease, progressive and prolonged enhancement patterns, and mild hyperintensity or isointensity on high b-value DWI. These features, which are potentially linked to the myxoid stromal composition of the tumor, suggest that mpMRI may serve as a noninvasive diagnostic tool for this aggressive malignancy. However, further studies with larger cohorts are required to confirm these findings.
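Interobserver agreement in this study was evaluated with the kappa statistic. The abstract does not name the variant, so the following is a minimal sketch assuming unweighted Cohen's kappa for two readers, with invented ratings for ten hypothetical cases rather than the study's data.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty case list")
    n = len(reader_a)
    # Observed proportion of cases on which the readers agree
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's category frequencies
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        return 1.0  # both readers used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical DWI signal ratings for ten cases (not the study's data).
reader_1 = ["mild"] * 6 + ["marked"] * 4
reader_2 = ["mild"] * 5 + ["marked"] * 5

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

Kappa discounts the agreement that two readers would reach by chance alone, so it is lower than the raw agreement rate (here nine of ten cases) whenever the categories are unevenly used.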
{"title":"Multiparametric MRI Features of Plasmacytoid Urothelial Carcinoma of the Urinary Bladder.","authors":"Yeo Eun Han, Deuk Jae Sung, Hyun Yee Cho, Kyung Sook Yang, Jae Wook Park, Ki Choon Sim, Na Yeon Han, Beom Jin Park, Min Ju Kim","doi":"10.3348/kjr.2025.0419","DOIUrl":"https://doi.org/10.3348/kjr.2025.0419","url":null,"abstract":"<p><strong>Objective: </strong>Plasmacytoid urothelial carcinoma (PUC) is a rare aggressive bladder cancer subtype with limited imaging data owing to its low incidence. This study aimed to report the characteristic features of PUC on multiparametric MRI (mpMRI).</p><p><strong>Materials and methods: </strong>We retrospectively analyzed 13 patients with histologically confirmed PUC who underwent preoperative mpMRI between January 2019 and August 2024. Two blinded radiologists independently assessed tumor size, morphology, signal intensity, apparent diffusion coefficient (ADC) values, dynamic contrast enhancement patterns, contrast enhancement features, and invasive characteristics. Vesical imaging-reporting and data system (VI-RADS) scores were recorded. Interobserver agreement was evaluated using the kappa statistic.</p><p><strong>Results: </strong>PUC predominantly exhibited diffuse (6/13, 46.2%) or localized (5/13, 38.5%) bladder wall thickening. Diffuse thickening was often associated with a linitis plastica-like appearance. On high b-value diffusion-weighted imaging (DWI), eight and seven cases depending on readers (61.5% and 53.8%, respectively) showed mild hyperintensity or isointensity, with a mean ADC value of 1.1 × 10⁻³ mm²/s. Dynamic contrast-enhanced MRI revealed progressive and prolonged enhancement in 10 cases (76.9%). VI-RADS scores ≥ 4 were observed in 11 cases (84.6%). Histopathological analysis showed that tumors with progressive and prolonged enhancement contained myxoid stroma and some fibrous tissue. 
Interobserver agreement was excellent for most imaging features, except for good agreement on DWI signal intensity.</p><p><strong>Conclusion: </strong>PUC demonstrates notable mpMRI features, including localized or diffuse wall thickening (often with a linitis plastica-like appearance), muscle-invasive and advanced disease, progressive and prolonged enhancement patterns, and mild hyperintensity or isointensity on high b-value DWI. These features, which are potentially linked to the myxoid stromal composition of the tumor, suggest that mpMRI may serve as a noninvasive diagnostic tool for this aggressive malignancy. However, further studies with larger cohorts are required to confirm these findings.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"832-840"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394820/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jun Sung Park, Jisun Hwang, Pyeong Hwa Kim, Woo Hyun Shim, Min Jeong Seo, Dahyun Kim, Jeong In Shin, In Hwa Kim, Hwon Heo, Chong Hyun Suh
Objective: To evaluate the accuracy of multimodal large language models (LLMs) in detecting cases requiring immediate radiology reporting in pediatric radiology.
Materials and methods: Seventy-one publicly available, paraphrased pediatric clinical vignettes with images (sourced from the New England Journal of Medicine, The Lancet, Archives of Pediatrics & Adolescent Medicine, and Radiology) were assessed by seven vision-capable LLMs (temperature levels 0 and 1; t0 and t1) and four human readers (an expert pediatric radiologist, a trainee radiologist, an expert pediatrician, and a trainee pediatrician). Cases were classified as requiring immediate reporting (n = 33) if they corresponded to Korean Triage and Acuity Scale (KTAS) levels 1-2 (n = 24) or met the criteria for a critical value report (CVR) (n = 11). The most accurate LLM was compared with each human reader, with significance set at P < 0.013.
Results: LLMs demonstrated 60.6%-83.1% accuracy in detecting cases requiring immediate radiology reporting: 57.7%-81.7% and 53.5%-87.3% for KTAS levels 1-2 and CVR cases, respectively. Gemini-Flash with t1 showed the highest accuracy among the LLMs: 83.1% (95% confidence interval [CI]: 74.6%-91.5%), 81.7% (95% CI: 71.8%-90.1%), and 87.3% (95% CI: 78.9%-94.4%) for identifying cases requiring immediate reporting, KTAS level 1-2 cases, and CVR cases, respectively, despite its low sensitivity for CVR detection (3/11, 27.3%). Human readers demonstrated 62.0%-84.5% accuracy for immediate radiology reporting, 73.2%-84.5% for KTAS levels 1-2, and 39.4%-94.4% for CVR cases. The accuracy of Gemini-Flash t1 in identifying cases requiring immediate radiology reporting was comparable to that of the most accurate human reader (vs. expert pediatrician: 84.5% [95% CI: 76.1%-93.0%]; P < 0.99).
Conclusion: Multimodal LLMs may achieve overall accuracy comparable to or exceeding that of human readers in identifying cases requiring immediate radiology reporting, supporting their potential use for pediatric radiology worklist prioritization. However, the models' sensitivity in detecting such cases was not reliable.
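The gap between high accuracy and low sensitivity for CVR cases follows from class imbalance, and it can be worked through from the figures in the abstract (71 vignettes, 11 CVR cases, 3 of 11 detected, 87.3% accuracy). The sketch below assumes that the reported accuracy was computed over all 71 vignettes; the true-negative count and specificity it derives are inferences from that assumption, not values reported by the authors.

```python
# Figures taken from the abstract
total_cases = 71
cvr_positive = 11          # cases meeting CVR criteria
true_positive = 3          # CVR cases detected (sensitivity 3/11)
reported_accuracy = 0.873  # Gemini-Flash t1 accuracy for CVR cases

# Inferred under the assumption that accuracy covers all 71 vignettes
correct = round(reported_accuracy * total_cases)
cvr_negative = total_cases - cvr_positive
true_negative = correct - true_positive

sensitivity = true_positive / cvr_positive
specificity = true_negative / cvr_negative
accuracy = correct / total_cases
# Accuracy of a trivial rule that never flags any case as CVR
never_flag_accuracy = cvr_negative / total_cases

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"accuracy    = {accuracy:.3f}")
print(f"never-flag baseline accuracy = {never_flag_accuracy:.3f}")
```

Because only 11 of 71 vignettes are CVR cases, a rule that never flags anything would already score close to the reported accuracy, which is why sensitivity is the more informative figure for worklist prioritization.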
{"title":"Accuracy of Large Language Models in Detecting Cases Requiring Immediate Reporting in Pediatric Radiology: A Feasibility Study Using Publicly Available Clinical Vignettes.","authors":"Jun Sung Park, Jisun Hwang, Pyeong Hwa Kim, Woo Hyun Shim, Min Jeong Seo, Dahyun Kim, Jeong In Shin, In Hwa Kim, Hwon Heo, Chong Hyun Suh","doi":"10.3348/kjr.2025.0240","DOIUrl":"https://doi.org/10.3348/kjr.2025.0240","url":null,"abstract":"<p><strong>Objective: </strong>To evaluate the accuracy of multimodal large language models (LLMs) in detecting cases requiring immediate radiology reporting in pediatric radiology.</p><p><strong>Materials and methods: </strong>Seventy-one publicly available, paraphrased pediatric clinical vignettes with images-sourced from the <i>New England Journal of Medicine</i>, <i>The Lancet</i>, <i>Archives of Pediatrics & Adolescent Medicine</i>, and <i>Radiology</i>-were assessed by seven vision-capable LLMs (temperature levels 0 and 1; t0 and t1) and four human readers (an expert pediatric radiologist, a trainee radiologist, an expert pediatrician, and a trainee pediatrician). Cases were classified as requiring immediate reporting (n = 33) if they corresponded to Korean Triage and Acuity Scale (KTAS) levels 1-2 (n = 24) or met the criteria for a critical value report (CVR) (n = 11). The most accurate LLM was compared with each human reader, with significance set at <i>P</i> < 0.013.</p><p><strong>Results: </strong>LLMs demonstrated 60.6%-83.1% accuracy in detecting cases requiring immediate radiology reporting: 57.7%-81.7% and 53.5%-87.3% for KTAS levels 1-2 and CVR cases, respectively. Gemini-Flash with t1 showed the highest accuracy among the LLMs: 83.1% (95% confidence interval [CI]: 74.6%-91.5%), 81.7% (95% CI: 71.8%-90.1%), and 87.3% (95% CI: 78.9%-94.4%) for identifying cases requiring immediate reporting, KTAS level 1-2 cases, and CVR cases, respectively, despite its low sensitivity for CVR detection (3/11, 27.3%). 
Human readers demonstrated 62.0%-84.5% accuracy for immediate radiology reporting, 73.2%-84.5% for KTAS levels 1-2, and 39.4%-94.4% for CVR cases. The accuracy of Gemini-Flash t1 in identifying cases requiring immediate radiology reporting was comparable to that of the most accurate human reader (vs. expert pediatrician: 84.5% [95% CI: 76.1%-93.0%]; <i>P</i> < 0.99).</p><p><strong>Conclusion: </strong>Multimodal LLMs may achieve overall accuracy comparable to or exceeding that of human readers in identifying cases requiring immediate radiology reporting, supporting their potential use for pediatric radiology worklist prioritization. However, the models' sensitivity in detecting such cases was not reliable.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"855-866"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394824/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selina Chiu, Yvonne Tsitsiou, Andrea Da Silva, Cathy Qin, Christina Fotopoulou, Andrea Rockall
Ovarian cancer (OC) remains one of the leading causes of gynecologic cancer-related mortality, with most patients presenting with disseminated disease, particularly within the peritoneal cavity. Standard treatment includes cytoreductive surgery, platinum-based chemotherapy, and targeted maintenance approaches depending on the patient's and tumor's genetic profile. Despite treatment advancements, approximately 25% of high-grade serous OC cases relapse within a year, even after optimal primary treatment with complete tumor clearance at cytoreduction. Advances in contrast-enhanced CT (CE-CT) and MRI have revolutionized the evaluation and treatment planning of advanced OC. CT remains the gold standard for staging and assessing tumor extent, effectively identifying peritoneal, lymphatic, and distant metastases. However, it is less effective in detecting small-volume peritoneal dissemination. MRI, with superior soft-tissue contrast, complements CT by providing a detailed assessment of peritoneal disease and characterizing sonographically indeterminate adnexal masses. Diffusion-weighted imaging and gadolinium-enhanced MRI have improved the diagnostic sensitivity for peritoneal disease but are unable to predict treatment response, recurrence risk, and prognosis. Radiomics, which extracts quantitative tumor features from imaging data, holds promise for personalizing treatment and identifying patients at risk for early recurrence despite optimal therapy. The integration of CT, MRI, and radiomics could enhance surgical planning and improve long-term survival outcomes in patients with advanced OC.
{"title":"CT and MRI in Advanced Ovarian Cancer: Advances in Imaging Techniques.","authors":"Selina Chiu, Yvonne Tsitsiou, Andrea Da Silva, Cathy Qin, Christina Fotopoulou, Andrea Rockall","doi":"10.3348/kjr.2025.0357","DOIUrl":"https://doi.org/10.3348/kjr.2025.0357","url":null,"abstract":"<p><p>Ovarian cancer (OC) remains one of the leading causes of gynecologic cancer-related mortality, with most patients presenting with disseminated disease, particularly within the peritoneal cavity. Standard treatment includes cytoreductive surgery, platinum-based chemotherapy, and targeted maintenance approaches depending on the patient's and tumor's genetic profile. Despite treatment advancements, approximately 25% of high-grade serous OC cases relapse within a year despite optimal primary treatment with complete tumor clearance at cytoreduction. Advances in contrast-enhanced CT (CE-CT) and MRI have revolutionized the evaluation and treatment planning of advanced OC. CT remains the gold standard for staging and assessing tumor extent, effectively identifying peritoneal, lymphatic, and distant metastases. However, it is less effective in detecting small-volume peritoneal dissemination. MRI, with superior soft-tissue contrast, complements CT by providing a detailed assessment of peritoneal disease, characterizing sonographically indeterminate adnexal masses. Diffusion-weighted imaging and gadolinium-enhanced MRI have improved the diagnostic sensitivity for peritoneal disease but are unable to predict treatment response, recurrence risk, and prognosis. Radiomics, which extracts quantitative tumor features from imaging data, holds promise for personalizing treatment and identifying patients at risk for early recurrence despite optimal therapy. 
The integration of CT, MRI, and radiomics could enhance surgical planning and improve long-term survival outcomes in patients with advanced OC.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"841-854"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394823/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Objective: To develop a deep learning model for estimating newborn gestational age (GA) based on the shape of the lumbar vertebral bodies on cross-table lateral radiographs obtained on the first day after birth.
Materials and methods: This retrospective study included 423 cross-table lateral radiographs of 423 newborns (242 boys and 181 girls) taken within 24 hours after birth at two hospitals. Of these, 256 radiographs (157 boys and 99 girls) obtained from one institution were used for model development, and 167 radiographs (85 boys and 82 girls) from the other institution were used for external testing. Clinical data, including medical history of underlying disorders, GA determined by ultrasound parameters, birth date, birth weight, sex, examination date, and reason for requesting radiographs, were obtained. The five lumbar vertebral bodies were manually labeled on each radiograph, followed by preprocessing steps such as normalization, resizing, denoising, cropping, and augmentation via horizontal flipping and rotation. Subsequently, we trained a DeepLabv3+ network with a ResNet50 backbone for lumbar segmentation and a customized AgeClassifier model with two parallel ResNet18 backbones for GA estimation. Model performance was evaluated using the external test dataset after image cropping.
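Two of the preprocessing steps named above, intensity normalization and horizontal flipping, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`min_max_normalize`, `horizontal_flip`, `augment_pair`) are hypothetical, min-max scaling is only one possible normalization, and images are represented as plain nested lists for simplicity. Rotation is left out because arbitrary-angle rotation requires interpolation.

```python
from typing import List, Tuple

Image = List[List[float]]


def min_max_normalize(img: Image) -> Image:
    """Rescale pixel intensities to the range [0, 1] (assumed normalization)."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


def horizontal_flip(img: Image) -> Image:
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in img]


def augment_pair(img: Image, mask: Image) -> List[Tuple[Image, Image]]:
    """Return the normalized pair and its flipped copy.

    The mask receives the same geometric transform as the image so that
    the vertebral-body labels stay aligned with the pixels.
    """
    norm = min_max_normalize(img)
    return [(norm, mask), (horizontal_flip(norm), horizontal_flip(mask))]
```

The point of the sketch is that any geometric augmentation has to be applied identically to the radiograph and to its segmentation label, while intensity normalization applies to the radiograph only.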
Results: Neither GA nor birth weight differed significantly between boys and girls. For the segmentation model, the mean Dice similarity coefficient ± standard deviation (SD) was 0.801 ± 0.031. For GA estimation, the mean absolute error ± SD was 5.2 ± 0.5 days. The Bland-Altman bias (AI-estimated GA minus ground-truth GA) was -0.4 days, with 95% limits of agreement of -13.0 to 12.3 days.
Conclusion: Our deep learning model showed promising performance in lumbar vertebral body segmentation and GA estimation using plain radiographs, suggesting its potential utility as a supportive tool for neonatal maturity assessment in clinical practice.
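The three metrics reported in the Results (Dice similarity coefficient, mean absolute error, and Bland-Altman bias with 95% limits of agreement) can be sketched in a few lines. This is an illustration using the conventional textbook definitions, not the authors' evaluation code: the function names are hypothetical, masks are assumed to be flat sequences of 0/1 values, and the limits of agreement are taken as bias ± 1.96 times the sample SD of the differences.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match (a common convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(estimated: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average absolute difference between estimated and reference GA (days)."""
    return mean(abs(e - r) for e, r in zip(estimated, reference))


def bland_altman(estimated: Sequence[float],
                 reference: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    Differences are estimated minus reference, matching the sign
    convention stated in the Results.
    """
    diffs = [e - r for e, r in zip(estimated, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

As a usage note, calling `bland_altman(estimated_days, reference_days)` on per-newborn GA values in days returns the bias and the two limits in the same units. A bias near zero with wide limits, as reported above, indicates little systematic offset but substantial per-case scatter.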
{"title":"Development of a Deep-Learning Model for Estimating Newborn Gestational Age via Lumbar Vertebral Segmentation on Plain Radiography.","authors":"Sungwon Ham, Gayoung Choi, Bo-Kyung Je, Saelin Oh","doi":"10.3348/kjr.2025.0172","DOIUrl":"10.3348/kjr.2025.0172","url":null,"abstract":"<p><strong>Objective: </strong>To develop a deep learning model for estimating newborn gestational age (GA) based on the shape of the lumbar vertebral bodies on cross-table lateral radiographs obtained on the first day after birth.</p><p><strong>Materials and methods: </strong>This retrospective study included 423 cross-table lateral radiographs of 423 newborns (242 boys and 181 girls) taken within 24 hours after birth at two hospitals. Of these, 256 radiographs (157 boys and 99 girls) obtained from one institution were used for model development, and 167 radiographs (85 boys and 82 girls) from the other institution were used for model external testing. Clinical data, including medical history of underlying disorders, GA determined by ultrasound parameters, birth date, birth weight, sex, examination date, and reason for requesting radiographs, were obtained. The radiographs underwent manual labeling of the five lumbar vertebral bodies, followed by preprocessing steps such as normalization, resizing, denoising, cropping, and augmentation via horizontal flipping and rotation. Subsequently, we trained a deep learning model using a DeepLabv3+ network with a ResNet50 backbone for lumbar segmentation and used a customized AgeClassifier model with two parallel ResNet18 backbones for GA estimation. Model performance was evaluated using an external test dataset after image cropping.</p><p><strong>Results: </strong>Neither GA nor birth weight differed significantly between boys and girls. In the segmentation model, the mean dice similarity coefficient ± standard deviation (SD) was 0.801 ± 0.031. For GA estimation, the mean absolute error ± SD was 5.2 ± 0.5 days. 
The Bland-Altman bias (AI-estimated GA - ground truth GA) and 95% limits of agreement were -0.4 days and -13.0 to 12.3 days, respectively.</p><p><strong>Conclusion: </strong>Our deep learning model showed promising performance in lumbar vertebral body segmentation and GA estimation using plain radiographs, suggesting its potential utility as a supportive tool for neonatal maturity assessment in clinical practice.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 9","pages":"867-876"},"PeriodicalIF":5.3,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12394821/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144959313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}