Pub Date: 2025-12-01 Epub Date: 2025-09-26 DOI: 10.1007/s11547-025-02095-8
Giulia Bicchierai, Francesco Amato, Chiara Bellini, Jacopo Nori
{"title":"Reply to the letter to the editor \"preoperative imaging in breast cancer staging: can CEM stand alone?\"","authors":"Giulia Bicchierai, Francesco Amato, Chiara Bellini, Jacopo Nori","doi":"10.1007/s11547-025-02095-8","DOIUrl":"10.1007/s11547-025-02095-8","url":null,"abstract":"","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"1901-1902"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145150658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 Epub Date: 2025-09-29 DOI: 10.1007/s11547-025-02098-5
Hong-Seon Lee, Sungjun Kim, Songsoo Kim, Jeongrok Seo, Won Hwa Kim, Jaeil Kim, Kyunghwa Han, Shin Hye Hwang, Young Han Lee
Purpose: To examine how reading grade levels affect stakeholder preferences based on a trade-off between accuracy and readability.
Material and methods: A retrospective study of 500 radiology reports from academic and community hospitals across five imaging modalities was conducted. Reports were transformed into 11 reading grade levels (7-17) using Gemini. Accuracy, readability, and preference were rated on a 5-point scale by radiologists, physicians, and laypersons. Errors (generalizations, omissions, hallucinations) and potential changes in patient management (PCPM) were identified. Ordinal logistic regression analyzed preference predictors, and weighted kappa measured interobserver reliability.
Results: Preferences varied across reading grade levels depending on stakeholder group, modality, and clinical setting. Overall, preferences peaked at grade 16 but declined at grade 17, particularly among laypersons. Lower reading grades improved readability but increased errors, while higher grades improved accuracy but reduced readability. In multivariable analysis, accuracy was the strongest predictor of preference for all groups (OR: 30.29, 33.05, and 2.16; p < 0.001), followed by readability (OR: 2.73, 1.70, and 2.01; p < 0.001).
Conclusion: Higher-grade levels were generally preferred due to better accuracy, with a range of 12-17. Further increasing grade levels reduced readability sharply, limiting preference. These findings highlight the limitations of unsupervised LLM transformations and suggest the need for hybrid approaches that maintain original reports while incorporating explanatory content to balance accuracy and readability.
{"title":"Readability versus accuracy in LLM-transformed radiology reports: stakeholder preferences across reading grade levels.","authors":"Hong-Seon Lee, Sungjun Kim, Songsoo Kim, Jeongrok Seo, Won Hwa Kim, Jaeil Kim, Kyunghwa Han, Shin Hye Hwang, Young Han Lee","doi":"10.1007/s11547-025-02098-5","DOIUrl":"10.1007/s11547-025-02098-5","url":null,"abstract":"<p><strong>Purpose: </strong>To examine how reading grade levels affect stakeholder preferences based on a trade-off between accuracy and readability.</p><p><strong>Material and methods: </strong>A retrospective study of 500 radiology reports from academic and community hospitals across five imaging modalities was conducted. Reports were transformed into 11 reading grade levels (7-17) using Gemini. Accuracy, readability, and preference were rated on a 5-point scale by radiologists, physicians, and laypersons. Errors (generalizations, omissions, hallucinations) and potential changes in patient management (PCPM) were identified. Ordinal logistic regression analyzed preference predictors, and weighted kappa measured interobserver reliability.</p><p><strong>Results: </strong>Preferences varied across reading grade levels depending on stakeholder group, modality, and clinical setting. Overall, preferences peaked at grade 16, but declined at grade 17, particularly among laypersons. Lower reading grades improved readability but increased errors, while higher grades improved accuracy but reduced readability. In multivariable analysis, accuracy was the strongest predictor of preference for all groups (OR: 30.29, 33.05, and 2.16; p <0 .001), followed by readability (OR: 2.73, 1.70, 2.01; p <0.001).</p><p><strong>Conclusion: </strong>Higher-grade levels were generally preferred due to better accuracy, with a range of 12-17. Further increasing grade levels reduced readability sharply, limiting preference. 
These findings highlight the limitations of unsupervised LLM transformations and suggest the need for hybrid approaches that maintain original reports while incorporating explanatory content to balance accuracy and readability.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"1986-1999"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145192611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 Epub Date: 2025-09-18 DOI: 10.1007/s11547-025-02086-9
Piero Ruscitti, Camilla Gianneramo, Pierpaolo Palumbo, Manfredo Bruni, Martina Gentile, Sabrina Lanzi, Emanuele Vagnozzi, Alessia Loda, Lina Maria Magnanimi, Maria Concetta Fargnoli, Antonio Barile, Paola Cipriani, Maria Esposito
Purpose: To evaluate the effectiveness of IL-17 and IL-23 inhibitors in psoriatic nail and enthesis involvement using high-frequency ultrasonography (HFUS), and to correlate the HFUS findings with disease activity in patients with psoriatic arthritis (PsA).
Material and methods: Consecutive early naïve patients with PsA underwent HFUS of the nails and entheses before and after 24 weeks of treatment with an IL-17 or IL-23 inhibitor. The Brown University Nail Enthesis Scale (BUNES), considering morphometry and Power Doppler (PD), and the Madrid Sonography Enthesitis Index (MASEI) score were used to evaluate these features. HFUS findings were correlated with the extent of skin disease, measured by the Psoriasis Area and Severity Index (PASI), and of joint disease, measured by the Disease Activity Index for Psoriatic Arthritis (DAPSA).
Results: Twenty early naïve patients with PsA were treated for 24 weeks with an IL-17 or IL-23 inhibitor. A significant reduction of BUNES PD was observed considering the whole cohort of patients receiving these drugs (p = 0.044), whereas, despite a trend, no significant difference was reported comparing BUNES morphometry. The BUNES PD correlated with PASI (r = 0.466, p = 0.030) and with DAPSA (r = 0.444, p = 0.032), whereas BUNES morphometry did not. A significant reduction of MASEI was observed considering the whole assessed cohort of patients treated with these drugs (p = 0.045). The MASEI correlated with both PASI (r = 0.429, p = 0.037) and DAPSA (r = 0.499, p = 0.017).
Conclusions: This proof-of-concept study demonstrated that the assessment by HFUS may provide additional accurate information about the effectiveness of IL-17 and IL-23 inhibitors in psoriatic nail and enthesis involvement.
{"title":"The evaluation of effectiveness of IL-17 and IL-23 inhibitors on nail and enthesis involvement in early psoriatic arthritis patients by high-frequency ultrasonography: a single-centre prospective proof-of-concept study.","authors":"Piero Ruscitti, Camilla Gianneramo, Pierpaolo Palumbo, Manfredo Bruni, Martina Gentile, Sabrina Lanzi, Emanuele Vagnozzi, Alessia Loda, Lina Maria Magnanimi, Maria Concetta Fargnoli, Antonio Barile, Paola Cipriani, Maria Esposito","doi":"10.1007/s11547-025-02086-9","DOIUrl":"10.1007/s11547-025-02086-9","url":null,"abstract":"<p><strong>Purpose: </strong>To evaluate the effectiveness of IL-17 and IL-23 inhibitors in psoriatic nail and enthesis involvement by ultrasonography with the use of high-frequency probes (HFUS). To correlate the obtained HFUS findings with disease activity of patients with psoriatic arthritis (PsA).</p><p><strong>Material and methods: </strong>Consecutive early naïve patients with PsA underwent HFUS on nails and entheses before and after 24 weeks of treatment with IL-17 or IL-23 inhibitor. The Brown University Nail Enthesis Scale (BUNES), considering morphometry and Power Doppler (PD), and the Madrid Sonography Enthesitis Index (MASEI) score were used to evaluate these features. HFUS findings were correlated with the extension of the disease on skin by Psoriasis Area and Severity Index (PASI) and joints by Disease Activity Index for Psoriatic Arthritis (DAPSA).</p><p><strong>Results: </strong>Twenty early naïve patients with PsA were treated for 24 weeks with an IL-17 or IL-23 inhibitor. A significant reduction of BUNES PD was observed considering the whole cohort of patients receiving these drugs (p = 0.044), whereas, despite a trend, no significant difference was reported comparing BUNES morphometry. The BUNES PD correlated with PASI (r = 0.466, p = 0.030) and with DAPSA (r = 0.444, p = 0.032), whereas BUNES morphometry did not. 
A significant reduction of MASEI was observed considering the whole assessed cohort of patients treated with these drugs (p = 0.045). The MASEI correlated with both PASI (r = 0.429, p = 0.037) and DAPSA (r = 0.499, p = 0.017).</p><p><strong>Conclusions: </strong>This proof-of-concept study demonstrated that the assessment by HFUS may provide additional accurate information about the effectiveness of IL-17 and IL-23 inhibitors in psoriatic nail and enthesis involvement.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"2044-2054"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669340/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145081519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past decades, neoadjuvant systemic treatment (NAT) has been increasingly adopted in early-stage breast cancer (BC), highlighting the need for a more accurate assessment of treatment response. Imaging tools such as [18F]2-fluoro-2-deoxy-D-glucose ([18F]FDG) positron emission tomography combined with computed tomography (PET/CT) may enhance diagnostic accuracy in this context. A comprehensive review of the available literature indicates that [18F]FDG PET/CT generally shows good sensitivity but lower specificity for predicting pathological complete response (pCR) during NAT and for evaluating it preoperatively, in both the breast and lymph nodes. Its use may therefore support timely escalation of systemic treatment or surgery in patients with poor metabolic response. However, definitive conclusions are limited by small, heterogeneous studies with variable patient selection, timing, and response definitions. Consequently, while international guidelines remain inconsistent, further evidence is needed to define its role in response assessment, establish its optimal use in clinical practice, and clarify its integration into (de)-escalation strategies.
{"title":"[<sup>18</sup>F]FDG PET/CT as a biomarker for response evaluation in neoadjuvant treatment of early breast cancer: could it become a game-changer in the scenario of the emerging (de)-escalation strategies?","authors":"Riccardo Gerosa, Fabrizia Gelardi, Paola Tiberio, Flavia Jacobs, Chiara Benvenuti, Mariangela Gaudio, Jacopo Canzian, Benedetta Tinterri, Alberto Zambelli, Armando Santoro, Lidija Antunovic, Rita De Sanctis","doi":"10.1007/s11547-025-02138-0","DOIUrl":"https://doi.org/10.1007/s11547-025-02138-0","url":null,"abstract":"<p><p>Over the past decades, neoadjuvant systemic treatment (NAT) has been increasingly adopted in early-stage breast cancer (BC), highlighting the need for a more accurate assessment of treatment response. Imaging tools such as [<sup>18</sup>F]2-fluoro-2-deoxy-D-glucose ([<sup>18</sup>F]FDG) positron emission tomography combined with computed tomography (PET/CT) may enhance diagnostic accuracy in this context. By comprehensively reviewing the available literature, [<sup>18</sup>F]FDG PET/CT generally shows good sensibility but lower specificity for predicting and evaluating pathological complete response (pCR), respectively, during NAT and preoperatively, in both the breast and lymph nodes. Thereby its use may support timely escalation of systemic treatment or surgery in patients with poor metabolic response. However, definitive conclusions are limited by small, heterogeneous studies with variable patient selection, timing, and response definitions. 
Consequently, while international guidelines remain inconsistent, further evidence is needed to define its role in response assessment, establish the optimal use in clinical practice, and clarify its integration into (de)-escalation strategies.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":""},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145649202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 Epub Date: 2025-09-23 DOI: 10.1007/s11547-025-02076-x
Antonio Esposito, Riccardo Faletti, Anna Palmisano, Marco Gatti, Sara Seitun, Cesare Mantini, Piergiuseppe Agostoni, Daniele Andreini, Francesco Barillà, Andrea Barison, Paolo Calabrò, Matteo Cameli, Scipione Carerj, Carlo Catalano, Marcello Chiocchi, Marco Matteo Ciccone, Antonio Curcio, Fabrizio D'Ascenzo, Serena Dell'Aversana, Fabio Falzea, Marco Francone, Nicola Galea, Andrea Giovagnoni, Marco Guglielmo, Andrea Laghi, Carlo Liguori, Luigi Lovato, Riccardo Marano, Rocco Antonio Montone, Doralisa Morrone, Luigi Natale, Savina Nodari, Michele Oppizzi, Stefania Paolillo, Alberto Polimeni, Gianluca Pontone, Italo Porto, Silvia Pradella, Vincenzo Russo, Vincenzo Russo, Luca Saba, Gianfranco Sinagra, Massimo Slavich, Carmen Spaccarotella, Davide Tore, Davide Vignale, Carmine Dario Vizza, Saverio Muscoli, Pasquale Perrone Filardi, Ciro Indolfi
Acute chest pain is a common and challenging reason for emergency department visits and requires prompt and systematic evaluation to address potentially life-threatening conditions, minimize risks, and manage emergency department overcrowding. This updated consensus statement outlines the appropriate management of patients presenting to the emergency department with acute chest pain, emphasizing the timing and utility of non-invasive advanced imaging (particularly coronary computed tomography angiography). The aim is rapid and accurate diagnosis of both cardiac and non-cardiac causes, thereby improving patient safety, outcomes, and resource utilization efficiency. The writing committee was composed of members and experts from both the Italian Society of Cardiology (SIC) and the Italian Society of Medical and Interventional Radiology (SIRM), who worked jointly to create a cohesive approach in the field of acute chest pain. This structured approach may streamline diagnostic workflows in the emergency setting and support earlier, more appropriate patient management.
{"title":"SIRM/SIC consensus document on the management of patients with acute chest pain.","authors":"Antonio Esposito, Riccardo Faletti, Anna Palmisano, Marco Gatti, Sara Seitun, Cesare Mantini, Piergiuseppe Agostoni, Daniele Andreini, Francesco Barillà, Andrea Barison, Paolo Calabrò, Matteo Cameli, Scipione Carerj, Carlo Catalano, Marcello Chiocchi, Marco Matteo Ciccone, Antonio Curcio, Fabrizio D'Ascenzo, Serena Dell'Aversana, Fabio Falzea, Marco Francone, Nicola Galea, Andrea Giovagnoni, Marco Guglielmo, Andrea Laghi, Carlo Liguori, Luigi Lovato, Riccardo Marano, Rocco Antonio Montone, Doralisa Morrone, Luigi Natale, Savina Nodari, Michele Oppizzi, Stefania Paolillo, Alberto Polimeni, Gianluca Pontone, Italo Porto, Silvia Pradella, Vincenzo Russo, Vincenzo Russo, Luca Saba, Gianfranco Sinagra, Massimo Slavich, Carmen Spaccarotella, Davide Tore, Davide Vignale, Carmine Dario Vizza, Saverio Muscoli, Pasquale Perrone Filardi, Ciro Indolfi","doi":"10.1007/s11547-025-02076-x","DOIUrl":"10.1007/s11547-025-02076-x","url":null,"abstract":"<p><p>Acute chest pain is a common and challenging reason for emergency department visits and requires prompt and systematic evaluation to address potential life-threatening conditions, minimize risks and manage emergency department overcrowding. This updated consensus statement outlines the appropriate management of patients presenting to the emergency department with acute chest pain, emphasizing the timing and utility of non-invasive advanced imaging (particularly coronary computed tomography angiography) aiming to improve rapid and accurate diagnosis of both cardiac or non-cardiac causes improving patient safety, outcomes, and resource utilization efficiency. The writing committee was composed of members and experts from both the Italian Society of Cardiology (SIC) and the Italian Society of Medical and Interventional Radiology (SIRM) who worked jointly to create a cohesive approach in the field of acute chest pain. 
This structured approach may streamline diagnostic workflows in the emergency setting and support earlier, more appropriate patient management.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"1936-1948"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669346/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145126064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Immunotherapy-based neoadjuvant chemoradiotherapy (iNCRT) has recently emerged for proficient mismatch repair/microsatellite stable (pMMR/MSS) locally advanced rectal cancer (LARC). Accurate identification of pathological complete response for primary tumor (ptPCR) post-treatment is critical for selecting patients eligible for watch-and-wait strategies. This study aimed to evaluate arterial-phase mucosal linear enhancement (AMLE) on contrast-enhanced T1-weighted imaging (CE-T1WI) for predicting ptPCR after iNCRT in pMMR/MSS LARC, compared to conventional T2-weighted/diffusion-weighted imaging (T2DWI) and rectal endoscopy.
Methods: This retrospective study included patients with pMMR/MSS LARC who underwent total mesorectal excision after iNCRT between July 2022 and October 2024 at a tertiary referral academic center. Preoperative re-staging examinations were rectal endoscopy and MRI, including T2DWI and arterial-phase CE-T1WI for primary tumor assessment. Baseline and post-therapy features associated with ptPCR were identified using univariate and multivariable regression analysis. Diagnostic performance of endoscopy and different MRI protocols to identify ptPCR after iNCRT was evaluated using ROC curves.
Results: In total, 75 patients (mean age, 57 years ± 10 [SD]; 54 male patients) were assessed. At histopathology, 29 patients achieved ptPCR. AMLE was more common in the ptPCR group than in the non-ptPCR group after iNCRT (75.9% vs 15.2%, respectively; P < 0.001). AMLE was associated with higher odds of ptPCR in the multivariable regression analysis (odds ratio, 19.14; 95% CI 4.03, 90.87; P = 0.001). AMLE also exhibited the best diagnostic performance in identifying ptPCR after iNCRT, with the highest sensitivity, specificity, PPV, NPV, and AUC (0.80; 95% CI 0.70, 0.89).
Conclusion: AMLE on CE-T1WI of rectal MRI could be a potential indicator of ptPCR after iNCRT in pMMR/MSS LARC. It suggests a relatively reliable preoperative evaluation strategy for this group of patients in clinical practice, helping to exclude residual tumor accurately and to select a watch-and-wait approach, thereby avoiding unnecessary surgery.
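The reported AMLE frequencies can be turned into a 2×2 table to see how the diagnostic metrics named in the Results relate to one another. The sketch below is illustrative only: the cell counts (22, 7, 7, 39) are back-calculated from the abstract's percentages (75.9% of 29 ptPCR patients and 15.2% of the remaining 46) and are an assumption, not figures taken from the paper. The function name `diagnostic_metrics` is hypothetical. The odds ratio computed here is the crude (unadjusted) one, so it is not expected to match the multivariable estimate of 19.14.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 diagnostic metrics for a single binary imaging sign."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # A binary test has a single ROC operating point, so the area under
        # the resulting two-segment curve is the mean of the two rates.
        "auc": (sensitivity + specificity) / 2,
        # Crude odds ratio from the 2x2 table (not adjusted for covariates).
        "crude_odds_ratio": (tp * tn) / (fp * fn),
    }


# Assumed counts, back-calculated from 75.9% of 29 and 15.2% of 46.
metrics = diagnostic_metrics(tp=22, fn=7, fp=7, tn=39)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the single-point AUC rounds to 0.80, which is consistent with the AUC reported in the abstract.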
{"title":"Arterial-phase mucosal linear enhancement as an indicator of pathological complete response after immunotherapy in pMMR/MSS locally advanced rectal cancer.","authors":"Jingjing Liu, Gengyun Miao, Wentao Tang, Lamei Deng, Shengxiang Rao, Mengsu Zeng, Liheng Liu","doi":"10.1007/s11547-025-02099-4","DOIUrl":"10.1007/s11547-025-02099-4","url":null,"abstract":"<p><strong>Purpose: </strong>Immunotherapy-based neoadjuvant chemoradiotherapy (iNCRT) has recently emerged for proficient mismatch repair/microsatellite stable (pMMR/MSS) locally advanced rectal cancer (LARC). Accurate identification of pathological complete response for primary tumor (ptPCR) post-treatment is critical for selecting patients eligible for watch-and-wait strategies. This study aimed to evaluate arterial-phase mucosal linear enhancement (AMLE) on contrast-enhanced T1-weighted imaging (CE-T1WI) for predicting ptPCR after iNCRT in pMMR/MSS LARC, compared to conventional T2-weighted/diffusion-weighted imaging (T2DWI) and rectal endoscopy.</p><p><strong>Methods: </strong>This retrospective study included patients with pMMR/MSS LARC who underwent total mesorectal excision after iNCRT between July 2022 and Oct 2024 at a tertiary referral academic center. Preoperative re-staging examinations were rectal endoscopy and MRI, included T2DWI and arterial-phase CE-T1WI for primary tumor assessment. Baseline and post-therapy features associated with ptPCR were identified using univariate and multivariable regression analysis. Diagnostic performance of endoscopy and different MRI protocols to identify ptPCR after iNCRT was evaluated using ROC curves.</p><p><strong>Results: </strong>In total, 75 patients (mean age, 57 years ± 10 [SD]; 54 male patients) were assessed. At histopathology, 29 patients achieved ptPCR. AMLE was more common in the ptPCR group than in the non-ptPCR group after iNCRT (75.9% vs 15.2%, respectively; P < 0.001). 
AMLE was associated with higher odds of ptPCR in the multivariable regression analysis (odds ratio, 19.14; 95% CI 4.03, 90.87; P = 0.001). And AMLE exhibited the best diagnostic performance in identifying ptPCR after iNCRT, with highest sensitivity, specificity, PPV, NPV, and AUC (0.80; 95% CI 0.70, 0.89).</p><p><strong>Conclusion: </strong>AMLE at CE-TlWI of rectal MRI could be a potential indicator of ptPCR after a new iNCRT in pMMR/MSS LARC, suggesting a relatively credible preoperative evaluation strategy for this group of patients in clinical practice to accurately exclude residual tumors and select watch-and-wait approach, avoiding unnecessary surgery.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"1909-1920"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145131714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 Epub Date: 2025-09-24 DOI: 10.1007/s11547-025-02096-7
Maximilian F Russe, Marco Reisert, Anna Fink, Marc Hohenhaus, Julia M Nakagawa, Caroline Wilpert, Carl P Simon, Elmar Kotter, Horst Urbach, Alexander Rau
Purpose: To assess the performance of state-of-the-art large language models in classifying vertebral metastasis stability using the Spinal Instability Neoplastic Score (SINS) compared to human experts, and to evaluate the impact of task-specific refinement including in-context learning on their performance.
Material and methods: This retrospective study analyzed 100 synthetic CT and MRI reports encompassing a broad range of SINS scores. Four human experts (two radiologists and two neurosurgeons) and four large language models (Mistral, Claude, GPT-4 turbo, and GPT-4o) evaluated the reports. Large language models were tested in both generic form and with task-specific refinement. Performance was assessed based on correct SINS category assignment and attributed SINS points.
Results: Human experts demonstrated high median performance in SINS classification (98.5% correct) and points calculation (92% correct), with a median point offset of 0 [0-0]. Generic large language models performed poorly with 26-63% correct category and 4-15% correct SINS points allocation. In-context learning significantly improved chatbot performance to near-human levels (96-98/100 correct for classification, 86-95/100 for scoring, no significant difference to human experts). Refined large language models performed 71-85% better in SINS points allocation.
Conclusion: In-context learning enables state-of-the-art large language models to perform at near-human expert levels in SINS classification, offering potential for automating vertebral metastasis stability assessment. The poor performance of generic large language models highlights the importance of task-specific refinement in medical applications of artificial intelligence.
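The two outcome measures described above (correct SINS category and correct SINS points) can be sketched as follows. The category cut-offs (0 to 6 stable, 7 to 12 potentially unstable, 13 to 18 unstable) come from the published SINS classification rather than from this abstract. The function names, the use of an absolute point offset, and the example scores are hypothetical and are not the study's data or evaluation code.

```python
from statistics import median


def sins_category(total_points: int) -> str:
    """Map a total SINS score (0-18) to its stability category."""
    if not 0 <= total_points <= 18:
        raise ValueError("SINS total must be between 0 and 18")
    if total_points <= 6:
        return "stable"
    if total_points <= 12:
        return "potentially unstable"
    return "unstable"


def score_against_reference(predicted: list, reference: list) -> dict:
    """Compare rater scores with reference scores, report by report."""
    pairs = list(zip(predicted, reference))
    n = len(pairs)
    return {
        # Share of reports assigned to the correct stability category.
        "category_accuracy": sum(
            sins_category(p) == sins_category(r) for p, r in pairs
        ) / n,
        # Share of reports with exactly the right point total.
        "points_accuracy": sum(p == r for p, r in pairs) / n,
        # Assumption: offset is taken as the absolute point difference.
        "median_abs_offset": median(abs(p - r) for p, r in pairs),
    }


# Hypothetical scores for four reports, for illustration only.
predicted = [3, 8, 14, 11]
reference = [3, 7, 14, 13]
summary = score_against_reference(predicted, reference)
print(summary)
```

Separating category accuracy from exact-point accuracy mirrors the abstract's observation that a rater can land in the right category while still missing the exact point total.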
{"title":"In-context learning enables large language models to achieve human-level performance in spinal instability neoplastic score classification from synthetic CT and MRI reports.","authors":"Maximilian F Russe, Marco Reisert, Anna Fink, Marc Hohenhaus, Julia M Nakagawa, Caroline Wilpert, Carl P Simon, Elmar Kotter, Horst Urbach, Alexander Rau","doi":"10.1007/s11547-025-02096-7","DOIUrl":"10.1007/s11547-025-02096-7","url":null,"abstract":"<p><strong>Purpose: </strong>To assess the performance of state-of-the-art large language models in classifying vertebral metastasis stability using the Spinal Instability Neoplastic Score (SINS) compared to human experts, and to evaluate the impact of task-specific refinement including in-context learning on their performance.</p><p><strong>Material and methods: </strong>This retrospective study analyzed 100 synthetic CT and MRI reports encompassing a broad range of SINS scores. Four human experts (two radiologists and two neurosurgeons) and four large language models (Mistral, Claude, GPT-4 turbo, and GPT-4o) evaluated the reports. Large language models were tested in both generic form and with task-specific refinement. Performance was assessed based on correct SINS category assignment and attributed SINS points.</p><p><strong>Results: </strong>Human experts demonstrated high median performance in SINS classification (98.5% correct) and points calculation (92% correct), with a median point offset of 0 [0-0]. Generic large language models performed poorly with 26-63% correct category and 4-15% correct SINS points allocation. In-context learning significantly improved chatbot performance to near-human levels (96-98/100 correct for classification, 86-95/100 for scoring, no significant difference to human experts). 
Refined large language models performed 71-85% better in SINS points allocation.</p><p><strong>Conclusion: </strong>In-context learning enables state-of-the-art large language models to perform at near-human expert levels in SINS classification, offering potential for automating vertebral metastasis stability assessment. The poor performance of generic large language models highlights the importance of task-specific refinement in medical applications of artificial intelligence.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"2073-2080"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669255/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145131753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 Epub Date: 2025-09-26 DOI: 10.1007/s11547-025-02091-y
Cesare Gagliardo, Paola Feraco, Eleonora Contrino, Costanza D'Angelo, Laura Geraci, Giuseppe Salvaggio, Andrea Gagliardo, Ludovico La Grutta, Massimo Midiri, Maurizio Marrale
Ultra-low-field magnetic resonance imaging (ULF-MRI), operating below 0.2 Tesla, is gaining renewed interest as a re-emerging diagnostic modality in a field dominated by high- and ultra-high-field systems. Recent advances in magnet design, RF coils, pulse sequences, and AI-based reconstruction have significantly enhanced image quality, mitigating traditional limitations such as low signal- and contrast-to-noise ratio and reduced spatial resolution. ULF-MRI offers distinct advantages: reduced susceptibility artifacts, safer imaging in patients with metallic implants, low power consumption, and true portability for point-of-care use. This narrative review synthesizes the physical foundations, technological advances, and emerging clinical applications of ULF-MRI. A focused literature search across PubMed, Scopus, IEEE Xplore, and Google Scholar was conducted up to August 11, 2025, using combined keywords targeting hardware, software, and clinical domains. Inclusion emphasized scientific rigor and thematic relevance. A comparative analysis with other imaging modalities highlights the specific niche ULF-MRI occupies within the broader diagnostic landscape. Future directions and challenges for clinical translation are explored. In a world increasingly polarized between the push for ultra-high-field excellence and the need for accessible imaging, ULF-MRI embodies a modern "David versus Goliath" theme, offering a sustainable, democratizing force capable of expanding MRI access to anyone, anywhere.
{"title":"Ultra-low-field MRI: a David versus Goliath challenge in modern imaging.","authors":"Cesare Gagliardo, Paola Feraco, Eleonora Contrino, Costanza D'Angelo, Laura Geraci, Giuseppe Salvaggio, Andrea Gagliardo, Ludovico La Grutta, Massimo Midiri, Maurizio Marrale","doi":"10.1007/s11547-025-02091-y","DOIUrl":"10.1007/s11547-025-02091-y","url":null,"abstract":"<p><p>Ultra-low-field magnetic resonance imaging (ULF-MRI), operating below 0.2 Tesla, is gaining renewed interest as a re-emerging diagnostic modality in a field dominated by high- and ultra-high-field systems. Recent advances in magnet design, RF coils, pulse sequences, and AI-based reconstruction have significantly enhanced image quality, mitigating traditional limitations such as low signal- and contrast-to-noise ratio and reduced spatial resolution. ULF-MRI offers distinct advantages: reduced susceptibility artifacts, safer imaging in patients with metallic implants, low power consumption, and true portability for point-of-care use. This narrative review synthesizes the physical foundations, technological advances, and emerging clinical applications of ULF-MRI. A focused literature search across PubMed, Scopus, IEEE Xplore, and Google Scholar was conducted up to August 11, 2025, using combined keywords targeting hardware, software, and clinical domains. Inclusion emphasized scientific rigor and thematic relevance. A comparative analysis with other imaging modalities highlights the specific niche ULF-MRI occupies within the broader diagnostic landscape. Future directions and challenges for clinical translation are explored. 
In a world increasingly polarized between the push for ultra-high-field excellence and the need for accessible imaging, ULF-MRI embodies a modern \"David versus Goliath\" theme, offering a sustainable, democratizing force capable of expanding MRI access to anyone, anywhere.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":"2012-2029"},"PeriodicalIF":4.8,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669372/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145150635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-29 DOI: 10.1007/s11547-025-02130-8
Zachary Elijah Stewart, Andrea M Spiker, John S Symanski, Amie Armstrong, Donna G Blankenbaker
Objective: Describe the early non-arthrographic MRI appearance of the acetabular labrum after arthroscopic surgery for femoroacetabular impingement and labrum repair.
Methods: Eleven subjects (12 hips, 8 hips of females; mean age: 25.8 years, SD: 3.0) with a pre-operative MRI demonstrating a labrum tear and symptoms of femoroacetabular impingement were prospectively enrolled. Non-arthrographic images were obtained on a 3 T MRI scanner <4 weeks after arthroscopic surgery for femoroacetabular impingement. Imaging features of the labrum, capsule, and cartilage were systematically assessed by two independent fellowship-trained musculoskeletal radiologists. Disagreements were resolved through consensus mediated by a musculoskeletal radiologist with 20+ years of experience and expertise in hip imaging.
Results: The appearance of a persistent labral tear and increased intrasubstance signal was observed in all hips. The labrum appeared shortened in 92% (11/12). The geographic distribution of abnormal labral signal corresponded to the same number of labrum quadrants treated surgically in 67% (8/12). In the remaining hips, abnormal signal extended across a smaller portion of the labrum than was treated arthroscopically in 17% (2/12) and across a larger portion in 17% (2/12). The appearance of a capsular defect was observed in 92% (11/12).
Conclusion: In the first 4 weeks after arthroscopic labrum repair surgery for femoroacetabular impingement, it is common for the labrum to appear shortened with a persistent appearance of a labrum tear and increased signal in the repaired segment. The capsule often appears discontinuous, even when capsular closure is performed.
{"title":"Early post-operative MR appearance of the acetabular labrum after arthroscopic repair.","authors":"Zachary Elijah Stewart, Andrea M Spiker, John S Symanski, Amie Armstrong, Donna G Blankenbaker","doi":"10.1007/s11547-025-02130-8","DOIUrl":"https://doi.org/10.1007/s11547-025-02130-8","url":null,"abstract":"<p><strong>Objective: </strong>Describe the early non-arthrographic MRI appearance of the acetabular labrum after arthroscopic surgery for femoroacetabular impingement and labrum repair.</p><p><strong>Methods: </strong>Eleven subjects (12 hips, 8 hips of females; mean age: 25.8 years, SD: 3.0) with a pre-operative MRI demonstrating a labrum tear and symptoms of femoroacetabular impingement were prospectively enrolled. Non-arthrographic images were obtained on a 3 T MRI scanner < 4 weeks after arthroscopic surgery for femoroacetabular impingement. Imaging features of the labrum, capsule, and cartilage were systematically assessed by two independent fellowship-trained musculoskeletal radiologists. Disagreements were resolved through consensus mediated by a musculoskeletal radiologist with 20 + years of experience and expertise in hip imaging.</p><p><strong>Results: </strong>The appearance of a persistent labral tear and increased intrasubstance signal was observed in all hips. The labrum appeared shortened in 92% (11/12). The geographic distribution of abnormal labral signal corresponded to the same number of labrum quadrants treated surgically in 67% (8/12). There was an even distribution of hips showing abnormal signal across a smaller and larger portion of the labrum than was treated arthroscopically, seen in 17% (2/12), respectively. 
The appearance of a capsular defect was observed in 92% (11/12).</p><p><strong>Conclusion: </strong>In the first 4 weeks after arthroscopic labrum repair surgery for femoroacetabular impingement, it is common for the labrum to appear shortened with a persistent appearance of a labrum tear and increased signal in the repaired segment. The capsule often appears discontinuous, even when capsular closure is performed.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":""},"PeriodicalIF":4.8,"publicationDate":"2025-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145638119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-26 DOI: 10.1007/s11547-025-02154-0
Andrea Nitrosi, Paolo Giorgi Rossi, Laura Verzellesi, Martina Creola, Cinzia Campari, Rita Vacondio, Chiara Coriani, Valentina Iotti, Pierpaolo Pattacini, Giulia Besutti, Valeria Trojani, Marco Bertolini, Giulia Paolani, Mauro Iori
Aim: The AI case malignancy score (AI-CMS) represents the AI algorithm's confidence (from 0 to 100%) that a mammography exam is malignant. This work aims to retrospectively evaluate, through simulation on real-world data, a strategy that integrates AI-CMS into a standard screening scenario to reduce the radiologists' workload.
Methods: A total of 89,176 consecutive screening exams from the 2023-2024 Reggio Emilia Breast Screening Program (REBSP) were retrospectively considered, which included 479 biopsy-proven cancers (interval cancers were only partially available; therefore, false negatives beyond those detected in the real screening workflow could not be assessed). In the proposed strategy, computer-aided detection (CAD) acts as a reader (CR), recalling women with an AI-CMS greater than a predefined threshold (ranging from 5 to 25%). If the first radiologist (HR1) disagrees with CR, the case goes to a second radiologist (HR2) and, in case of human disagreement, to a third radiologist (HR3). For each threshold, final recall rate (RR), cancer detection rate (CDR), number of detected cancers (DC), positive predictive value (PPV) of recalls, false positive rate (FPR), human reading workload, and economic impact were estimated.
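The reading workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Outcome` and `read_exam` are hypothetical, and the rule for the final decision is an assumption drawn from the description (agreement between CR and HR1 settles the case; otherwise two agreeing radiologists settle it; otherwise HR3 decides), since the abstract does not spell out how the final recall is set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    recall: bool       # final decision for the exam
    human_reads: int   # number of radiologist reads consumed


def read_exam(ai_cms: float, threshold: float,
              hr1_recall: bool, hr2_recall: bool, hr3_recall: bool) -> Outcome:
    """Simulate one exam under the CAD-as-reader strategy (assumed rules)."""
    # CR recalls when the AI-CMS is strictly greater than the threshold.
    cr_recall = ai_cms > threshold

    # HR1 agrees with CR: the case is settled with a single human read.
    if hr1_recall == cr_recall:
        return Outcome(recall=cr_recall, human_reads=1)

    # HR1 disagrees with CR: the case goes to HR2.
    if hr2_recall == hr1_recall:
        # Assumption: two agreeing radiologists determine the outcome.
        return Outcome(recall=hr1_recall, human_reads=2)

    # Human disagreement: HR3 reads the case (assumed to decide it).
    return Outcome(recall=hr3_recall, human_reads=3)


if __name__ == "__main__":
    # Low score, HR1 also negative: one human read instead of two.
    print(read_exam(2.0, 5.0, hr1_recall=False, hr2_recall=False, hr3_recall=False))
```

Under these assumptions the workload saving comes from exams where HR1 and CR agree, which need one human read rather than the two required by conventional double reading.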
Results: At AI-CMS thresholds of 5%, 8%, 10%, 15%, 20%, and 25%, human workload decrease ranged from 13.4% to 36.1%. The final RR ranged from 4.3% to 4.0%, slightly lower than the current 4.4% with human double reading. The PPV ranged from 12.6% to 13.3%, higher than the current PPV of 12.2%. The FPR ranged from 3.8% to 3.5%, down from the current 3.9%. With thresholds up to 5%, no true positive cases were missed, maintaining the CDR of 5.4‰ achieved by current double reading. Considering CAD payback periods of either 6 or 8 years, financial savings from our strategy ranged from approximately €17,800 to over €590,000.
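As a rough consistency check, the baseline figures for current double reading can be approximately reproduced from the reported counts. The sketch below is an illustration under stated assumptions, not the authors' calculation: the number of recalls is not reported and is back-calculated from the 4.4% RR, all 479 cancers are assumed to be among the recalled women, and FPR is assumed to mean false-positive recalls divided by exams without cancer. The function name `baseline_metrics` is hypothetical.

```python
def baseline_metrics(exams: int, cancers: int, recall_rate: float) -> dict:
    """Derive CDR, PPV and FPR from reported totals (assumed definitions)."""
    # Recalls are not reported; approximate them from the recall rate.
    recalls = round(exams * recall_rate)
    return {
        "recalls": recalls,
        # Cancer detection rate, per 1000 screened exams.
        "cdr_per_mille": 1000 * cancers / exams,
        # PPV of recall: detected cancers over all recalls.
        "ppv": cancers / recalls,
        # FPR: false-positive recalls over exams without cancer.
        "fpr": (recalls - cancers) / (exams - cancers),
    }


if __name__ == "__main__":
    m = baseline_metrics(exams=89_176, cancers=479, recall_rate=0.044)
    print(f"CDR = {m['cdr_per_mille']:.1f} per mille")
    print(f"PPV = {100 * m['ppv']:.1f}%")
    print(f"FPR = {100 * m['fpr']:.1f}%")
```

With these inputs the derived values round to the 5.4‰ CDR, 12.2% PPV, and 3.9% FPR quoted for current double reading.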
Conclusion: Integrating AI-CMS support into a standard screening scenario could substantially reduce the screen-reading workload and slightly reduce unnecessary further assessments without affecting the cancer detection rate. This approach, although limited by its retrospective simulation design and the partial availability of interval cancer data, has also proven to be economically sustainable.
{"title":"Adding artificial intelligence case malignancy scoring to reduce screen-reading workload in breast screening program: results of the retrospective REAI program.","authors":"Andrea Nitrosi, Paolo Giorgi Rossi, Laura Verzellesi, Martina Creola, Cinzia Campari, Rita Vacondio, Chiara Coriani, Valentina Iotti, Pierpaolo Pattacini, Giulia Besutti, Valeria Trojani, Marco Bertolini, Giulia Paolani, Mauro Iori","doi":"10.1007/s11547-025-02154-0","DOIUrl":"https://doi.org/10.1007/s11547-025-02154-0","url":null,"abstract":"<p><strong>Aim: </strong>The AI case malignancy score (AI-CMS) represents the AI algorithm's confidence (from 0 to 100%) that a mammography exam is malignant. This work aims to retrospectively evaluate, through simulation on real-world data, a strategy that integrates AI-CMS into a standard screening scenario to reduce the radiologists' workload.</p><p><strong>Methods: </strong>A total of 89176 consecutive screening exams from the 2023-2024 Reggio Emilia Breast Screening Program (REBSP) were retrospectively considered, which included 479 biopsy-proven cancers (interval cancers were only partially available, therefore false negatives beyond those detected in the real screening workflow could not be assessed). In the proposed strategy, computer-aided detection (CAD) acts as a reader (CR), recalling women with an AI-CMS greater than a predefined threshold (ranging from 5 to 25%). If the first radiologist (HR1) disagrees with CR, the case goes to a second radiologist (HR2) and, in case of human disagreement, to a third radiologist (HR3). For each threshold, final recall rate (RR), cancer detection rate (CDR), number of detected cancers (DC), predictive positive value (PPV) of recalls, false positive rate (FPR), human reading workload, and economic impact were estimated.</p><p><strong>Results: </strong>At AI-CMS thresholds of 5%, 8%, 10%, 15%, 20%, and 25%, human workload decrease ranged from 13.4% to 36.1%. 
The final RR decreased between 4.3% and 4.0%, slightly lower than the current 4.4% with human double reading. The PPV ranged from 12.6% to 13.3%, higher than the current PPV of 12.2%. The FPR ranged from 3.8% to 3.5%, down from the current 3.9%. With thresholds up to 5%, no true positive cases were missed, maintaining the CDR of 5.4‰ of those detected by current double reading. Considering CAD payback periods of either 6 or 8 years, financial savings from our strategy ranged from approximately 17800 to over 590,000€.</p><p><strong>Conclusion: </strong>Integrating AI-CMS support into a standard screening scenario could substantially reduce the screen-reading workload and slightly reduce unnecessary ascertainments without affecting the cancer detection rate. This approach, although limited by its retrospective simulation design and the partial availability of interval cancer data, has also proven to be economically sustainable.</p>","PeriodicalId":20817,"journal":{"name":"Radiologia Medica","volume":" ","pages":""},"PeriodicalIF":4.8,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145605331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}