Pub Date: 2024-09-09; Epub Date: 2024-02-20; DOI: 10.4274/dir.2024.232604
Burak Koçak, Ali Keleş, Fadime Köse
Purpose: To determine how radiology, nuclear medicine, and medical imaging journals encourage and mandate the use of reporting guidelines for artificial intelligence (AI) in their author and reviewer instructions.
Methods: The primary source of journal information and associated citation data used was the Journal Citation Reports (June 2023 release for 2022 citation data; Clarivate Analytics, UK). The first- and second-quartile journals indexed in the Science Citation Index Expanded and the Emerging Sources Citation Index were included. The author and reviewer instructions were evaluated by two independent readers, followed by an additional reader for consensus, with the assistance of automatic annotation. Encouragement and submission requirements were systematically analyzed. The reporting guidelines were grouped as AI-specific, related to modeling, and unrelated to modeling.
Results: Out of 102 journals, 98 were included in this study, and all of them had author instructions. Only five journals (5%) encouraged the authors to follow AI-specific reporting guidelines. Among these, three required a filled-out checklist. Reviewer instructions were found in 16 journals (16%), among which one journal (6%) encouraged the reviewers to follow AI-specific reporting guidelines without submission requirements. The proportions of author and reviewer encouragement for AI-specific reporting guidelines were statistically significantly lower compared with those for other types of guidelines (P < 0.05 for all).
Conclusion: The findings indicate that, compared with guidelines related to modeling and unrelated to modeling, AI-specific guidelines are not commonly encouraged or mandated (i.e., by requiring a filled-out checklist) by these journals, leaving substantial room for improvement. This meta-research study aims to raise the imaging community's awareness of AI reporting guidelines and to prompt large-scale group efforts by all stakeholders to make AI research less wasteful.
Clinical significance: This meta-research highlights the need for improved encouragement of AI-specific guidelines in radiology, nuclear medicine, and medical imaging journals. This can potentially foster greater awareness among the AI community and motivate various stakeholders to collaborate to promote more efficient and responsible AI research reporting practices.
{"title":"Meta-research on reporting guidelines for artificial intelligence: are authors and reviewers encouraged enough in radiology, nuclear medicine, and medical imaging journals?","authors":"Burak Koçak, Ali Keleş, Fadime Köse","doi":"10.4274/dir.2024.232604","DOIUrl":"10.4274/dir.2024.232604","url":null,"abstract":"<p><strong>Purpose: </strong>To determine how radiology, nuclear medicine, and medical imaging journals encourage and mandate the use of reporting guidelines for artificial intelligence (AI) in their author and reviewer instructions.</p><p><strong>Methods: </strong>The primary source of journal information and associated citation data used was the Journal Citation Reports (June 2023 release for 2022 citation data; Clarivate Analytics, UK). The first- and second-quartile journals indexed in the Science Citation Index Expanded and the Emerging Sources Citation Index were included. The author and reviewer instructions were evaluated by two independent readers, followed by an additional reader for consensus, with the assistance of automatic annotation. Encouragement and submission requirements were systematically analyzed. The reporting guidelines were grouped as AI-specific, related to modeling, and unrelated to modeling.</p><p><strong>Results: </strong>Out of 102 journals, 98 were included in this study, and all of them had author instructions. Only five journals (5%) encouraged the authors to follow AI-specific reporting guidelines. Among these, three required a filled-out checklist. Reviewer instructions were found in 16 journals (16%), among which one journal (6%) encouraged the reviewers to follow AI-specific reporting guidelines without submission requirements. 
The proportions of author and reviewer encouragement for AI-specific reporting guidelines were statistically significantly lower compared with those for other types of guidelines (<i>P</i> < 0.05 for all).</p><p><strong>Conclusion: </strong>The findings indicate that AI-specific guidelines are not commonly encouraged and mandated (i.e., requiring a filled-out checklist) by these journals, compared with guidelines related to modeling and unrelated to modeling, leaving vast space for improvement. This meta-research study hopes to contribute to the awareness of the imaging community for AI reporting guidelines and ignite large-scale group efforts by all stakeholders, making AI research less wasteful.</p><p><strong>Clinical significance: </strong>This meta-research highlights the need for improved encouragement of AI-specific guidelines in radiology, nuclear medicine, and medical imaging journals. This can potentially foster greater awareness among the AI community and motivate various stakeholders to collaborate to promote more efficient and responsible AI research reporting practices.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":"291-298"},"PeriodicalIF":1.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590734/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139905258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-09; Epub Date: 2024-06-05; DOI: 10.4274/dir.2024.242825
Richard J Fagan, Dane Eskildsen, Tara Catanzano, Rachel Stanietzky, Serageldin Kamel, Mohamed Eltaher, Khaled M Elsayes
Burnout is a widespread issue among physicians, including radiologists and radiology trainees. Long hours, isolation, and high stress levels contribute to healthcare workers experiencing a substantially higher rate of burnout than other professionals. Resident physicians, continuously exposed to stressors such as new clinical situations and performance feedback, are particularly susceptible. Mentorship has proven to be an effective strategy in mitigating burnout. Various mentorship delivery models exist, all aiming to have mentors serve as role models to mentees, thereby alleviating stress and anxiety. Physician groups and healthcare enterprises have actively implemented these programs, recognizing them as both successful and cost-effective. This article explores different mentorship models, their implementation processes, and the effectiveness of these programs as a standard component of academic departments.
{"title":"Burnout and the role of mentorship for radiology trainees and early career radiologists","authors":"Richard J Fagan, Dane Eskildsen, Tara Catanzano, Rachel Stanietzky, Serageldin Kamel, Mohamed Eltaher, Khaled M Elsayes","doi":"10.4274/dir.2024.242825","DOIUrl":"10.4274/dir.2024.242825","url":null,"abstract":"<p><p>Burnout is a widespread issue among physicians, including radiologists and radiology trainees. Long hours, isolation, and substantial stress levels contribute to healthcare workers experiencing a substantially higher rate of burnout compared with other professionals. Resident physicians, continuously exposed to stressors such as new clinical situations and performance feedback, are particularly susceptible. Mentorship has proven to be an effective strategy in mitigating burnout. Various mentorship delivery models exist, all aiming to have mentors serve as role models to mentees, thereby alleviating stress and anxiety. Physician groups and healthcare enterprises have actively implemented these programs, recognizing them as both successful and cost-effective. This article explores different mentorship models, their implementation processes, and the effectiveness of these programs as a standard component of academic departments.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":"313-317"},"PeriodicalIF":1.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590732/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141247644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evrim Özmen, Hande Özen Atalay, Evren Uzer, Mert Veznikli
Purpose: This study aimed to evaluate the validity of two artificial intelligence (AI)-based bone age assessment programs, BoneXpert and VUNO Med-Bone Age (VUNO), compared with manual assessments using the Greulich-Pyle method in Turkish children.
Methods: This study included a cohort of 292 pediatric cases aged 1 to 15 years, with an equal number of cases and an equal gender distribution in each age group. Two radiologists, who were unaware of the bone age determined by AI, independently evaluated the bone age. The statistical analysis used the intraclass correlation coefficient (ICC) to measure the level of agreement between the manual and AI-based assessments.
Results: The ICCs for the two radiologists' manual measurements indicated almost perfect agreement. When all cases were analyzed regardless of gender and age group, nearly perfect positive agreement was observed between the manual and software measurements. When bone age calculations were analyzed separately for girls and boys, there was no statistically significant difference between the two AI-based methods for boys; for girls, however, the ICCs were 0.990 for VUNO and 0.982 for BoneXpert, and this difference of 0.008 was significant (z = 2.528, P = 0.012). Accordingly, VUNO showed higher agreement with manual measurements than BoneXpert. In the prepubescent group, the difference between the two software packages in their agreement with manual measurements was much more pronounced in girls than in boys. After the age of 8 years for girls and 9 years for boys, the agreement between manual measurements and both AI software packages was equal.
Conclusion: Both BoneXpert and VUNO showed high validity in assessing bone age. Furthermore, VUNO has a statistically higher correlation with manual assessment in prepubertal girls. These results suggest that VUNO may be slightly more effective in determining bone age, indicating its potential as a highly reliable tool for bone age assessment in Turkish children.
Clinical significance: Investigating the most suitable AI program for the Turkish population could be clinically significant.
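The z statistic quoted in the Results can be related to a two-sided P value under a standard normal reference distribution. The sketch below only illustrates that conversion; the abstract does not state which test produced z, and the function name `two_sided_p_from_z` is a hypothetical helper, not part of the study.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided P value for a z statistic, assuming Z ~ N(0, 1)."""
    # P(|Z| >= |z|) equals erfc(|z| / sqrt(2)) for a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2.0))


# z value quoted in the abstract for the VUNO vs. BoneXpert comparison in girls.
p_girls = two_sided_p_from_z(2.528)
print(f"two-sided P = {p_girls:.4f}")
```

For z = 2.528 the result falls between 0.011 and 0.012, which agrees with the reported significance at the 0.05 level.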
{"title":"A comparison of two artificial intelligence-based methods for assessing bone age in Turkish children: BoneXpert and VUNO Med-Bone Age.","authors":"Evrim Özmen, Hande Özen Atalay, Evren Uzer, Mert Veznikli","doi":"10.4274/dir.2024.242790","DOIUrl":"https://doi.org/10.4274/dir.2024.242790","url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to evaluate the validity of two artificial intelligence (AI)-based bone age assessment programs, BoneXpert and VUNO Med-Bone Age (VUNO), compared with manual assessments using the Greulich-Pyle method in Turkish children.</p><p><strong>Methods: </strong>This study included a cohort of 292 pediatric cases, ranging in age from 1 to 15 years with an equal gender and number distribution in each age group. Two radiologists, who were unaware of the bone age determined by AI, independently evaluated the bone age. The statistical study involved using the intraclass correlation coefficient (ICC) to measure the level of agreement between the manual and AI-based assessments.</p><p><strong>Results: </strong>The ICC coefficients for the agreement between the manual measurements of two radiologists indicate almost perfect agreement. When all cases, regardless of gender and age group, were analyzed, a nearly perfect positive agreement was observed between the manual and software measurements. When bone age calculations were separated and analyzed separately for girls and boys, there was no statistically significant difference between the two AI-based methods for boys; however, ICC coefficients of 0.990 and 0.982 were calculated for VUNO and BoneXpert, respectively, and this difference of 0.008 was significant (<i>z</i> = 2.528, <i>P</i> = 0.012) for girls. Accordingly, VUNO showed higher agreement with manual measurements compared with BoneXpert. 
The difference between the agreements demonstrated by the two software packages with manual measurements in the prepubescent group was much more pronounced in girls compared with boys. After the age of 8 years for girls and 9 years for boys, the agreement between manual measurements and both AI software packages was equal.</p><p><strong>Conclusion: </strong>Both BoneXpert and VUNO showed high validity in assessing bone age. Furthermore, VUNO has a statistically higher correlation with manual assessment in prepubertal girls. These results suggest that VUNO may be slightly more effective in determining bone age, indicating its potential as a highly reliable tool for bone age assessment in Turkish children.</p><p><strong>Clinical significance: </strong>Investigating the most suitable AI program for the Turkish population could be clinically significant.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142105422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semra Delibalta, Barış Genç, Meltem Ceyhan Bilgici, Kerim Aslan
Purpose: To evaluate the diagnostic efficacy of computed diffusion-weighted imaging (DWI) in pediatric posterior fossa tumors generated using high b-values.
Methods: This retrospective study included 32 pediatric patients who had undergone brain magnetic resonance imaging for a posterior fossa tumor between January 2016 and January 2022. The DWIs were evaluated for each patient by two blinded radiologists. The computed DWI (cDWI) was mathematically derived with a mono-exponential model from images acquired at b = 0 and 1,000 s/mm² for high b-values of 1,500, 2,000, 3,000, and 5,000 s/mm². The posterior fossa tumors were divided into two groups, low grade and high grade, and the tumor/thalamus signal intensity (SI) ratios were compared. The Mann-Whitney U test and receiver operating characteristic (ROC) curves were used to compare the diagnostic performance of the acquired DWI (DWI1000), apparent diffusion coefficient (ADC)1000 maps, and cDWI (cDWI1500, cDWI2000, cDWI3000, and cDWI5000).
Results: The comparison of the two tumor groups revealed that the tumor/thalamus SI ratio on the DWI1000 and cDWI (cDWI1500, cDWI2000, cDWI3000, and cDWI5000) was statistically significantly higher in high-grade tumors (P < 0.001). In the ROC curve analysis, higher sensitivity and specificity were detected in the cDWI1500, cDWI2000, cDWI3000, and ADC1000 maps (100%, 90.90%) compared with the DWI1000 (80%, 81.80%). cDWI3000 had the highest area under the curve (AUC) value compared with other parameters (AUC: 0.976).
Conclusion: cDWI generated using high b-values was successful in differentiating between low-grade and high-grade posterior fossa tumors without increasing imaging time.
Clinical significance: cDWI created using high b-values can provide additional information about tumor grade in pediatric posterior fossa tumors without requiring additional imaging time.
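The mono-exponential model named in the Methods can be sketched for a single voxel as follows. This is a minimal illustration, not the authors' processing pipeline: it assumes the standard relation S(b) = S0 · exp(−b · ADC), and the function names and example signal values are hypothetical.

```python
import math


def adc_from_two_points(s0: float, s_b: float, b: float = 1000.0) -> float:
    """ADC in mm^2/s from the signals at b = 0 and at b (s/mm^2)."""
    return math.log(s0 / s_b) / b


def computed_dwi(s0: float, adc: float, b_target: float) -> float:
    """Extrapolated signal at b_target using S(b) = S0 * exp(-b * ADC)."""
    return s0 * math.exp(-b_target * adc)


def si_ratio(tumor_si: float, thalamus_si: float) -> float:
    """Tumor/thalamus signal intensity ratio."""
    return tumor_si / thalamus_si


# Hypothetical voxel: signal 1000 at b = 0 and 400 at b = 1,000 s/mm^2.
adc = adc_from_two_points(1000.0, 400.0)
for b in (1500, 2000, 3000, 5000):
    print(b, round(computed_dwi(1000.0, adc, b), 2))
```

Because the high b-value images are calculated from the two acquired images, no extra acquisition is involved, which is the point the abstract makes about imaging time.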
{"title":"Feasibility study of computed high b-value diffusion-weighted magnetic resonance imaging for pediatric posterior fossa tumors.","authors":"Semra Delibalta, Barış Genç, Meltem Ceyhan Bilgici, Kerim Aslan","doi":"10.4274/dir.2024.242720","DOIUrl":"https://doi.org/10.4274/dir.2024.242720","url":null,"abstract":"<p><strong>Purpose: </strong>To evaluate the diagnostic efficacy of computed diffusion-weighted imaging (DWI) in pediatric posterior fossa tumors generated using high b-values.</p><p><strong>Methods: </strong>We retrospectively performed our study on 32 pediatric patients who had undergone brain magnetic resonance imaging for a posterior fossa tumor between January 2016 and January 2022. The DWIs were evaluated for each patient by two blinded radiologists. The computed DWI (cDWI) was mathematically derived using a mono-exponential model from images with b = 0 and 1,000 s/mm<sup>2</sup> and high b-values of 1,500, 2,000, 3,000, and 5,000 s/mm<sup>2</sup>. The posterior fossa tumors were divided into two groups, low grade and high grade, and the tumor/thalamus signal intensity (SI) ratios were compared. The Mann-Whitney U test and receiver operating characteristic (ROC) curves were used to compare the diagnostic performance of the acquired DWI (DWI<sub>1000</sub>), apparent diffusion coefficient (ADC)<sub>1000</sub> maps, and cDWI (cDWI1500, cDWI<sub>2000</sub>, cDWI<sub>3000</sub>, and cDWI<sub>5000</sub>).</p><p><strong>Results: </strong>The comparison of the two tumor groups revealed that the tumor/thalamus SI ratio on the DWI<sub>1000</sub> and cDWI (cDWI1500, cDWI<sub>2000</sub>, cDWI<sub>3000</sub>, and cDWI<sub>5000</sub>) was statistically significantly higher in high-grade tumors (<i>P</i> < 0.001). In the ROC curve analysis, higher sensitivity and specificity were detected in the cDWI1500, cDWI<sub>2000</sub>, cDWI<sub>3000</sub>, and ADC<sub>1000</sub> maps (100%, 90.90%) compared with the DWI<sub>1000</sub> (80%, 81.80%). 
cDWI<sub>3000</sub> had the highest area under the curve (AUC) value compared with other parameters (AUC: 0.976).</p><p><strong>Conclusion: </strong>cDWI generated using high b-values was successful in differentiating between low-grade and high-grade posterior fossa tumors without increasing imaging time.</p><p><strong>Clinical significance: </strong>cDWI created using high b-values can provide additional information about tumor grade in pediatric posterior fossa tumors without requiring additional imaging time.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142105425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyza Nur Kuzan, İsmail Meşe, Servan Yaşar, Taha Yusuf Kuzan
Purpose: Stroke is a neurological emergency requiring rapid, accurate diagnosis to prevent severe consequences. Early diagnosis is crucial for reducing morbidity and mortality. Artificial intelligence (AI) diagnosis support tools, such as Chat Generative Pre-trained Transformer (ChatGPT), offer rapid diagnostic advantages. This study assesses ChatGPT's accuracy in interpreting diffusion-weighted imaging (DWI) for acute stroke diagnosis.
Methods: A retrospective analysis was conducted to identify the presence of stroke using DWI and apparent diffusion coefficient (ADC) map images. Patients aged >18 years who exhibited diffusion restriction and had a clinically explainable condition were included in the study. Patients with artifacts that affected image homogeneity, accuracy, and clarity, as well as those who had undergone previous surgery or had a history of stroke, were excluded from the study. ChatGPT was asked four consecutive questions regarding the identification of the magnetic resonance imaging (MRI) sequence, the demonstration of diffusion restriction on the ADC map after sequence recognition, and the identification of hemispheres and specific lobes. Each question was repeated 10 times to ensure consistency. Senior radiologists subsequently verified the accuracy of ChatGPT's responses, classifying them as either correct or incorrect. A response was considered incorrect if it was partially correct or suggested multiple answers. These responses were systematically recorded. We also recorded non-responses from ChatGPT-4V when it failed to provide an answer to a query. We assessed ChatGPT-4V's performance by calculating the number and percentage of correct responses, incorrect responses, and non-responses across all images and questions, a metric known as "accuracy." ChatGPT-4V was considered successful if it answered ≥80% of the examples correctly.
Results: A total of 530 diffusion MRI images, of which 266 were stroke images and 264 were normal, were evaluated in the study. For the initial query identifying MRI sequence type, ChatGPT-4V's accuracy was 88.3% for stroke and 90.1% for normal images. For detecting diffusion restriction, ChatGPT-4V had an accuracy of 79.5% for stroke images, with a 15% false positive rate for normal images. Regarding identifying the brain or cerebellar hemisphere involved, ChatGPT-4V correctly identified the hemisphere in 26.2% of stroke images. For identifying the specific brain lobe or cerebellar area affected, ChatGPT-4V had a 20.4% accuracy for stroke images. The diagnostic sensitivity of ChatGPT-4V in acute stroke was found to be 79.57%, with a specificity of 84.87%, a positive predictive value of 83.86%, a negative predictive value of 80.80%, and a diagnostic odds ratio of 21.86.
Conclusion: Despite limitations, ChatGPT shows potential as a supportive tool for healthcare professionals in interpreting diffusion examinations in
{"title":"A retrospective evaluation of the potential of ChatGPT in the accurate diagnosis of acute stroke.","authors":"Beyza Nur Kuzan, İsmail Meşe, Servan Yaşar, Taha Yusuf Kuzan","doi":"10.4274/dir.2024.242892","DOIUrl":"https://doi.org/10.4274/dir.2024.242892","url":null,"abstract":"<p><strong>Purpose: </strong>Stroke is a neurological emergency requiring rapid, accurate diagnosis to prevent severe consequences. Early diagnosis is crucial for reducing morbidity and mortality. Artificial intelligence (AI) diagnosis support tools, such as Chat Generative Pre-trained Transformer (ChatGPT), offer rapid diagnostic advantages. This study assesses ChatGPT's accuracy in interpreting diffusion-weighted imaging (DWI) for acute stroke diagnosis.</p><p><strong>Methods: </strong>A retrospective analysis was conducted to identify the presence of stroke using DWI and apparent diffusion coefficient (ADC) map images. Patients aged >18 years who exhibited diffusion restriction and had a clinically explainable condition were included in the study. Patients with artifacts that affected image homogeneity, accuracy, and clarity, as well as those who had undergone previous surgery or had a history of stroke, were excluded from the study. ChatGPT was asked four consecutive questions regarding the identification of the magnetic resonance imaging (MRI) sequence, the demonstration of diffusion restriction on the ADC map after sequence recognition, and the identification of hemispheres and specific lobes. Each question was repeated 10 times to ensure consistency. Senior radiologists subsequently verified the accuracy of ChatGPT's responses, classifying them as either correct or incorrect. We assumed a response to be incorrect if it was partially correct or suggested multiple answers. These responses were systematically recorded. We also recorded non-responses from ChatGPT-4V when it failed to provide an answer to a query. 
We assessed ChatGPT-4V's performance by calculating the number and percentage of correct responses, incorrect responses, and non-responses across all images and questions, a metric known as \"accuracy.\" ChatGPT-4V was considered successful if it answered ≥80% of the examples correctly.</p><p><strong>Results: </strong>A total of 530 diffusion MRI, of which 266 were stroke images and 264 were normal, were evaluated in the study. For the initial query identifying MRI sequence type, ChatGPT-4V's accuracy was 88.3% for stroke and 90.1% for normal images. For detecting diffusion restriction, ChatGPT-4V had an accuracy of 79.5% for stroke images, with a 15% false positive rate for normal images. Regarding identifying the brain or cerebellar hemisphere involved, ChatGPT-4V correctly identified the hemisphere in 26.2% of stroke images. For identifying the specific brain lobe or cerebellar area affected, ChatGPT-4V had a 20.4% accuracy for stroke images. The diagnostic sensitivity of ChatGPT-4V in acute stroke was found to be 79.57%, with a specificity of 84.87%, a positive predictive value of 83.86%, a negative predictive value of 80.80%, and a diagnostic odds ratio of 21.86.</p><p><strong>Conclusion: </strong>Despite limitations, ChatGPT shows potential as a supportive tool for healthcare professionals in interpreting diffusion examinations in","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142105423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
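The diagnostic measures quoted in the ChatGPT stroke abstract (sensitivity, specificity, predictive values, and diagnostic odds ratio) all derive from a 2 × 2 confusion table. The sketch below shows those standard definitions. The counts passed to `diagnostic_metrics` are hypothetical, because the abstract reports percentages only, and both function names are illustrative.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2 x 2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "dor": (tp * tn) / (fp * fn),   # diagnostic odds ratio
    }


def dor_from_rates(sensitivity: float, specificity: float) -> float:
    """Diagnostic odds ratio expressed through sensitivity and specificity."""
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


# Hypothetical counts, chosen only to exercise the formulas.
print(diagnostic_metrics(tp=80, fp=15, fn=20, tn=85))

# Rates quoted in the abstract (79.57% sensitivity, 84.87% specificity).
print(round(dor_from_rates(0.7957, 0.8487), 2))
```

Feeding the rounded sensitivity and specificity from the abstract into `dor_from_rates` gives a value close to the reported diagnostic odds ratio of 21.86, with the small difference attributable to rounding of the inputs.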
Saman Fouladirad, Jasper Yoo, Behrang Homayoon, Jun Wang, Pedro Lourenço
This study assesses the efficacy of the quadratus lumborum block (QLB) in the management of procedural and periprocedural pain associated with small renal mass cryoablation. To the best of our knowledge, this is the first study that examines the use of QLB for pain management during percutaneous cryoablation of renal cell carcinoma (RCC). A single-center retrospective review was conducted for patients who underwent cryoablation for RCC with QLB between October 2020 and October 2021. The primary study endpoints were the total dose of procedural conscious sedation and the postprocedural analgesia administered. Technical success in cryoablation was achieved in every case. No patients required additional analgesic during or after the procedure, and no complications resulted from the use of the QLB. The QLB procedure appears to be an effective locoregional block for the management of procedural and periprocedural pain associated with renal mass cryoablation.
{"title":"Quadratus lumborum block for procedural and postprocedural analgesia in renal cell carcinoma percutaneous cryoablation.","authors":"Saman Fouladirad, Jasper Yoo, Behrang Homayoon, Jun Wang, Pedro Lourenço","doi":"10.4274/dir.2024.232100","DOIUrl":"https://doi.org/10.4274/dir.2024.232100","url":null,"abstract":"<p><p>This study assesses the efficacy of the quadratus lumborum block (QLB) in the management of procedural and periprocedural pain associated with small renal mass cryoablation. To the best of our knowledge, this is the first study that examines the use of QLB for pain management during percutaneous cryoablation of renal cell carcinoma (RCC). A single-center retrospective review was conducted for patients who underwent cryoablation for RCC with QLB between October 2020 and October 2021. The primary study endpoint included a total dose of procedural conscious sedation and administered, postprocedural analgesia. Technical success in cryoablation was achieved in every case. No patients required additional analgesic during or after the procedure, and no complications resulted from the use of the QLB. The QLB procedure appears to be an effective locoregional block for the management of procedural and periprocedural pain associated with renal mass cryoablation.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142105426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As an umbrella term, artificial intelligence (AI) covers machine learning and deep learning. This review aimed to elaborate on these terms to act as a primer for radiologists to learn more about the algorithms commonly used in musculoskeletal radiology. It also aimed to familiarize them with the common practices and issues in the use of AI in this domain.
{"title":"Artificial intelligence in musculoskeletal applications: a primer for radiologists.","authors":"Michelle W Tong, Jiamin Zhou, Zehra Akkaya, Sharmila Majumdar, Rupsa Bhattacharjee","doi":"10.4274/dir.2024.242830","DOIUrl":"https://doi.org/10.4274/dir.2024.242830","url":null,"abstract":"<p><p>As an umbrella term, artificial intelligence (AI) covers machine learning and deep learning. This review aimed to elaborate on these terms to act as a primer for radiologists to learn more about the algorithms commonly used in musculoskeletal radiology. It also aimed to familiarize them with the common practices and issues in the use of AI in this domain.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141999601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: We investigated the diagnostic accuracy of simplified intravoxel incoherent motion (IVIM) imaging for detecting synovial inflammation in the sacroiliac joint (SIJ) in a population with active sacroiliitis.
Methods: In accordance with the Assessment of Spondyloarthritis International Society criteria, 86 SIJs of 46 patients with active sacroiliitis were included in this retrospective study conducted between November 2020 and January 2022. Based on T1-weighted post-gadolinium images, the SIJs were divided into two groups: synovial inflammation positive (SIP) (n = 28) and synovial inflammation negative (SIN) (n = 58). Synovial areas in the SIJ space were independently and blindly reviewed for the presence of inflammation by two radiologists with differing levels of expertise in radiology. Using four b values, apparent diffusion coefficient (ADC) = ADC (0, 800) and the simplified 3T IVIM method parameters true diffusion coefficient (D1) = ADC (50, 800), D = ADC (400, 800), f1 = f (0, 50, 800), f2 = f (0, 400, 800), pseudodiffusion coefficient (D*) = D* (0, 50, 400, 800), ADClow = ADC (0, 50), and ADCdiff = ADClow - D were generated voxel by voxel for each patient. The IVIM and ADC parameters at the SIN and SIP joints were compared.
Results: The D parameter was significantly increased in SIP areas (1.23 ± 0.34 × 10⁻³ mm²/s) compared with SIN areas (1.02 ± 0.16 × 10⁻³ mm²/s) (P = 0.004). Conversely, the D* parameter was significantly decreased in SIP areas (21.78 ± 3.77 × 10⁻³ mm²/s) compared with SIN areas (16.19 ± 4.58 × 10⁻³ mm²/s) (P < 0.001). When the optimal cut-off value of 1.11 × 10⁻³ mm²/s was selected, the sensitivity for the D value was 71% and the specificity was 72% [area under the curve (AUC): 0.716]. When the optimal cut-off value of 21.06 × 10⁻³ mm²/s was selected, the sensitivity for the D* value was 78.6%, and the specificity was 79.3% (AUC: 0.829). The interclass correlation coefficient was excellent for f1, f2, D*, D, and ADCdiff, good for ADClow and D1, but reasonable for ADC.
Conclusion: The presence of synovial inflammation in the SIJ can be evaluated with high sensitivity and specificity using only four b values through the simplified IVIM method without the need for a contrast agent.
Clinical significance: IVIM imaging is a technique that allows us to gain insights into tissue perfusion without the administration of contrast agents, utilizing diffusion-weighted images. In this study, for the first time, we demonstrated the potential of detecting synovial inflammation in the SIJ using IVIM, specifically through the pseudodiffusion (D*) parameter, without
{"title":"Detection of synovial inflammation in the sacroiliac joint space through intravoxel incoherent motion imaging: an alternative to contrast agents.","authors":"Murat Ağırlar, Barış Genç, Aysu Başak Özbalcı","doi":"10.4274/dir.2024.242749","DOIUrl":"https://doi.org/10.4274/dir.2024.242749","url":null,"abstract":"<p><strong>Purpose: </strong>We investigated the diagnostic accuracy of simplified intravoxel incoherent motion (IVIM) imaging for detecting synovial inflammation in the sacroiliac joint (SIJ) in a population with active sacroiliitis.</p><p><strong>Methods: </strong>In accordance with the Assessment of Spondyloarthritis International Society criteria, 86 SIJs of 46 patients with active sacroiliitis were included in this retrospective study conducted between November 2020 and January 2022. Based on T1-weighted post-gadolinium images, the SIJs were divided into two groups: synovial inflammation positive (SIP) (n = 28) and synovial inflammation negative (SIN) (n = 58). Synovial areas in the SIJ space were independently and blindly reviewed for the presence of inflammation by two radiologists with differing levels of expertise in radiology. Using four b values, apparent diffusion coefficient (ADC)= ADC (0, 800) and the simplified 3T IVIM method parameters true diffusion coefficient (D<sub>1</sub>)= ADC (50, 800), D= ADC (400, 800), f<sub>1</sub>= f (0, 50, 800), f<sub>2</sub>= f (0, 400, 800), pseudodiffusion coefficient (D*)= D* (0, 50, 400, 800), ADC<sub>low</sub> = ADC (0, 50), and ADC<sub>diff</sub>= ADC<sub>low</sub> - D were generated voxel by voxel for each patient. The IVIM and ADC parameters at the SIN and SIP joints were compared.</p><p><strong>Results: </strong>The D parameter was significantly increased in SIP areas (1.23 ± 0.34 × 10<sup>-3</sup> mm<sup>2</sup>/s) compared with SIN areas (1.02 ± 0.16 × 10<sup>-3</sup> mm<sup>2</sup>/s) (<i>P</i> = 0.004). 
Conversely, the D* parameter was significantly decreased in SIP areas (21.78 ± 3.77 × 10<sup>-3</sup> mm<sup>2</sup>/s) compared with SIN areas (16.19 ± 4.58 × 10<sup>-3</sup> mm<sup>2</sup>/s) (<i>P</i> < 0.001). When the optimal cut-off value of 1.11 × 10<sup>-3</sup> mm<sup>2</sup>/s was selected, the sensitivity for the D value was 71% and the specificity was 72% [area under the curve (AUC): 0.716]. When the optimal cut-off value of 21.06 × 10<sup>-3</sup> mm<sup>2</sup>/s was selected, the sensitivity for the D* value was 78.6%, and the specificity was 79.3% (AUC: 0.829). The intraclass correlation coefficient was excellent for f<sub>1</sub>, f<sub>2</sub>, D*, D, and ADC<sub>diff</sub>, good for ADC<sub>low</sub> and D<sub>1</sub>, but reasonable for ADC.</p><p><strong>Conclusion: </strong>The presence of synovial inflammation in the SIJ can be evaluated with high sensitivity and specificity using only four b values through the simplified IVIM method without the need for a contrast agent.</p><p><strong>Clinical significance: </strong>IVIM imaging is a technique that allows us to gain insights into tissue perfusion without the administration of contrast agents, utilizing diffusion-weighted images. 
In this study, for the first time, we demonstrated the potential of detecting synovial inflammation in the SIJ using IVIM, specifically through the pseudodiffusion (D*) parameter, without ","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141999602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
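The sacroiliac IVIM abstract above defines its parameters only by the b values they use, for example D = ADC (400, 800) and f₂ = f (0, 400, 800). As a rough sketch of how such quantities are commonly computed, the Python below assumes the standard two-point mono-exponential ADC and a perfusion fraction obtained by extrapolating the high-b fit back to b = 0. These formulas, the function names, and the synthetic voxel are illustrative assumptions, not taken from the study; D* is left out because the abstract gives no hint of how it was derived.

```python
import math


def adc(s_lo, s_hi, b_lo, b_hi):
    """Two-point ADC (mm^2/s) from signals at b_lo < b_hi (s/mm^2)."""
    return math.log(s_lo / s_hi) / (b_hi - b_lo)


def perfusion_fraction(s0, s_mid, b_mid, d):
    """Assumed form: f = 1 - S(b_mid)/S(0) * exp(b_mid * d)."""
    return 1.0 - (s_mid / s0) * math.exp(b_mid * d)


def simplified_ivim(signals):
    """signals maps each b value (0, 50, 400, 800) to a voxel signal."""
    s0, s50, s400, s800 = (signals[b] for b in (0, 50, 400, 800))
    d = adc(s400, s800, 400, 800)      # D   = ADC(400, 800)
    d1 = adc(s50, s800, 50, 800)       # D1  = ADC(50, 800)
    adc_low = adc(s0, s50, 0, 50)      # ADC_low = ADC(0, 50)
    return {
        "ADC": adc(s0, s800, 0, 800),
        "D": d,
        "D1": d1,
        "f1": perfusion_fraction(s0, s50, 50, d1),   # f(0, 50, 800)
        "f2": perfusion_fraction(s0, s400, 400, d),  # f(0, 400, 800)
        "ADC_low": adc_low,
        "ADC_diff": adc_low - d,
    }


def biexponential(b, s0=1000.0, f=0.10, d=1.0e-3, d_star=20.0e-3):
    """Synthetic IVIM signal; all parameter values here are hypothetical."""
    return s0 * ((1 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))


voxel = {b: biexponential(b) for b in (0, 50, 400, 800)}
params = simplified_ivim(voxel)
for name, value in params.items():
    print(f"{name:8s} {value:.6f}")
```

On this synthetic voxel the high-b estimate D recovers the simulated diffusion coefficient closely, while ADC and ADC_low come out larger because the low-b signal still contains the perfusion component.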
Purpose: To evaluate the performance of Microsoft Bing with ChatGPT-4 technology in analyzing abdominal computed tomography (CT) and magnetic resonance images (MRI).
Methods: A comparative and descriptive analysis was conducted using the institutional picture archiving and communication systems. A total of 80 abdominal images (44 CT, 36 MRI) that showed various entities affecting the abdominal structures were included. Microsoft Bing's interpretations were compared with the impressions of radiologists in terms of recognition of the imaging modality, identification of the imaging planes (axial, coronal, and sagittal), sequences (in the case of MRI), contrast media administration, correct identification of the anatomical region depicted in the image, and detection of abnormalities.
Results: Microsoft Bing detected that the images were CT scans with 95.4% accuracy (42/44) and that the images were MRI scans with 86.1% accuracy (31/36). However, it failed to detect one CT image (2.3%) and misidentified another CT image as an MRI scan (2.3%). It also misidentified four MRI scans as CT images (11.1%) and one as an X-ray (2.7%). Bing achieved an 83.75% success rate in correctly identifying abdominal regions, with 90% accuracy for CT scans (40/44) and 77.7% for MRI scans (28/36). Concerning the identification of imaging planes, Bing achieved a success rate of 95.4% for CT images and 83.3% for MRI. Regarding the identification of MRI sequences (T1-weighted and T2-weighted), the success rate was 68.75%. In the identification of the use of contrast media for CT scans, the success rate was 64.2%. Bing detected abnormalities in 35% of the images but achieved a correct interpretation rate of 10.7% for the definite diagnosis.
Conclusion: While Microsoft Bing, leveraging ChatGPT-4 technology, demonstrates proficiency in basic task identification on abdominal CT and MRI, its inability to reliably interpret abnormalities highlights the need for continued refinement to enhance its clinical applicability.
Clinical significance: The contribution of large language models (LLMs) to the diagnostic process in radiology is still being explored. However, with a comprehensive understanding of their capabilities and limitations, LLMs can significantly support radiologists during diagnosis and improve the overall efficiency of abdominal radiology practices. Acknowledging the limitations of current studies related to ChatGPT in this field, our work provides a foundation for future clinical research, paving the way for more integrated and effective diagnostic tools.
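The per-modality rates in the Bing study can be recomputed from the counts given in the Results. The short Python sketch below does this for the two tasks where counts are stated; the helper name `rate` and the dictionary layout are illustrative choices. It rounds to one decimal place, so the last digit can differ slightly from the quoted figures, which appear to be truncated or rounded differently.

```python
def rate(correct, total):
    """Percentage of correct answers, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Counts as stated in the Results (correct, total) per modality.
modality_recognition = {"CT": (42, 44), "MRI": (31, 36)}
region_identification = {"CT": (40, 44), "MRI": (28, 36)}

tasks = (
    ("modality", modality_recognition),
    ("region", region_identification),
)
for task, table in tasks:
    for scan, (correct, total) in table.items():
        print(f"{task:8s} {scan:3s} {correct}/{total} = {rate(correct, total)}%")
```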
{"title":"Evaluating Microsoft Bing with ChatGPT-4 for the assessment of abdominal computed tomography and magnetic resonance images.","authors":"Alperen Elek, Duygu Doğa Ekizalioğlu, Ezgi Güler","doi":"10.4274/dir.2024.232680","DOIUrl":"https://doi.org/10.4274/dir.2024.232680","url":null,"abstract":"<p><strong>Purpose: </strong>To evaluate the performance of Microsoft Bing with ChatGPT-4 technology in analyzing abdominal computed tomography (CT) and magnetic resonance images (MRI).</p><p><strong>Methods: </strong>A comparative and descriptive analysis was conducted using the institutional picture archiving and communication systems. A total of 80 abdominal images (44 CT, 36 MRI) that showed various entities affecting the abdominal structures were included. Microsoft Bing's interpretations were compared with the impressions of radiologists in terms of recognition of the imaging modality, identification of the imaging planes (axial, coronal, and sagittal), sequences (in the case of MRI), contrast media administration, correct identification of the anatomical region depicted in the image, and detection of abnormalities.</p><p><strong>Results: </strong>Microsoft Bing detected that the images were CT scans with 95.4% accuracy (42/44) and that the images were MRI scans with 86.1% accuracy (31/36). However, it failed to detect one CT image (2.3%) and misidentified another CT image as an MRI (2.3%). On the other hand, it also misidentified four MRI as CT images (11.1%) and one as an X-ray (2.7%). Bing achieved an 83.75% success rate in correctly identifying abdominal regions, with 90% accuracy for CT scans (40/44) and 77.7% for MRI scans (28/36). Concerning the identification of imaging planes, Bing achieved a success rate of 95.4% for CT images and 83.3% for MRI. Regarding the identification of MRI sequences (T1-weighted and T2-weighted), the success rate was 68.75%. In the identification of the use of contrast media for CT scans, the success rate was 64.2%. 
Bing detected abnormalities in 35% of the images but achieved a correct interpretation rate of 10.7% for the definite diagnosis.</p><p><strong>Conclusion: </strong>While Microsoft Bing, leveraging ChatGPT-4 technology, demonstrates proficiency in basic task identification on abdominal CT and MRI, its inability to reliably interpret abnormalities highlights the need for continued refinement to enhance its clinical applicability.</p><p><strong>Clinical significance: </strong>The contribution of large language models (LLMs) to the diagnostic process in radiology is still being explored. However, with a comprehensive understanding of their capabilities and limitations, LLMs can significantly support radiologists during diagnosis and improve the overall efficiency of abdominal radiology practices. Acknowledging the limitations of current studies related to ChatGPT in this field, our work provides a foundation for future clinical research, paving the way for more integrated and effective diagnostic tools.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141999603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katelyn Gill, Sarah Aleman, Alexandra H Fairchild, Bahri Üstünsöz, Dan Laney, Alison A Smith, Hector Ferral
Purpose: To describe the experience of a single level 1 trauma center in the management of blunt splenic injuries (BSI).
Methods: This is a retrospective study with Institutional Review Board approval. The medical records of 450 patients with BSI treated between January 2016 and December 2022 were reviewed. Seventy-two patients were treated with splenic artery embolization (SAE), met the study criteria, and were eligible for data analysis. Spleen injuries were graded in accordance with the American Association for the Surgery of Trauma Organ Injury Scale. Univariate data analysis was performed, with P < 0.05 considered statistically significant.
Results: The splenic salvage rate was 90.3% (65/72). Baseline demographics were similar between the groups (P > 0.05). Distal embolization with Gelfoam® had similar rates of splenic salvage to proximal embolization with coils (90% vs. 94.1%, P > 0.05). There was no significant difference in the rate of splenic infarction between distal embolization with Gelfoam® (20%, 4/20) and proximal embolization with coils (17.6%, 3/17) (P > 0.05). There was no significant difference in procedure length (68 vs. 75.8 min) or splenic salvage rate (88.5% vs. 92.1%) between proximal and distal embolization (P > 0.05). There was no significant difference in procedure length (69.1 vs. 73.6 min) or splenic salvage rate (93.1% vs. 86.4%) between Gelfoam® and coil embolization (P > 0.05). Combined proximal and distal embolization was associated with a higher rate of splenic abscess formation (25%, 2/8) when compared with proximal (0%, 0/26) or distal (0%, 0/38) embolization alone (P = 0.0003). The rates of asymptomatic and symptomatic splenic infarction were significantly higher in patients embolized at combined proximal and distal locations (P = 0.04 and P = 0.01, respectively).
Conclusion: The endovascular management of BSI is safe and effective. The overall splenic salvage rate was 90.3%. Distal embolization with Gelfoam® was not associated with higher rates of splenic infarction when compared with proximal embolization with coils. Combined proximal and distal embolization was associated with a higher incidence of splenic infarction and splenic abscess formation.
Clinical significance: Distal splenic embolization with Gelfoam® is safe and may be beneficial in the setting of blunt splenic trauma.
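The splenic embolization abstract reports only that a univariate analysis was performed and does not name the test behind the abscess comparison. As one plausible sketch, a Pearson chi-square test on the 3 × 2 table of abscess counts (combined 2/8, proximal 0/26, distal 0/38) can be computed with the Python standard library. The choice of test is an assumption, and the function names are illustrative. With 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics package is needed.

```python
import math


def pearson_chi2(table):
    """Return (chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


def chi2_upper_tail_2dof(x):
    """Upper-tail probability of a chi-square variable with exactly 2 dof."""
    return math.exp(-x / 2.0)


# Rows: combined, proximal only, distal only. Columns: abscess, no abscess.
abscess_table = [[2, 6], [0, 26], [0, 38]]

chi2, dof = pearson_chi2(abscess_table)
p_value = chi2_upper_tail_2dof(chi2)
print(f"chi2 = {chi2:.3f}, dof = {dof}, P = {p_value:.4f}")
```

Under this assumption the result is close to the reported P = 0.0003. Several expected counts are well below 5, so an exact test would be the more cautious choice, and for a pooled 2 × 2 comparison it gives a noticeably larger P value.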
{"title":"Splenic artery embolization in the treatment of blunt splenic injury: single level 1 trauma center experience.","authors":"Katelyn Gill, Sarah Aleman, Alexandra H Fairchild, Bahri Üstünsöz, Dan Laney, Alison A Smith, Hector Ferral","doi":"10.4274/dir.2024.242789","DOIUrl":"https://doi.org/10.4274/dir.2024.242789","url":null,"abstract":"<p><strong>Purpose: </strong>To describe the experience of a single level 1 trauma center in the management of blunt splenic injuries (BSI).</p><p><strong>Methods: </strong>This is a retrospective study with Institutional Review Board approval. The medical records of 450 patients with BSI treated between January 2016 and December 2022 were reviewed. Seventy-two patients were treated with splenic artery embolization (SAE), met the study criteria, and were eligible for data analysis. Spleen injuries were graded in accordance with the American Association for the Surgery of Trauma Organ Injury Scale. Univariate data analysis was performed, with <i>P</i> < 0.05 considered statistically significant.</p><p><strong>Results: </strong>The splenic salvage rate was 90.3% (n = 65/72). Baseline demographics were similar between the groups (<i>P</i> > 0.05). Distal embolization with Gelfoam<sup>®</sup> had similar rates of splenic salvage to proximal embolization with coils (90% vs. 94.1%, <i>P</i> > 0.05). There was no significant difference in the rate of splenic infarction between distal embolization with Gelfoam<sup>®</sup> (20%, 4/20) and proximal embolization with coils (17.6%, 3/17) (<i>P</i> > 0.05). There was no significant difference in procedure length (68 vs. 75.8 min) or splenic salvage rate (88.5% vs. 92.1%) between proximal and distal embolization (<i>P</i> > 0.05). There was no significant difference in procedure length (69.1 vs. 73.6 min) or splenic salvage rate (93.1% vs. 86.4%) between Gelfoam<sup>®</sup> and coil embolization (<i>P</i> > 0.05). 
Combined proximal and distal embolization was associated with a higher rate of splenic abscess formation (25%, 2/8) when compared with proximal (0%, 0/26) or distal (0%, 0/38) embolization alone (<i>P</i> = 0.0003). The rate of asymptomatic and symptomatic splenic infarction was significantly higher in patients embolized at combined proximal and distal locations (<i>P</i> = 0.04, <i>P</i> = 0.01).</p><p><strong>Conclusion: </strong>The endovascular management of BSI is safe and effective. The overall splenic salvage rate was 90.3%. Distal embolization with Gelfoam<sup>®</sup> was not associated with higher rates of splenic infarction when compared with proximal embolization with coils. Combined proximal and distal embolization was associated with a higher incidence of splenic infarction and splenic abscess formation.</p><p><strong>Clinical significance: </strong>Distal splenic embolization with Gelfoam<sup>®</sup> is safe and may be beneficial in the setting of blunt splenic trauma.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141579209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}