
JMIR Medical Informatics: Latest Articles

Iterative Large Language Model-Guided Sampling and Expert-Annotated Benchmark Corpus for Harmful Suicide Content Detection: Development and Validation Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-02-05 | DOI: 10.2196/73725
Kyumin Park, Myung Jae Baik, YeongJun Hwang, Yen Shin, HoJae Lee, Ruda Lee, Sang Min Lee, Je Young Hannah Sun, Ah Rah Lee, Si Yeun Yoon, Dong-Ho Lee, Jihyung Moon, JinYeong Bak, Kyunghyun Cho, Jong-Woo Paik, Sungjoon Park
Background: Harmful suicide content on the internet poses significant risks, as it can induce suicidal thoughts and behaviors, particularly among vulnerable populations. Despite global efforts, existing moderation approaches remain insufficient, especially in high-risk regions such as South Korea, which has the highest suicide rate among Organisation for Economic Co-operation and Development countries. Previous research has primarily assessed the suicide risk of the authors who wrote the content rather than the harmfulness of the content itself, which can lead readers to self-harm or suicide. Our study addresses this gap by evaluating the harmfulness of the content itself and its potential to induce suicide risk among readers.

Objective: This study aimed to develop an artificial intelligence (AI)-driven system for classifying online suicide-related content into 5 levels: illegal, harmful, potentially harmful, harmless, and non-suicide-related. In addition, we constructed a multimodal benchmark dataset with expert annotations to improve content moderation and help AI models detect and regulate harmful content more effectively.

Methods: We collected 43,244 user-generated posts from various online sources, including social media, question and answer (Q&A) platforms, and online communities. To reduce the workload on human annotators, GPT-4 was used for preannotation, filtering, and categorizing content before manual review by medical professionals. A task description document ensured consistency in classification. Ultimately, a benchmark dataset of 452 manually labeled entries was developed, in both Korean and English versions, to support AI-based moderation. The study also evaluated zero-shot and few-shot learning to determine the best AI approach for detecting harmful content.

Results: On the multimodal benchmark dataset, GPT-4 achieved the highest F1-scores (66.46 for illegal and 77.09 for harmful content detection). Image descriptions improved classification accuracy, while directly using raw images slightly decreased performance. Few-shot learning significantly enhanced detection, demonstrating that small but high-quality datasets could improve AI-driven moderation. However, translation challenges were observed, particularly with suicide-related slang and abbreviations, which were sometimes inaccurately conveyed in the English benchmark.

Conclusions: This study provides a high-quality benchmark for AI-based suicide content detection and shows that large language models can effectively assist in content moderation while reducing the burden on human moderators. Future work will focus on enhancing real-time detection and improving the handling of subtle or disguised harmful content.
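The per-category F1-scores reported above can be illustrated with a minimal sketch. It assumes a one-vs-rest F1 per label on a percent scale; the abstract does not state the exact averaging or scoring code the authors used. The function name `per_class_f1` and the toy labels are hypothetical and are not taken from the benchmark.

```python
# The 5 levels named in the abstract.
LEVELS = ["illegal", "harmful", "potentially harmful", "harmless", "non-suicide-related"]

def per_class_f1(gold, pred, labels=LEVELS):
    """Return {label: F1 in percent}, computed one-vs-rest for each label."""
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the label never occurs.
        scores[label] = 100.0 * 2 * tp / denom if denom else 0.0
    return scores

# Toy labels invented for illustration only (not benchmark data).
gold = ["illegal", "illegal", "harmful", "harmless", "harmful"]
pred = ["illegal", "harmful", "harmful", "harmless", "harmful"]
scores = per_class_f1(gold, pred)
```

In this toy case one of the two illegal posts is misread as harmful, which lowers recall for the illegal class and precision for the harmful class at the same time.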

JMIR Medical Informatics, vol. 14, e73725. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875420/pdf/
Citations: 0
Linking Electronic Health Records for Multiple Sclerosis Research: Comparative Study of Deterministic, Probabilistic, and Machine Learning Linkage Methods.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-02-04 | DOI: 10.2196/79869
Ohoud Almadani, Yasser Albogami, Adel Alrwisan

Background: Data linkage in pharmacoepidemiological research is commonly employed to ascertain exposures and outcomes or to obtain additional information on confounding variables. However, to protect patient confidentiality, unique patient identifiers are not provided, which makes data linkage across multiple sources challenging. The Saudi Real-World Evidence Network (SRWEN) aggregates electronic health records from various hospitals, which may require robust linkage techniques.

Objective: We aimed to evaluate and compare the performance of deterministic, probabilistic, and machine learning (ML) approaches for linking deidentified data of patients with multiple sclerosis (MS) from the SRWEN and Ministry of National Guard Health Affairs electronic health record systems.

Methods: A simulation-based validation framework was applied before linking real-world data sources. Deterministic linkage was based on predefined rules, whereas probabilistic linkage used similarity score-based matching. For ML, both similarity score-based and classification approaches were applied using neural networks, logistic regression, and random forest models. The performance of each approach was assessed using confusion matrices, focusing on sensitivity, positive predictive value, F1 score, and computational efficiency.
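The difference between rule-based and similarity score-based linkage can be sketched as follows. This is an illustration under assumptions, not the study's pipeline: the field names, the use of `difflib.SequenceMatcher` as the similarity function, and the 0.85 threshold are all hypothetical, since the abstract does not specify the linkage variables or scoring method.

```python
from difflib import SequenceMatcher

def deterministic_match(a, b, rule):
    """True only when every field named in the rule is exactly equal."""
    return all(a[f] == b[f] for f in rule)

def similarity_score(a, b, fields):
    """Mean string similarity (0 to 1) across the chosen fields."""
    ratios = [SequenceMatcher(None, str(a[f]), str(b[f])).ratio() for f in fields]
    return sum(ratios) / len(ratios)

def similarity_match(a, b, fields, threshold=0.85):
    """Declare a match when the mean similarity reaches the (assumed) threshold."""
    return similarity_score(a, b, fields) >= threshold

# Two hypothetical deidentified records with a transposed day in the birth date.
rec_a = {"birth_date": "1985-03-12", "sex": "F", "first_visit": "2018-06-01"}
rec_b = {"birth_date": "1985-03-21", "sex": "F", "first_visit": "2018-06-01"}
fields = ("birth_date", "sex", "first_visit")

exact = deterministic_match(rec_a, rec_b, fields)
score = similarity_score(rec_a, rec_b, fields)
fuzzy = similarity_match(rec_a, rec_b, fields)
```

The exact rule rejects the pair because of a single transposed digit, while the similarity score still clears the threshold. This is the kind of match a score-based method can recover, at the cost of needing a threshold that controls false positives.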

Results: The study included linked data of 2247 patients with MS from 2016 to 2023. The deterministic approach resulted in an average F1 score of 97.2% in the simulation and demonstrated varying match rates in real-world linkage: 1046/2247 (46.6%) to 1946/2247 (86.6%). This linkage was computationally efficient, with run times of <1 second per rule. The probabilistic approach provided an average F1 score of 93.9% in the simulation, with real-world match rates ranging from 1472/2247 (65.5%) to 2144/2247 (95.4%) and processing times ranging from approximately 0.1 to 5 seconds per rule. ML approaches achieved high performance (F1 score reached 99.8%) but were computationally expensive. Processing times ranged from approximately 13 to 16,936 seconds for the classification-based approaches and from approximately 13 to 7467 seconds for the similarity score-based approaches. Real-world match rates from ML models were highly variable depending on the method used; the similarity score-based approach identified 789/2247 (35.1%) matched pairs, whereas the classification-based approach identified 2014/2247 (89.6%).

Conclusions: Probabilistic linkage offers high linkage capacity by recovering matches missed by deterministic methods and proved to be both flexible and efficient, particularly in real-world scenarios where unique identifiers are lacking. This method achieved a good balance between recall and precision, enabling better integration of various data sources that could be useful in MS research.

JMIR Medical Informatics, vol. 14, e79869. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12872214/pdf/
Citations: 0
Prediction of First and Multiple Antiretroviral Therapy Interruptions in People Living With HIV: Comparative Survival Analysis Using Cox and Explainable Machine Learning Models.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-02-04 | DOI: 10.2196/78964
Donald Salami, Emily Koech, Janet M Turan, Kristen A Stafford, Lilly Muthoni Nyagah, Stephen Ohakanu, Anthony K Ngugi, Manhattan Charurat

Background: The Cox proportional hazards (CPH) model is a common choice for analyzing time-to-treatment interruptions in patients on antiretroviral therapy (ART), valued for its straightforward interpretability and flexibility in handling time-dependent covariates. Machine learning (ML) models have increasingly been adapted for handling temporal data, with added advantages of handling complex, nonlinear relationships and large datasets, and providing clear practical interpretations.

Objective: This study aims to compare the predictive performance of the traditional CPH model and ML models in predicting treatment interruptions among patients on ART, while also providing both global and individual-level explanations to support personalized, data-driven interventions for improving treatment retention.

Methods: Using data from 621,115 patients who started ART in Kenya between 2017 and 2023, we compared the performance of the CPH model with that of the following ML models in predicting first and multiple treatment interruptions: gradient boosting machine, extreme gradient boosting, regularized generalized linear models (Ridge, Lasso, and Elastic-Net), and recursive partitioning. A model-agnostic explainable surrogate technique was applied to interpret the best-performing model's predictions globally, using variable importance and partial dependence profiles, and at the individual level, using breakdown additive, Shapley Additive Explanations, and ceteris paribus profiles.

Results: The recursive partitioning model achieved the best performance with a predictive concordance index score of 0.81 for first treatment interruptions and 0.89 for multiple interruptions, outperforming the CPH model, which scored 0.78 and 0.87 for the same scenarios, respectively. Recursive partitioning's performance can be attributed to its ability to model nonlinear relationships and automatically detect complex interactions. The global model-agnostic explanations aligned closely with the interpretations offered by hazard ratios in the CPH model, while offering additional insights into the impact of specific features on the model's predictions. The breakdown additive and Shapley Additive Explanations explainers demonstrated how different variables contribute to the predicted risk at the individual patient level. The ceteris paribus profiles further explored the time-varying model to illustrate how changes in a patient's covariates over time could impact their predicted risk of treatment interruption.
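For readers unfamiliar with the concordance index used to compare the models, a minimal sketch of Harrell's C for a single time-to-event outcome follows. It is an assumption that this is the variant reported; the abstract does not say how concordance was computed for multiple interruptions. The toy times, event flags, and risk scores are invented.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs whose risk ordering matches the outcomes."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event has the higher predicted risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties in predicted risk count as half
    return concordant / comparable if comparable else float("nan")

# Toy data: time to interruption, event flag (0 = censored), predicted risk.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported gap between 0.78 and 0.81 means the better model ranks a somewhat larger share of comparable patient pairs correctly.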

Conclusions: Our results highlight the superior predictive performance of ML models and their ability to provide patient-specific risk predictions and insights that can support targeted interventions to reduce treatment interruptions in ART care.

JMIR Medical Informatics, vol. 14, e78964. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12871577/pdf/
Citations: 0
Ranking-Aware Multiple Instance Learning for Histopathology Slide Classification: Development and Validation Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-02-04 | DOI: 10.2196/84417
Ho Heon Kim, Gisu Hwang, Won Chan Jeong, Young Sin Ko

Background: Multiple instance learning (MIL) is widely used for slide-level classification in digital pathology without requiring expert annotations. Although even partial expert annotations offer valuable supervision, few studies have effectively leveraged this information within MIL frameworks.

Objective: This study aims to develop and evaluate a ranking-aware MIL framework, called rank induction, that effectively incorporates partial expert annotations to improve slide-level classification performance under realistic annotation constraints.

Methods: We developed rank induction, a MIL approach that incorporates expert annotations using a pairwise rank loss inspired by RankNet. The method encourages the model to assign higher attention scores to annotated regions than to unannotated ones, guiding it to focus on diagnostically relevant patches. We evaluated rank induction on 2 public datasets (Camelyon16 and DigestPath2019) and an in-house dataset (Seegene Medical Foundation-stomach; SMF-stomach) and tested its robustness under 3 real-world conditions: low-data regimes, coarse within-slide annotations, and sparse slide-level annotations.
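The idea of a pairwise rank loss on attention scores can be sketched as below. The abstract says only that the loss is inspired by RankNet, so the exact form here (logistic loss on the score difference, averaged over every annotated and unannotated patch pair) is an assumption, and `pairwise_rank_loss` is a hypothetical name rather than the authors' implementation.

```python
import math

def pairwise_rank_loss(attention, annotated):
    """Mean RankNet-style loss over (annotated, unannotated) patch pairs.

    attention: raw attention scores, one per patch.
    annotated: booleans, True when the patch lies in an expert-annotated region.
    """
    pos = [s for s, a in zip(attention, annotated) if a]
    neg = [s for s, a in zip(attention, annotated) if not a]
    if not pos or not neg:
        return 0.0  # no pairs to rank on this slide
    total = 0.0
    for sp in pos:
        for sn in neg:
            # -log(sigmoid(sp - sn)) == log(1 + exp(-(sp - sn)))
            total += math.log1p(math.exp(-(sp - sn)))
    return total / (len(pos) * len(neg))

mask = [True, True, False, False]
good = pairwise_rank_loss([2.0, 1.5, -1.0, -0.5], mask)  # annotated patches score higher
bad = pairwise_rank_loss([-1.0, -0.5, 2.0, 1.5], mask)   # ordering reversed
tie = pairwise_rank_loss([0.0, 0.0], [True, False])
```

The loss shrinks as annotated patches receive higher attention than unannotated ones, which is the behavior the method is described as encouraging. In training, a term like this would presumably be added to the slide-level classification loss.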

Results: Rank induction outperformed existing methodologies, achieving an area under the receiver operating characteristic curve (AUROC) of 0.839 on Camelyon16, 0.995 on DigestPath2019, and 0.875 on SMF-stomach. It remained robust under low-data conditions, maintaining an AUROC of 0.761 with only 60.2% (130/216) of the training data. When using coarse annotations (with 2240-pixel padding), performance slightly declined to 0.823. Remarkably, annotating just 20% (18/89) of the slides was enough to reach near-saturated performance (AUROC of 0.806, vs 0.839 with full annotations).

Conclusions: Incorporating expert annotations through ranking-based supervision improves MIL-based classification. Rank induction remains robust even with limited, coarse, or sparsely available annotations, demonstrating its practicality in real-world scenarios.

JMIR Medical Informatics, vol. 14, e84417. https://doi.org/10.2196/84417
Citations: 0
Evaluation of Large Language Models for Radiologists' Support in Multidisciplinary Breast Cancer Teams: Comparative Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-02-02 | DOI: 10.2196/68182
Hong Jiang, Chun Yang, Wenbin Zhou, Cheng-Liang Yin, Shan Zhou, Rui He, Guanghui Ran, Wujie Wang, Meixian Wu, Juan Yu
Background: Artificial intelligence tools, particularly large language models (LLMs), have shown considerable potential across various domains. However, their performance in the diagnosis and treatment of breast cancer remains unknown.

Objective: This study aimed to evaluate the performance of LLMs in supporting radiologists within multidisciplinary breast cancer teams, with a focus on their roles in facilitating informed clinical decisions and enhancing patient care.

Methods: A set of 50 questions covering radiological and breast cancer guidelines was developed to assess breast cancer knowledge. These questions were posed to 9 popular LLMs and to clinical physicians, with the expectation of receiving direct "Yes" or "No" answers along with supporting analysis. The performances of the 9 models (ChatGPT-4.0, ChatGPT-4o, ChatGPT-4o mini, Claude 3 Opus, Claude 3.5 Sonnet, Gemini 1.5 Pro, Tongyi Qianwen 2.5, ChatGLM, and Ernie Bot 3.5) were evaluated against that of radiologists with varying experience levels (resident physicians, fellow physicians, and attending physicians). Responses were assessed for accuracy, confidence, and consistency based on alignment with the 2024 National Comprehensive Cancer Network Breast Cancer Guidelines and the 2013 American College of Radiology Breast Imaging-Reporting and Data System recommendations.

Results: Claude 3 Opus and ChatGPT-4 achieved the highest confidence scores of 2.78 and 2.74, respectively, while ChatGPT-4o led in accuracy with a score of 2.92. In terms of response consistency, Claude 3 Opus and Claude 3.5 Sonnet led with scores of 3.0, closely followed by ChatGPT-4o, Gemini 1.5 Pro, and ChatGPT-4o mini, all scoring above 2.9. ChatGPT-4o mini had the top clinical diagnostics score of 3.0 among all LLMs, which was also higher than that of all physician groups; however, no statistically significant differences were observed between it and any physician group (all P>.05). ChatGPT-4 also scored higher than the physician groups, but the difference was not statistically significant (P>.05). Across radiological diagnostics, clinical diagnosis, and overall performance, ChatGPT-4o mini and the Claude models achieved higher mean scores than all physician groups, but these differences were statistically significant only when compared to fellow physicians (P<.05). ChatGLM and Ernie Bot 3.5, by contrast, underperformed across diagnostic areas, with lower scores than all physician groups but no statistically significant differences (all P>.05). Among physician groups, attending physicians and resident physicians had comparably high scores in radiological diagnostic performance, whereas fellow physicians scored somewhat lower, though the difference was not statistically significant (P>.05).

Conclusions: LLMs such as ChatGPT-4o and Claude 3 Opus showed potential in supporting multidisciplinary teams for breast cancer diagnosis and treatment. However, they cannot fully replicate the complex decision-making processes honed through clinical experience, particularly in complex cases. This highlights the need for continued refinement of artificial intelligence to ensure robust clinical applicability.
JMIR Medical Informatics, volume 14, e68182. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12910264/pdf/
Citations: 0
AI-Enabled Customer Relationship Management Platforms for Patient Services in Health Care, Early Lessons From Governance, and Program-Level Outcomes.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-02-02 DOI: 10.2196/83564
Anup Kant Gupta

This research letter summarizes early lessons from 4 enterprise implementations of artificial intelligence-enabled customer relationship management platforms in health care and describes governance practices associated with improvements in affordability, adherence, and access at program level.

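Background: Artificial intelligence-enabled customer relationship management (CRM) platforms are increasingly used in health care to improve patient services, but real-world evidence on how these systems affect affordability, adherence, and access remains limited. Many enterprises adopt CRM workflows without clear governance, operational definitions, or metrics, which leads to inconsistent outcomes and low adoption.

Objective: To summarize early operational lessons from 4 large enterprise implementations of artificial intelligence-enabled CRM platforms and to describe program-level changes in affordability support, time to therapy initiation, and therapy discontinuation rates.

Methods: A case-informed thematic analysis was conducted of 4 enterprise CRM implementations between 2019 and 2024. The programs included large national health care organizations serving more than 500,000 patients annually. Aggregated, deidentified operational dashboards and governance documents were reviewed. Adoption was defined as the proportion of active CRM users among provisioned patient services users. Baseline values were taken from preimplementation operations and compared with stable postimplementation periods. No patient-level or identifiable data were used, and institutional review board approval was not required.

Results: Programs that aligned CRM workflows with patient-centered outcomes showed higher adoption. The proportion of active users exceeded 85%, compared with less than 60% in programs without structured governance. CRM-supported affordability checks showed improved completion rates among service teams. Time to therapy initiation improved in programs using artificial intelligence-assisted triage. Program-level therapy discontinuation rates decreased when proactive risk flags were incorporated into CRM workflows. These changes reflect descriptive pre-post operational signals rather than causal estimates.

Conclusions: With clear governance and well-defined metrics, artificial intelligence-enabled CRM platforms can support improvements in patient services operations. The observed improvements in affordability support, time to initiation, and discontinuation rates are program-level trends that require further study with more rigorous designs. The findings offer early lessons for organizations implementing artificial intelligence-driven CRM systems in health care.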
JMIR Medical Informatics, e83564. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12910261/pdf/
Citations: 0
Machine Learning Algorithms to Predict Venous Thromboembolism in Patients With Sepsis in the Intensive Care Unit: Multicenter Retrospective Study.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-30 DOI: 10.2196/80969
Yan Zhang, Xia Ren, Luojie Liu, Junjie Zha, Yijie Gu, Hongwei Ye
Background: Venous thromboembolism (VTE) is a common and severe complication in intensive care unit (ICU) patients with sepsis. Conventional risk stratification tools lack sepsis-specific features and may inadequately capture complex, nonlinear interactions among clinical variables.

Objective: This study aimed to develop and validate an interpretable machine learning (ML) model for the early prediction of VTE in ICU patients with sepsis.

Methods: This multicenter retrospective study used data from the Medical Information Mart for Intensive Care IV database for model development and internal validation, and an independent cohort from Changshu Hospital for external validation. Candidate predictors were selected through univariate analysis, followed by least absolute shrinkage and selection operator regression. Retained variables were used in multivariable logistic regression to identify independent predictors, which were then used to develop 9 ML models: categorical boosting, decision tree, k-nearest neighbor, light gradient boosting machine, logistic regression, multilayer perceptron, naive Bayes, random forest, and support vector machine. Performance was evaluated by discrimination (area under the curve [AUC]), calibration, and clinical use (decision curve analysis). A subgroup analysis stratified by the Sequential Organ Failure Assessment score was conducted in the external cohort to assess model stability across sepsis severity levels. Model interpretability was assessed using Shapley Additive Explanations (SHAP) to quantify the contribution of features to the predicted risk.

Results: A total of 25,197 patients from the Medical Information Mart for Intensive Care IV cohort and 328 patients from the external cohort were included, with VTE incidences of 844 out of 25,197 (3.4%) and 30 out of 328 (9.2%), respectively. The light gradient boosting machine model performed best, achieving an AUC of 0.956 in internal validation. Despite the higher VTE incidence and clinical severity in the external validation, the model maintained robust generalization with an AUC of 0.786. Notably, the model achieved enhanced discriminative ability in the severe sepsis subgroup (Sequential Organ Failure Assessment score >6) with an AUC of 0.816, compared with 0.769 in the mild to moderate sepsis subgroup. Calibration curves indicated strong agreement between predicted and observed outcomes, and decision curve analysis showed superior net benefit across clinically relevant thresholds. SHAP analysis identified central venous catheterization, serum chloride and bicarbonate levels, arterial catheterization, and prolonged partial thromboplastin time as the most influential predictors. Partial dependence plots revealed both linear and nonlinear associations between these variables and VTE risk. Individual-level force plots further enhanced interpretability by visualizing personalized risk profiles.

Conclusions: We developed a high-performing, interpretable ML model to predict VTE in ICU patients with sepsis. The model demonstrated robustness across cohorts and improved performance in the severe sepsis population. By integrating diverse clinical data and using SHAP for transparent interpretation, this tool may support individualized prevention and early diagnosis strategies.
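The abstract reports that decision curve analysis showed superior net benefit but does not give the calculation. As a minimal sketch, the standard net benefit formula at a risk threshold p_t is TP/n - (FP/n) * p_t / (1 - p_t). The function names below are illustrative and are not taken from the study; only the 30 events among 328 patients in the external cohort come from the abstract.

```python
def net_benefit(true_positives: int, false_positives: int,
                n_patients: int, threshold: float) -> float:
    """Net benefit of a decision rule at a given risk threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return true_positives / n_patients - weight * false_positives / n_patients


def treat_all_net_benefit(n_events: int, n_patients: int, threshold: float) -> float:
    """Reference strategy: treat every patient as high risk."""
    return net_benefit(n_events, n_patients - n_events, n_patients, threshold)


# External cohort size and event count as reported in the abstract;
# the thresholds are hypothetical examples.
for threshold in (0.05, 0.10, 0.20):
    value = treat_all_net_benefit(30, 328, threshold)
    print(f"threshold={threshold:.2f}  treat-all net benefit={value:.4f}")
```

A model adds clinical value at a given threshold only if its net benefit exceeds both the treat-all value and zero (the treat-none strategy).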
JMIR Medical Informatics, volume 14, e80969. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12905564/pdf/
Citations: 0
Prospective Diagnostic Accuracy and Technical Feasibility of Artificial Intelligence-Assisted Rib Fracture Detection on Chest Radiographs: Observational Study.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-29 DOI: 10.2196/77965
Shu-Tien Huang, Liong-Rung Liu, Ming-Feng Tsai, Ming-Yuan Huang, Hung-Wen Chiu

Background: Rib fractures are present in 10%-15% of thoracic trauma cases but are often missed on chest radiographs, delaying diagnosis and treatment. Artificial intelligence (AI) may improve detection and triage in emergency settings.

Objective: This study aims to evaluate diagnostic accuracy, processing speed, and technical feasibility of an artificial intelligence-assisted rib fracture detection system using prospectively collected data within a real-world, high-volume emergency department workflow.

Methods: We conducted an observational feasibility study with prospective data collection of a faster region-based convolutional neural network-based AI model deployed in the emergency department to analyze 23,251 real-world chest radiographs (22,946 anteroposterior; 305 oblique) from April 1 to July 2, 2023. This study was approved by the Institutional Review Board of MacKay Memorial Hospital (IRB No. 20MMHIS483e). AI operated passively, without influencing clinical decision-making. The reference standard was the final report issued by board-certified radiologists. A subset of discordant cases underwent post hoc computed tomography review for exploratory analysis.

Results: AI achieved 74.5% sensitivity (95% CI 0.708-0.780), 93.3% specificity (95% CI 0.930-0.937), 24.2% positive predictive value, and 99.2% negative predictive value. Median inference time was 10.6 seconds versus 3.3 hours for radiologist reports (paired Wilcoxon signed-rank test W=112 987.5, P<.001). The analysis revealed peak imaging demand between 08:00 and 16:00 and Thursday-Saturday evenings. A 14-day graphics processing unit outage underscored the importance of infrastructure resilience.

Conclusions: The AI system demonstrated strong technical feasibility for real-time rib fracture detection in a high-volume emergency department setting, with rapid inference and stable performance during prospective deployment. Although the system showed high negative predictive value, the observed false-positive and false-negative rates indicate that it should be considered a supportive screening tool rather than a stand-alone diagnostic solution or a replacement for clinical judgment. These findings support further clinician-in-the-loop studies to evaluate clinical feasibility, workflow integration, and impact on diagnostic decision-making. However, interpretation is limited by reliance on radiology reports as the reference standard and the system's passive, non-interventional deployment.
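The gap between the reported positive predictive value (24.2%) and negative predictive value (99.2%) follows from how rarely fractures occur among all emergency radiographs. The sketch below applies Bayes' rule to the reported sensitivity and specificity. The prevalence values are hypothetical, since the abstract does not state one, although a prevalence near 2.8% roughly reproduces both reported predictive values.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from Bayes' rule for a given disease prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are the values reported in the abstract;
# the prevalence values are assumptions chosen to show the trend.
for prevalence in (0.01, 0.028, 0.10, 0.30):
    ppv, npv = predictive_values(0.745, 0.933, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

At low prevalence most positive flags are false alarms even with 93.3% specificity, which is consistent with the authors' framing of the system as a supportive screening tool.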

JMIR Medical Informatics, volume 14, e77965. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12854400/pdf/
Citations: 0
Improving Clinical Decision-Making in Treating Airway Diseases With an Expert System Built Upon the Free AI Tool Google NotebookLM.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-29 DOI: 10.2196/78567
Cheng-Hao Hsu, Ching-Li Hsu, Chih-Hsiang Tsou, Kuo-Fang Hsu, Hung-Yu Yang

We used the free artificial intelligence (AI) tool Google NotebookLM, powered by the large language model Gemini 2.0, to construct a medical decision-making aid for diagnosing and managing airway diseases and subsequently evaluated its functionality and performance in a clinical workflow. After feeding this tool with relevant published clinical guidelines for these diseases, we evaluated the feasibility of the system regarding its behavior, ability, and potential, and we created simulated cases and used the system to solve associated medical problems. The test and simulation questions were designed by a pulmonologist, and the appropriateness (focusing on accuracy and completeness) of AI responses was judged by 3 pulmonologists independently. The system was then deployed in an emergency department (ED) setting, where it was tested by medical staff (n=20) to assess how it affected the process of clinical consultation. Test opinions were collected through a questionnaire. Most (56/84, 67%) of the specialists' ratings regarding AI responses were above average. The interrater reliability was moderate for accuracy (intraclass correlation coefficient=0.612; P<.001) and good for completeness (intraclass correlation coefficient=0.773; P<.001). When deployed in the ED setting, this system could respond with reasonable answers and enhance the literacy of personnel about these diseases. The potential to save the time spent in consultation did not reach statistical significance (Kolmogorov-Smirnov [K-S] D=0.223, P=.24) across all participants, but it indicated a favorable outcome when we analyzed only physicians' responses. We concluded that this system is customizable, cost efficient, and accessible to clinicians and allied health care professionals without any computer coding experience in treating airway diseases. 
It provides convincing guideline-based recommendations, increases the staff's medical literacy, and potentially saves physicians' time spent on consultation. This system warrants further evaluation in other medical disciplines and health care environments.

JMIR Medical Informatics, e78567. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12902755/pdf/
Citations: 0
Multi-Institutional Drug Use Patterns in Hospitalized Older Patients: Retrospective Cross-Sectional Study.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-01-29 DOI: 10.2196/78353
Chung Chun Lee, Grace Juyun Kim, Suhyun Kim, Jee Young Hong, Won Min Hwang, Jong-Yeup Kim, Kye Hwa Lee, Kwangsoo Kim, Mingyu Kang, Ju Han Kim, Suehyun Lee

Background: A rapidly aging population has led to an increase in the number of patients with chronic diseases and polypharmacy. Although the appropriate number of drugs for older patients has been investigated, few studies have examined polypharmacy criteria in older inpatients across multiple institutions.

Objective: The aim of this study was to examine the patterns of polypharmacy and determine the criteria for the number of drugs defining polypharmacy in the geriatric inpatient population.

Methods: Electronic health records of 4 medical institutions for patients aged 65 years and older hospitalized between January 1, 2012, and December 31, 2020, were analyzed for the study. The maximum number of drugs prescribed was obtained for each patient and, along with a literature review, was used to determine the appropriate polypharmacy level for our population.

Results: We suggest a 4-level polypharmacy category system consisting of nonpolypharmacy, polypharmacy, major polypharmacy, and excessive polypharmacy based on a review of international guidelines and polypharmacy literature. Application of this system to our study population showed that the major polypharmacy category (use of 10-19 concurrent drugs) was an appropriate threshold for polypharmacy in hospitalized patients versus the traditional threshold of 5 or more concurrent drugs. The tendency of our study population to have a higher disease and drug count supports this threshold. Frequently prescribed therapeutic subgroups in this category were antibacterials for systemic use, anesthetics, and cardiac therapy.

Conclusions: This study proposes a polypharmacy categorization system for older inpatients that differs from the common definition of polypharmacy as the concomitant prescription of 5 or more drugs. The older population tends to have severe conditions, including those requiring major surgery; therefore, a drug count corresponding to the definition of major polypharmacy is appropriate.
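
The categorization described in the abstract can be illustrated with a minimal Python sketch. Only two numbers come from the abstract: the major polypharmacy range of 10-19 concurrent drugs and the traditional threshold of 5 or more drugs. The cutoff of 20 for excessive polypharmacy, the function names, and the `(patient_id, drug_count)` record layout are assumptions made for illustration and are not taken from the study.

```python
from typing import Dict, Iterable, Tuple


def max_drugs_per_patient(records: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Return the peak concurrent drug count observed for each patient.

    `records` is a hypothetical layout: (patient_id, concurrent_drug_count)
    pairs, one per prescription snapshot.
    """
    peak: Dict[str, int] = {}
    for patient_id, count in records:
        peak[patient_id] = max(peak.get(patient_id, 0), count)
    return peak


def polypharmacy_category(drug_count: int) -> str:
    """Map a drug count to one of the 4 category names from the abstract.

    From the abstract: 5 or more drugs is the traditional polypharmacy
    threshold, and 10-19 drugs is major polypharmacy.
    Assumed here: 20 or more drugs is excessive polypharmacy.
    """
    if drug_count < 0:
        raise ValueError("drug_count must be non-negative")
    if drug_count < 5:
        return "nonpolypharmacy"
    if drug_count < 10:
        return "polypharmacy"
    if drug_count < 20:
        return "major polypharmacy"
    return "excessive polypharmacy"  # assumed cutoff, not stated in the abstract


def categorize_patients(records: Iterable[Tuple[str, int]]) -> Dict[str, str]:
    """Assign each patient a category based on their peak drug count."""
    return {
        patient_id: polypharmacy_category(count)
        for patient_id, count in max_drugs_per_patient(records).items()
    }
```

For example, `categorize_patients([("p1", 3), ("p1", 12), ("p2", 7)])` would place the hypothetical patient `p1` in major polypharmacy, because the peak count (12) is used rather than the first or average count.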

Citation count: 0