
Journal of the American Medical Informatics Association: latest articles

Toward a unified understanding of drug-drug interactions: mapping Japanese drug codes to RxNorm concepts.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae094
Yukinobu Kawakami, Takuya Matsuda, Noriaki Hidaka, Mamoru Tanaka, Eizen Kimura

Objectives: Linking information on Japanese pharmaceutical products to global knowledge bases (KBs) would enhance international collaborative research and yield valuable insights. However, public access to mappings of Japanese pharmaceutical products that use international controlled vocabularies remains limited. This study mapped YJ codes to RxNorm ingredient classes, providing new insights by comparing Japanese and international drug-drug interaction (DDI) information using a case study methodology.

Materials and methods: Tables linking YJ codes to RxNorm concepts were created using the application programming interfaces of the Kyoto Encyclopedia of Genes and Genomes and the National Library of Medicine. A comparative analysis of Japanese and international DDI information was thus performed by linking to an international DDI KB.

Results: There was limited agreement between the Japanese and international DDI severity classifications. Cross-tabulation of Japanese and international DDIs by severity showed that 213 combinations classified as serious DDIs by an international KB were missing from the Japanese DDI information.
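
The cross-tabulation described in the Results can be illustrated with a small sketch. This is not the authors' code: the severity labels, the `MISSING` placeholder, the function name, and the toy ingredient pairs are all assumptions made for illustration. Each interaction is keyed by an unordered pair of ingredients, so A+B and B+A count as the same combination.

```python
from collections import Counter

# Placeholder label for a pair that one knowledge base does not list (assumed name).
MISSING = "not listed"


def cross_tabulate(japanese, international):
    """Count ingredient pairs by (Japanese severity, international severity).

    Both arguments map a frozenset of two ingredient names to a severity label.
    """
    table = Counter()
    for pair in set(japanese) | set(international):
        row = japanese.get(pair, MISSING)
        col = international.get(pair, MISSING)
        table[(row, col)] += 1
    return table


# Toy data with abstract ingredient names; invented for illustration only.
japanese_ddi = {frozenset({"A", "B"}): "contraindicated"}
international_ddi = {
    frozenset({"B", "A"}): "serious",
    frozenset({"C", "D"}): "serious",
}

table = cross_tabulate(japanese_ddi, international_ddi)
for (jp_label, intl_label), count in sorted(table.items()):
    print(f"{jp_label:>16} | {intl_label:<10} | {count}")
```

In this layout, the cell (`not listed`, `serious`) corresponds to the kind of gap the abstract reports: combinations an international KB rates as serious that are absent from the Japanese information.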

Discussion: It is desirable that efforts be undertaken to standardize international criteria for DDIs to ensure consistency in the classification of their severity.

Conclusion: The classification of DDI severity remains highly variable. It is imperative to augment the repository of critical DDI information, which would revalidate the utility of fostering collaborations with global KBs.

Pages: 1561-1568 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187495/pdf/
Citations: 0
Are medical history data fit for risk stratification of patients with chest pain in emergency care? Comparing data collected from patients using computerized history taking with data documented by physicians in the electronic health record in the CLEOS-CPDS prospective cohort study.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae110
Helge Brandberg, Carl Johan Sundberg, Jonas Spaak, Sabine Koch, Thomas Kahan

Objective: In acute chest pain management, risk stratification tools, including medical history, are recommended. We compared the fraction of patients with sufficient clinical data obtained using computerized history taking software (CHT) versus physician-acquired medical history to calculate established risk scores and assessed the patient-by-patient agreement between these 2 ways of obtaining medical history information.

Materials and methods: This was a prospective cohort study of clinically stable patients aged ≥ 18 years presenting to the emergency department (ED) at Danderyd University Hospital (Stockholm, Sweden) in 2017-2019 with acute chest pain and non-diagnostic ECG and serum markers. Medical histories were self-reported using CHT on a tablet. Observations on discrete variables in the risk scores were extracted from electronic health records (EHR) and the CHT database. The patient-by-patient agreement was described by Cohen's kappa statistics.
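
For readers unfamiliar with Cohen's kappa, the statistic compares observed agreement with the agreement expected by chance from each source's marginal frequencies. The sketch below is a minimal pure-Python version; the function name and the toy yes/no vectors are assumptions for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of patients on whom both sources agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of marginal frequencies, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Toy example: 1 = risk factor recorded, 0 = not recorded (invented values).
cht = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
ehr = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
print(round(cohens_kappa(cht, ehr), 2))  # about 0.6 for this toy data
```

Here the two sources agree on 8 of 10 patients (observed 0.8) while chance agreement is 0.5, giving a kappa of about 0.6.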

Results: Of the total 1000 patients included (mean age 55.3 ± 17.4 years; 54% women), HEART score, EDACS, and T-MACS could be calculated in 75%, 74%, and 83% by CHT and in 31%, 7%, and 25% by EHR, respectively. The agreement between CHT and EHR was slight to moderate (kappa 0.19-0.70) for chest pain characteristics and moderate to almost perfect (kappa 0.55-0.91) for risk factors.

Conclusions: CHT can acquire and document data for chest pain risk stratification in most ED patients using established risk scores, achieving this goal for a substantially larger number of patients, as compared to EHR data. The agreement between CHT and physician-acquired history taking is high for traditional risk factors and lower for chest pain characteristics.

Clinical trial registration: ClinicalTrials.gov NCT03439449.

Pages: 1529-1539 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187423/pdf/
Citations: 0
Development of a multimodal geomarker pipeline to assess the impact of social, economic, and environmental factors on pediatric health outcomes.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae093
Erika Rasnick Manning, Qing Duan, Stuart Taylor, Sarah Ray, Alexandra M S Corley, Joseph Michael, Ryan Gillette, Ndidi Unaka, David Hartley, Andrew F Beck, Cole Brokamp

Objectives: We sought to create a computational pipeline for attaching geomarkers, contextual or geographic measures that influence or predict health, to electronic health records at scale, including developing a tool for matching addresses to parcels to assess the impact of housing characteristics on pediatric health.

Materials and methods: We created a geomarker pipeline to link residential addresses from hospital admissions at Cincinnati Children's Hospital Medical Center (CCHMC) between July 2016 and June 2022 to place-based data. Linkage methods included linkage by date of admission, geocoding to census tract, street range geocoding, and probabilistic address matching. We assessed 4 methods for probabilistic address matching.
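
To give a feel for what matching an address to a parcel involves, here is a deliberately simple sketch. It uses normalization plus a string-similarity score from Python's standard `difflib`, which is a stand-in and not one of the 4 probabilistic methods the authors assessed. The abbreviation table, the `min_score` threshold, the function names, and the parcel records are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real pipeline would need a far fuller one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}


def normalize(address):
    """Lowercase, strip punctuation, and abbreviate common street-type words."""
    tokens = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def best_parcel_match(address, parcel_addresses, min_score=0.9):
    """Return (parcel_id, score) for the most similar parcel address.

    parcel_id is None when no candidate reaches min_score.
    """
    target = normalize(address)
    best_id, best_score = None, 0.0
    for parcel_id, parcel_address in parcel_addresses.items():
        score = SequenceMatcher(None, target, normalize(parcel_address)).ratio()
        if score > best_score:
            best_id, best_score = parcel_id, score
    if best_score < min_score:
        return None, best_score
    return best_id, best_score


# Invented parcel records for illustration only.
parcels = {"P1": "123 Main St.", "P2": "456 Oak Avenue"}
print(best_parcel_match("123 MAIN STREET", parcels))
```

A threshold like `min_score` trades match rate against false links; the abstract's 75% parcel-level match rate reflects the authors' own tool, not this sketch.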

Results: We characterized 124 244 hospitalizations experienced by 69 842 children admitted to CCHMC. Of the 55 684 hospitalizations with residential addresses in Hamilton County, Ohio, all were matched to 7 temporal geomarkers, 97% were matched to 79 census tract-level geomarkers and 13 point-level geomarkers, and 75% were matched to 16 parcel-level geomarkers. Parcel-level geomarkers were linked using our exact address matching tool developed using the best-performing linkage method.

Discussion: Our multimodal geomarker pipeline provides a reproducible framework for attaching place-based data to health data while maintaining data privacy. This framework can be applied to other populations and in other regions. We also created a tool for address matching that democratizes parcel-level data to advance precision population health efforts.

Conclusion: We created an open framework for multimodal geomarker assessment by harmonizing and linking a set of over 100 geomarkers to hospitalization data, enabling assessment of links between geomarkers and hospital admissions.

Pages: 1471-1478 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187418/pdf/
Citations: 0
Comparing penalization methods for linear models on large observational health data.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae109
Egill A Fridgeirsson, Ross Williams, Peter Rijnbeek, Marc A Suchard, Jenna M Reps

Objective: This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation.
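
As background, the penalty families named in the Objective can be written in their standard textbook forms. The notation below (coefficient vector $\beta$, strength $\lambda$, mixing weight $\alpha$, adaptive weights $w_j$) is assumed here and may differ from the parameterization used in the paper. Each penalty $P(\beta)$ is added to the negative log-likelihood of the logistic model.

```latex
% Standard textbook forms; \lambda \ge 0, \alpha \in [0, 1].
% The adaptive weights w_j are typically derived from an initial estimate of \beta_j.
\begin{align*}
\text{L1 (lasso):}  \quad & P(\beta) = \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{L2 (ridge):}  \quad & P(\beta) = \lambda \sum_{j=1}^{p} \beta_j^{2} \\
\text{ElasticNet:}  \quad & P(\beta) = \lambda \Big( \alpha \sum_{j=1}^{p} |\beta_j|
                            + (1 - \alpha) \sum_{j=1}^{p} \beta_j^{2} \Big) \\
\text{Adaptive L1:} \quad & P(\beta) = \lambda \sum_{j=1}^{p} w_j \, |\beta_j| \\
\text{L0:}          \quad & P(\beta) = \lambda \sum_{j=1}^{p} \mathbf{1}[\beta_j \neq 0]
\end{align*}
```

The L0 penalty counts nonzero coefficients directly, which is why the L0-based methods (BAR and IHT) tend toward smaller models.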

Materials and methods: We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams.
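
Friedman's test, mentioned above, ranks the methods within each task and asks whether the rank sums differ more than chance would allow. The sketch below computes the classical statistic without a tie correction; the function names and the toy AUC values are assumptions for illustration, not results from the study.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square; scores[i][j] is the score of method j on task i."""
    n = len(scores)        # number of tasks (blocks)
    k = len(scores[0])     # number of methods (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Toy AUCs for 3 methods on 4 tasks (invented); every task orders them alike.
toy_auc = [
    [0.70, 0.74, 0.78],
    [0.65, 0.70, 0.72],
    [0.80, 0.81, 0.85],
    [0.60, 0.66, 0.69],
]
print(friedman_statistic(toy_auc))  # 8.0, the maximum n*(k-1) for n=4, k=3
```

Under the null hypothesis the statistic is compared with a chi-square distribution with k-1 degrees of freedom; a critical difference diagram then shows which pairs of methods differ in mean rank.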

Results: Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity.

Conclusion: L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.

Pages: 1514-1521 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187433/pdf/
Citations: 0
The role of information systems in emergency department decision-making - a literature review.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae096
Cornelius Born, Romy Schwarz, Timo Phillip Böttcher, Andreas Hein, Helmut Krcmar

Objectives: Healthcare providers employ heuristic and analytical decision-making to navigate the high-stakes environment of the emergency department (ED). Despite the increasing integration of information systems (ISs), research on their efficacy is conflicting. Drawing on related fields, we investigate how timing and mode of delivery influence IS effectiveness. Our objective is to reconcile previous contradictory findings, shedding light on optimal IS design in the ED.

Materials and methods: We conducted a systematic review following PRISMA across PubMed, Scopus, and Web of Science. We coded the ISs' timing as heuristic or analytical, their mode of delivery as active for automatic alerts and passive when requiring user-initiated information retrieval, and their effect on process, economic, and clinical outcomes.

Results: Our analysis included 83 studies. During early heuristic decision-making, most active interventions were ineffective, while passive interventions generally improved outcomes. In the analytical phase, the effects were reversed. Passive interventions that facilitate information extraction consistently improved outcomes.

Discussion: Our findings suggest that the effectiveness of active interventions negatively correlates with the amount of information received during delivery. During early heuristic decision-making, when information overload is high, physicians are unresponsive to alerts and proactively consult passive resources. In the later analytical phases, physicians show increased receptivity to alerts due to decreased diagnostic uncertainty and information quantity. Interventions that limit information lead to positive outcomes, supporting our interpretation.

Conclusion: We synthesize our findings into an integrated model that reveals the underlying reasons for conflicting findings from previous reviews and can guide practitioners in designing ISs in the ED.

Pages: 1608-1621 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187435/pdf/
Citations: 0
Addressing methodological and logistical challenges of using electronic health record (EHR) data for research.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae126
Suzanne Bakken
Volume 31, Issue 7 | Pages: 1449-1450 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187415/pdf/
Citations: 0
Automating literature screening and curation with applications to computational neuroscience.
IF 4.7 | Zone 2, Medicine | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1093/jamia/ocae097
Ziqing Ji, Siyan Guo, Yujie Qiao, Robert A McDougal

Objective: ModelDB (https://modeldb.science) is a discovery platform for computational neuroscience, containing over 1850 published model codes with standardized metadata. These codes were mainly supplied from unsolicited model author submissions, but this approach is inherently limited. For example, we estimate we have captured only around one-third of NEURON models, the most common type of models in ModelDB. To more completely characterize the state of computational neuroscience modeling work, we aim to identify works containing results derived from computational neuroscience approaches and their standardized associated metadata (eg, cell types, research topics).

Materials and methods: Known computational neuroscience work from ModelDB and identified neuroscience work queried from PubMed were included in our study. After pre-screening with SPECTER2 (a free document embedding method), GPT-3.5 and GPT-4 were used to identify likely computational neuroscience work and relevant metadata.
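
The abstract does not say how the embedding-based pre-screening was performed. One common pattern, sketched here purely as an assumption, is to keep candidate papers whose embedding is close (by cosine similarity) to the centroid of embeddings from known positive examples. The function names, the threshold, and the 2-dimensional toy vectors are hypothetical; real SPECTER2 embeddings are much higher-dimensional and would come from the model itself.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def prescreen(candidates, known_positive, threshold=0.8):
    """Return ids of candidates whose embedding is near the positive centroid."""
    center = centroid(known_positive)
    return [
        paper_id
        for paper_id, vector in candidates.items()
        if cosine_similarity(vector, center) >= threshold
    ]


# Toy 2-dimensional "embeddings", invented for illustration only.
known = [[1.0, 0.0], [0.8, 0.2]]
papers = {"paper_1": [1.0, 0.1], "paper_2": [0.0, 1.0]}
print(prescreen(papers, known))  # ['paper_1']
```

A cheap filter of this kind reduces how many abstracts need to be sent to the more expensive language models, at the cost of possibly discarding atypical but relevant papers.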

Results: SPECTER2, GPT-4, and GPT-3.5 demonstrated varied but high abilities in identification of computational neuroscience work. GPT-4 achieved 96.9% accuracy and GPT-3.5 improved from 54.2% to 85.5% through instruction-tuning and Chain of Thought. GPT-4 also showed high potential in identifying relevant metadata annotations.

Discussion: Accuracy in identification and extraction might further be improved by dealing with ambiguity of what are computational elements, including more information from papers (eg, Methods section), improving prompts, etc.

Conclusion: Natural language processing and large language model techniques can be added to ModelDB to facilitate further model discovery, and will contribute to a more standardized and comprehensive framework for establishing domain-specific resources.

Pages: 1463-1470 | Type: Journal Article | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187430/pdf/
Citations: 0
Comparing natural language processing representations of coded disease sequences for prediction in electronic health records.
IF 4.7 Zone 2 Medicine Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-06-20 DOI: 10.1093/jamia/ocae091
Thomas Beaney, Sneha Jha, Asem Alaa, Alexander Smith, Jonathan Clarke, Thomas Woodcock, Azeem Majeed, Paul Aylin, Mauricio Barahona

Objective: Natural language processing (NLP) algorithms are increasingly being applied to obtain unsupervised representations of electronic health record (EHR) data, but their comparative performance at predicting clinical endpoints remains unclear. Our objective was to compare the performance of unsupervised representations of sequences of disease codes generated by bag-of-words versus sequence-based NLP algorithms at predicting clinically relevant outcomes.

Materials and methods: This cohort study used primary care EHRs from 6 286 233 people with Multiple Long-Term Conditions in England. For each patient, an unsupervised vector representation of their time-ordered sequences of diseases was generated using 2 input strategies (212 disease categories versus 9462 diagnostic codes) and different NLP algorithms (Latent Dirichlet Allocation, doc2vec, and 2 transformer models designed for EHRs). We also developed a transformer architecture, named EHR-BERT, incorporating sociodemographic information. We compared the performance of each of these representations (without fine-tuning) as inputs into a logistic classifier to predict 1-year mortality, healthcare use, and new disease diagnosis.

Results: Patient representations generated by sequence-based algorithms performed consistently better than bag-of-words methods in predicting clinical endpoints, with the highest performance for EHR-BERT across all tasks, although the absolute improvement was small. Representations generated using disease categories perform similarly to those using diagnostic codes as inputs, suggesting models can equally manage smaller or larger vocabularies for prediction of these outcomes.

Discussion and conclusion: Patient representations produced by sequence-based NLP algorithms from sequences of disease codes demonstrate improved predictive content for patient outcomes compared with representations generated by co-occurrence-based algorithms. This suggests transformer models may be useful for generating multi-purpose representations, even without fine-tuning.
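The contrast the abstract draws between bag-of-words and sequence-based representations can be made concrete with a small sketch. It uses a made-up three-code vocabulary and two invented patients (none of this comes from the study data) to show that a bag-of-words count vector discards the order of diagnoses, while the raw sequence keeps it.

```python
from collections import Counter

# Hypothetical vocabulary of disease categories, for illustration only.
VOCAB = ["asthma", "diabetes", "hypertension"]


def bag_of_words(sequence, vocab=VOCAB):
    """Count how often each vocabulary code occurs, ignoring order."""
    counts = Counter(sequence)
    return [counts[code] for code in vocab]


# Two invented patients with the same diagnoses in opposite time order.
patient_1 = ["hypertension", "diabetes", "asthma"]
patient_2 = ["asthma", "diabetes", "hypertension"]

bow_1 = bag_of_words(patient_1)
bow_2 = bag_of_words(patient_2)

same_bag = bow_1 == bow_2               # order is lost in the count vectors
same_sequence = patient_1 == patient_2  # order is kept in the sequences
print(bow_1, bow_2, same_bag, same_sequence)
```

A sequence-based model receives the time-ordered list itself, so it can in principle distinguish these two patients, whereas any classifier fed only the count vectors cannot.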

Pages: 1451-1462 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187492/pdf/
Citations: 0
Strengthening the use of artificial intelligence within healthcare delivery organizations: balancing regulatory compliance and patient safety.
IF 4.7 Zone 2 Medicine Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-06-20 DOI: 10.1093/jamia/ocae119
Mark P Sendak, Vincent X Liu, Ashley Beecy, David E Vidal, Keo Shaw, Mark A Lifson, Danny Tobey, Alexandra Valladares, Brenna Loufek, Murtaza Mogri, Suresh Balu

Objectives: Surface the urgent dilemma that healthcare delivery organizations (HDOs) face navigating the US Food and Drug Administration (FDA) final guidance on the use of clinical decision support (CDS) software.

Materials and methods: We use sepsis as a case study to highlight the patient safety and regulatory compliance tradeoffs that 6129 hospitals in the United States must navigate.

Results: Sepsis CDS remains in broad, routine use. There is no commercially available sepsis CDS system that is FDA cleared as a medical device. There is no public disclosure of an HDO turning off sepsis CDS due to regulatory compliance concerns. And there is no public disclosure of FDA enforcement action against an HDO for using sepsis CDS that is not cleared as a medical device.

Discussion and conclusion: We present multiple policy interventions that would relieve the current tension to enable HDOs to utilize artificial intelligence to improve patient care while also addressing FDA concerns about product safety, efficacy, and equity.

Pages: 1622-1627 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187419/pdf/
Citations: 0
Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions.
IF 4.7 Zone 2 Medicine Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-06-20 DOI: 10.1093/jamia/ocae100
Shiyao Xie, Wenjing Zhao, Guanghui Deng, Guohua He, Na He, Zhenhua Lu, Weihua Hu, Mingming Zhao, Jian Du

Objective: Synthesizing and evaluating inconsistent medical evidence is essential in evidence-based medicine. This study aimed to employ ChatGPT as a sophisticated scientific reasoning engine to identify conflicting clinical evidence and summarize unresolved questions to inform further research.

Materials and methods: We evaluated ChatGPT's effectiveness in identifying conflicting evidence and investigated its principles of logical reasoning. An automated framework was developed to generate a PubMed dataset focused on controversial clinical topics. ChatGPT analyzed this dataset to identify consensus and controversy, and to formulate unsolved research questions. Expert evaluations were conducted 1) on the consensus and controversy for factual consistency, comprehensiveness, and potential harm and, 2) on the research questions for relevance, innovation, clarity, and specificity.

Results: The gpt-4-1106-preview model achieved a 90% recall rate in detecting inconsistent claim pairs within a ternary assertions setup. Notably, without explicit reasoning prompts, ChatGPT provided sound reasoning for the assertions between claims and hypotheses, based on an analysis grounded in relevance, specificity, and certainty. ChatGPT's conclusions of consensus and controversies in clinical literature were comprehensive and factually consistent. The research questions proposed by ChatGPT received high expert ratings.

Discussion: Our experiment implies that, in evaluating the relationship between evidence and claims, ChatGPT considered more detailed information beyond a straightforward assessment of sentimental orientation. This ability to process intricate information and conduct scientific reasoning regarding sentiment is noteworthy, particularly as this pattern emerged without explicit guidance or directives in prompts, highlighting ChatGPT's inherent logical reasoning capabilities.

Conclusion: This study demonstrated ChatGPT's capacity to evaluate and interpret scientific claims. Such proficiency can be generalized to broader clinical research literature. ChatGPT effectively aids in facilitating clinical studies by proposing unresolved challenges based on analysis of existing studies. However, caution is advised as ChatGPT's outputs are inferences drawn from the input literature and could be harmful to clinical practice.
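The recall figure reported in the Results can be read as the share of truly inconsistent claim pairs that the model flagged. The sketch below shows that calculation for a three-way labelling task. The label names (`support`, `contradict`, `neutral`) and the six example pairs are assumptions made for illustration; the abstract does not specify the labels, and the numbers here are not the study's data.

```python
def recall(gold, predicted, positive):
    """Recall for one class: true positives / (true positives + false negatives)."""
    true_pos = sum(1 for g, p in zip(gold, predicted) if g == positive and p == positive)
    false_neg = sum(1 for g, p in zip(gold, predicted) if g == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no gold examples of the positive class")
    return true_pos / (true_pos + false_neg)


# Invented gold labels and model predictions for six claim pairs.
gold = ["contradict", "contradict", "support", "neutral", "contradict", "support"]
pred = ["contradict", "neutral", "support", "neutral", "contradict", "contradict"]

# Three pairs are truly inconsistent; the model finds two of them.
r = recall(gold, pred, positive="contradict")
print(round(r, 3))
```

Note that the last pair is a false positive for `contradict`; it lowers precision but does not affect recall, which is why a high recall alone does not say how many flagged pairs were actually inconsistent.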

Pages: 1551-1560 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187493/pdf/
Citations: 0