Methods of Information in Medicine: Latest Articles

Clinical Terminology Mapping Service Based on Information Retrieval.
IF 1.8, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2026-02-03. DOI: 10.1055/a-2797-4219
Sungwon Jung, Seung-Jong Yu, Byoung-Kee Yi

Background: Standardized clinical terminology is essential for semantic interoperability. Typically, a hospital's terminology expert manually maps local terminology to international standards such as SNOMED CT. This manual mapping process is demanding, labor-intensive, and time-consuming, and its effectiveness depends on the expertise of the person performing it.

Objective: We developed a method to map clinical terms to SNOMED CT concept descriptions using an information retrieval (IR) approach with rich synonyms. We also provide a free mapping support service to help terminology experts alleviate the challenges of manual mapping without the need for additional manipulation.

Methods: We created indexes using edge n-grams and synonyms. We adopted Elasticsearch for indexing and query processing, incorporating data from the SPECIALIST Lexicon to enrich the synonym database. Eight different indexes were initially created, but only four were retained based on performance. We tested indexes individually and in combination, using a dataset of 1,753 one-to-one mapped instances from the National Library of Medicine ICD-9-CM Procedure codes to the SNOMED CT Map. We compared our approach with MetaMap for evaluation.

Results: We found that using rich synonyms and edge n-gram indexing significantly improved the accuracy of mapping clinical terms to SNOMED CT. The indexes incorporating synonyms and edge n-grams performed better than those using either technique alone. Combining these methods captured more relevant terms and synonyms, resulting in more precise mappings. Our method outperformed the baseline provided by MetaMap, demonstrating enhanced capability in handling complex medical terminology and improving the overall mapping quality.

Conclusions: Our study introduced an IR method with rich synonyms for mapping clinical terms to SNOMED CT, analyzed 42 unmapped terms, and identified key issues. The approach shows promise for improving terminology mapping. Future work will explore advanced methods to further enhance accuracy, aiming to reduce manual mapping effort and improve result evaluation.
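The Methods paragraph above combines edge n-gram indexing with synonym expansion. The sketch below mimics that idea in plain Python, without Elasticsearch, to show why a partially typed query such as "renal excis" can still reach a description like "Removal of kidney". The synonym table, the function names, and the scoring rule are illustrative assumptions, not the authors' configuration; the paper draws its synonyms from the SPECIALIST Lexicon.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Hypothetical synonym table, for illustration only.
SYNONYMS = {"kidney": {"renal"}, "removal": {"excision"}}


def index_terms(description):
    """Expand a description into edge n-grams of its tokens and their synonyms."""
    terms = set()
    for token in description.lower().split():
        for variant in {token} | SYNONYMS.get(token, set()):
            terms.update(edge_ngrams(variant))
    return terms


def score(query, description):
    """Fraction of query tokens found among the description's index terms."""
    indexed = index_terms(description)
    tokens = query.lower().split()
    hits = sum(1 for t in tokens if t in indexed)
    return hits / len(tokens) if tokens else 0.0


print(edge_ngrams("renal"))
print(score("renal excis", "Removal of kidney"))
print(score("renal excis", "Biopsy of liver"))
```

In this toy version the n-grams are generated only on the indexed side and the query tokens are matched as typed, so both prefixes and synonyms of the stored description count as hits.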

Citations: 0
A Multicentre Evaluation of the Impact of Computerised Physician Order Entry on Medication Documentation Workflow, Time, and Quality.
IF 1.8, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2025-12-27. DOI: 10.1055/a-2744-7059
Viktoria Jungreithmayr, Alina Fischer, Jan Leininger, Raphael Kotte, Nils Keiner, Hanna M Seidling

Multicentre evaluations are key to providing generalisable results on medication safety interventions. Yet evaluations of computerised physician order entry (CPOE) are mostly single-centred and poorly comparable.

We performed a multicentre simulation-based lab study at three independent study sites implementing different CPOE systems, aiming to compare the effects of CPOE implementation on workflow changes, time requirement, and quality of medication documentation.

At each study site, medication documentation processes with and without CPOE usage were analysed. Based on patient case scenarios, a simulation-based lab study was performed in which the time required to document medication according to the analysed processes was measured. Additionally, the quality of medication documentation with and without CPOE usage was evaluated. Results were compared between the three study sites.

At two hospitals, CPOE implementation streamlined the medication documentation process, with fewer documentation systems and professional groups involved and double documentation eliminated. However, at one hospital multiple documentation of medication orders persisted even after CPOE implementation. Overall, documenting medication according to the case scenarios took less time without than with a CPOE system (median time without CPOE = 13:56 minutes [range = 07:29-32:17], median time with CPOE = 16:51 minutes [09:18-33:41], p = 0.047, +20.8%). One hospital took considerably longer for medication documentation than the other two, both with and without CPOE usage. Medication documentation quality rose significantly, reaching a median of 100.0% at each study site with CPOE usage.

This study showed that a simulation-based lab study methodology is suitable for comparing CPOE effects across study sites. Furthermore, it provides evidence on the changes in medication documentation workflow, time requirement, and quality that occur with a CPOE implementation.

Citations: 0
Clustering Breast Cancer Patients Based on Their Treatment Courses Using German Cancer Registry Data.
IF 1.8, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2025-12-10. DOI: 10.1055/a-2753-9631
Kolja Blohm, David Korfkamp, Florian Oesterling, Klaas Dählmann, Stefanie Schulze, Andreas Hein

Cancer registries collect extensive data on cancer patients, including diagnoses, treatments, and disease progression. These data offer valuable insights into cancer care, but they are challenging to analyze because of their complexity. Machine learning techniques, particularly clustering, enable the exploration of treatment data to uncover previously unknown patterns and relationships.

This work aimed to develop a method for clustering breast cancer patients in cancer registries based on their treatment courses, and to demonstrate the usefulness of clustering for gaining insights, improving data quality, and identifying clinically relevant patterns.

We developed a similarity measure adapted from the Levenshtein distance to compare treatment courses, incorporating cancer diagnosis, surgeries, radiotherapies, and systemic therapies. The method was evaluated on 17,822 breast cancer cases diagnosed in 2019 from the cancer registry of North Rhine-Westphalia. Evaluation involved two stages. First, domain experts reviewed the clustering results to assess clinical relevance and interpretability. Second, an intercluster survival analysis was performed to identify clinically relevant differences between treatment patterns.

Expert evaluations confirmed that clustering produced clinically plausible groups while also uncovering unexpected treatment patterns and potential data inconsistencies. The survival analysis showed differences in survival between clusters in both prognostically favorable and unfavorable subgroups. These results demonstrate that treatment-course clustering can identify patient groups with differing survival outcomes. However, registry data incompleteness and unmeasured confounders may influence these findings.

Clustering treatment courses in cancer registries can reveal data quality issues, distinguish groups with different prognostic profiles, and support exploratory analyses of treatment patterns. While these findings are not intended to guide clinical decision making or evaluate treatment effectiveness, they can help generate hypotheses, identify unexpected care pathways, and support quality monitoring within cancer registries. Future work should focus on improving treatment data completeness, incorporating additional clinical variables, and refining clustering methods for broader applicability.
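The abstract describes a similarity measure adapted from the Levenshtein distance for comparing treatment courses. The authors' adaptation is not specified here, so the sketch below uses the plain, unweighted Levenshtein distance over sequences of treatment events and normalizes it to a similarity in [0, 1]. The event labels and the normalization are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalized similarity: 1.0 for identical courses, 0.0 for fully different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Hypothetical treatment courses, encoded as ordered event labels.
course_a = ["diagnosis", "surgery", "radiotherapy", "systemic"]
course_b = ["diagnosis", "surgery", "systemic"]
print(levenshtein(course_a, course_b), similarity(course_a, course_b))
```

A pairwise similarity matrix built this way could be passed to any clustering method that accepts precomputed distances. An adapted measure would typically replace the unit costs with domain-specific ones.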

Citations: 0
Leveraging Electronic Health Record Data and Up-to-Date Clinical Guidelines for High-Accuracy Clinical Diabetes Drug and Dosage Recommendation.
IF 1.8, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2025-10-10. DOI: 10.1055/a-2707-2862
Jhing-Fa Wang, Ming-Jun Wei, Te-Ming Chiang, Tzu-Chun Yeh, Eric Cheng, Yuan-Teh Lee, Hong-I Chen

Background: Existing drug recommendation systems lack integration with up-to-date clinical guidelines (the latest diabetes association standards of care and clinical guidelines that align with local government health care regulations) and lack high-precision drug interaction processing, explainability, and dynamic dosage adjustment. As a result, the recommendations generated by these systems are often inaccurate and do not align with local standards, greatly limiting their practicality.

Objective: To develop a personalized drug recommendation and dosage optimization system, the Diabetes Drug Recommendation System (DDRs), that integrates Fast Healthcare Interoperability Resources (FHIR)-standardized electronic health record (EHR) data and up-to-date clinical guidelines for accurate and practical recommendations.

Methods: We analyzed patients' EHR data and International Classification of Diseases, Tenth Revision (ICD-10) codes and integrated them with a drug interaction database to reduce adverse reactions. ADA guidelines and Taiwan's National Health Insurance (NHI) chronic disease guidelines served as data sources. Bio-GPT and Retrieval-Augmented Generation (RAG) were used to build the clinical guideline database and ensure that recommendations align with the latest standards, with references provided for interpretability. Finally, the optimal dosage was dynamically calculated by integrating patient disease progression trends from the EHR.

Results: DDRs achieved superior drug recommendation accuracy (Precision-Recall Area Under the Curve = 0.7951, Jaccard = 0.5632, F1-score = 0.7158), with a low drug-drug interaction rate (4.73%) and dosage error (±6.21%). Faithfulness of recommendations reached 0.850. Field validation with three physicians showed that the system reduced literature review time by 30 to 40% and delivered clinically actionable recommendations.

Conclusion: DDRs is the first system to integrate EHR data, LLMs, RAG, ADA guidelines, and Taiwan NHI policies for diabetes treatment. The system demonstrates high accuracy, safety, and interpretability, offering practical decision support in routine clinical settings.
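The reported Jaccard and F1 scores compare a recommended drug set against the drugs actually prescribed. As a rough illustration of how such set-based metrics are computed for a single patient, consider the sketch below. The drug names are hypothetical, and how the paper aggregates scores across patients is not stated in the abstract.

```python
def jaccard(predicted, actual):
    """Size of the intersection divided by the size of the union."""
    predicted, actual = set(predicted), set(actual)
    union = predicted | actual
    return len(predicted & actual) / len(union) if union else 1.0


def f1_score(predicted, actual):
    """Harmonic mean of precision and recall over the two sets."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical recommendation and prescription for one patient.
recommended = {"metformin", "sitagliptin", "insulin glargine"}
prescribed = {"metformin", "sitagliptin", "empagliflozin"}
print(jaccard(recommended, prescribed), f1_score(recommended, prescribed))
```

For the same pair of sets, Jaccard is never larger than F1, which is consistent with the ordering of the two reported values.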

Citations: 0
Why Synthetic Discoveries are Not Only a Problem of Differentially Private Synthetic Data?
IF 1.3, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-12-01. Epub Date: 2025-04-15. DOI: 10.1055/a-2540-8284
Heidelinde Dehaene, Alexander Decruyenaere, Christiaan Polet, Johan Decruyenaere, Paloma Rabaey, Thomas Demeester, Stijn Vansteelandt
Pages: 203-204
Citations: 0
Cross-lingual Natural Language Processing on Limited Annotated Case/Radiology Reports in English and Japanese: Insights from the Real-MedNLP Workshop.
IF 1.3, Zone 4 (Medicine), Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-12-01. Epub Date: 2024-08-29. DOI: 10.1055/a-2405-2489
Shuntaro Yada, Yuta Nakamura, Shoko Wakamiya, Eiji Aramaki

Background: Textual datasets (corpora) are crucial for the application of natural language processing (NLP) models. However, corpus creation in the medical field is challenging, primarily because of privacy issues with raw clinical data such as health records. Thus, the existing clinical corpora are generally small and scarce. Medical NLP (MedNLP) methodologies perform well with limited data availability.

Objectives: We present the outcomes of the Real-MedNLP workshop, which was conducted using limited and parallel medical corpora. Real-MedNLP has three distinct characteristics: (1) Limited annotated documents: the training data comprise only a small set (∼100) of annotated case reports (CRs) and radiology reports (RRs). (2) Bilingually parallel: the constructed corpora are parallel in Japanese and English. (3) Practical tasks: the workshop addresses fundamental tasks, such as named entity recognition (NER), as well as applied practical tasks.

Methods: We propose three tasks: NER with ∼100 available documents (Task 1), NER based only on annotation guidelines for humans (Task 2), and clinical applications (Task 3) consisting of adverse drug effect (ADE) detection for CRs and identical case identification (CI) for RRs.

Results: Nine teams participated in this study. The best systems achieved F1-scores of 0.65 and 0.89 for CRs and RRs in Task 1, whereas the top scores in Task 2 decreased by 50 to 70%. In Task 3, ADE reports were detected with an F1-score of up to 0.64, and CI reached a binary accuracy of up to 0.96.

Conclusion: Most systems adopt medical-domain-specific pretrained language models using data augmentation methods. Despite the challenge of limited corpus size in Tasks 1 and 2, recent approaches are promising because the partial match scores reached F1-scores of ∼0.8-0.9. Task 3 applications revealed that the different availabilities of external language resources affected the performance per language.
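The abstract distinguishes exact scores from partial match scores for NER. The workshop's precise scoring rules are not given here, so the sketch below shows one common convention: an entity is a (start, end, label) tuple with a half-open span, an exact match requires identical tuples, and a partial match requires the same label with overlapping spans. The example spans and labels are hypothetical.

```python
def overlaps(a, b):
    """True if two (start, end, label) entities share a label and overlap."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def ner_f1(predicted, gold, partial=False):
    """Entity-level F1 under exact or partial (overlap) matching."""
    match = overlaps if partial else (lambda a, b: a == b)
    matched_pred = sum(1 for p in predicted if any(match(p, g) for g in gold))
    matched_gold = sum(1 for g in gold if any(match(p, g) for p in predicted))
    if matched_pred == 0 or matched_gold == 0:
        return 0.0
    precision = matched_pred / len(predicted)
    recall = matched_gold / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: the second prediction misses the first two characters.
gold = [(0, 5, "disease"), (10, 20, "drug")]
predicted = [(0, 5, "disease"), (12, 20, "drug")]
print(ner_f1(predicted, gold), ner_f1(predicted, gold, partial=True))
```

A boundary error that costs a full entity under exact matching is forgiven under partial matching, which is one reason partial scores can sit well above exact ones.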

Pages: 145-163. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12196824/pdf/
Citations: 0
Deciphering Abbreviations in Malaysian Clinical Notes Using Machine Learning.
IF 1.3 Zone 4 Medicine Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2025-01-22 DOI: 10.1055/a-2521-4372
Ismat Mohd Sulaiman, Awang Bulgiba, Sameem Abdul Kareem, Abdul Aziz Latip

Objective: This study presents the first Malaysian machine learning model to detect and disambiguate abbreviations in clinical notes. The model is designed to be incorporated into MyHarmony, a natural language processing system that extracts clinical information for health care management. It uses word embedding so that it remains feasible within the constraints of low-resource settings, for secondary analysis rather than real-time use.

Methods: A Malaysian clinical embedding, based on the Word2Vec model, was developed using 29,895 electronic discharge summaries. The embedding was compared against a conventional rule-based approach and a FastText embedding on two tasks: abbreviation detection and abbreviation disambiguation. Machine learning classifiers were applied to assess performance.

Results: The Malaysian clinical word embedding contained 7 million word tokens, 24,352 unique vocabulary entries, and 100 dimensions. For abbreviation detection, the Decision Tree classifier augmented with the Malaysian clinical embedding showed the best performance (F-score of 0.9519). For abbreviation disambiguation, the classifier with the Malaysian clinical embedding had the best performance for most of the abbreviations (F-score of 0.9903).

Conclusion: Despite having a smaller vocabulary and fewer dimensions, our local clinical word embedding performed better than the larger nonclinical FastText embedding. Word embedding combined with simple machine learning algorithms can decipher abbreviations well. This approach also requires fewer computational resources and is suitable for implementation in low-resource settings such as Malaysia. Integrating this model into MyHarmony will improve recognition of clinical terms, thus improving the information generated for monitoring Malaysian health care services and policymaking.
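
For readers unfamiliar with the metric reported in the Results, the F-score is the harmonic mean of precision and recall. The following is only a minimal sketch, assuming one binary label per token (1 = abbreviation, 0 = other); the function name `f_score` and the toy labels are hypothetical and are not taken from the study.

```python
def f_score(gold, pred):
    """Return the F1 score for binary labels (1 = abbreviation, 0 = other)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # correctly flagged
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # flagged by mistake
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # missed abbreviations
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, for illustration only.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
score = f_score(gold, pred)
```

With these toy labels, precision and recall are both 2/3, so the F-score is also 2/3. The abstract does not state how the scores were averaged across abbreviations, so this sketch covers only the single-class case.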

Citations: 0
TCMSF: A Construction Framework of Traditional Chinese Medicine Syndrome Ancient Book Knowledge Graph.
IF 1.3 Zone 4 Medicine Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2025-04-17 DOI: 10.1055/a-2590-6348
Ziling Zeng, Lin Tong, Bing Li, Wenjing Zong, Qikai Niu, Sihong Liu, Lei Zhang, Jialun Wang, Siqi Zhang, Siwei Tian, Jing'ai Wang, Wei Zhang, Huamin Zhang

Background: Syndrome is a unique and crucial concept in traditional Chinese medicine (TCM). However, much of the syndrome knowledge lacks systematic organization and correlation, and current information technologies are unsuitable for ancient TCM texts.

Objective: We aimed to develop a knowledge graph that presents this knowledge in a more orderly, structured, and semantically oriented manner, providing a foundation for computer-aided diagnosis and treatment.

Methods: We developed TCMSF, a framework for constructing TCM syndrome knowledge from ancient books using a pretrained model and rules. We fine-tuned the Enhanced Representation through Knowledge Integration (ERNIE) and Bidirectional Encoder Representations from Transformers (BERT) pretrained language models and the chatGLM3-6b large language model for named entity recognition (NER) tasks. Furthermore, we employed a progressive entity relationship extraction method based on dual pattern feature combination to extract and standardize entities and the relationships between them in these books.

Results: We selected Yin deficiency syndrome as a case study and constructed a model layer suited to expressing the knowledge in these books. Among the NER methods compared, the combination of ERNIE and Conditional Random Fields (CRF) performed best. Using this combination, we completed entity extraction for Yin deficiency syndrome, achieving an average F1 value of 0.77. The proposed relationship extraction method reduces the number of incorrectly connected relationships compared with fully connected pattern layers. We successfully constructed a knowledge graph of ancient books on Yin deficiency syndrome, comprising over 120,000 entities and over 1.18 million relationships.

Conclusion: We developed TCMSF in line with the knowledge characteristics of ancient TCM books and improved the accuracy of knowledge graph construction.

Citations: 0
ISPO: An Integrated Ontology of Symptom Phenotypes for Semantic Integration of Traditional Chinese Medical Data.
IF 1.3 Zone 4 Medicine Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2025-05-06 DOI: 10.1055/a-2576-1847
Zixin Shu, Rui Hua, Dengying Yan, Chenxia Lu, Meng Ren, Hong Gao, Ning Xu, Jun Li, Hui Zhu, Jia Zhang, Dan Zhao, Chenyang Hui, Chu Liao, Junqiu Ye, Qi Hao, Xinyan Wang, Xiaodong Li, Baoyan Liu, Xiaji Zhou, Runshun Zhang, Min Xu, Xuezhong Zhou

Symptom phenotypes are crucial for diagnosing and treating various disease conditions. However, the diversity of symptom terminologies poses a significant challenge to the analysis and sharing of symptom-related medical data, particularly in the field of traditional Chinese medicine (TCM). This study aims to construct an Integrated Symptom Phenotype Ontology (ISPO) to support data mining of Chinese electronic medical records (EMRs) and real-world studies in the TCM field.

We manually annotated and extracted symptom terms from 21 classical TCM textbooks and 78,696 inpatient EMRs, and integrated them with five publicly available symptom-related biomedical vocabularies. Through a human-machine collaborative approach to terminology editing and ontology development, including term screening, semantic mapping, and concept classification, we constructed a high-quality symptom ontology that integrates both TCM and Western medical terminology.

ISPO provides 3,147 concepts, 23,475 terms, and 23,363 hierarchical relationships. Compared with international symptom-related ontologies such as the Symptom Ontology, ISPO offers significant improvements in the number of terms and synonymous relationships. Furthermore, evaluation across three independent curated clinical datasets demonstrated that ISPO achieved over 90% coverage of symptom terms, highlighting its strong clinical usability and completeness.

ISPO represents the first clinical ontology globally dedicated to the systematic representation of symptoms. It integrates symptom terminologies from historical and contemporary sources, encompassing both TCM and Western medicine, thereby enhancing semantic interoperability across heterogeneous medical data sources and clinical decision support systems in TCM.
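
The coverage figure reported above can be read as the share of symptom terms in a dataset that also appear in the ontology. The abstract does not say how matching was done, so the following is only a sketch under an assumed rule: exact match after trimming whitespace and lowercasing, counted per term occurrence. The function name `term_coverage` and the example terms are hypothetical.

```python
def term_coverage(dataset_terms, ontology_terms):
    """Fraction of dataset terms found in the ontology vocabulary.

    Assumption: exact match after stripping whitespace and lowercasing.
    """
    vocabulary = {term.strip().lower() for term in ontology_terms}
    normalized = [term.strip().lower() for term in dataset_terms]
    if not normalized:
        return 0.0  # an empty dataset has no terms to cover
    matched = sum(1 for term in normalized if term in vocabulary)
    return matched / len(normalized)


# Hypothetical toy vocabulary and dataset, for illustration only.
ontology = ["cough", "fever", "night sweats"]
dataset = ["Cough", "fever ", "dizziness", "night sweats"]
coverage = term_coverage(dataset, ontology)
```

In this toy case three of the four dataset terms match, giving a coverage of 0.75. A real evaluation would likely also need synonym and variant handling, which this sketch leaves out.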

Citations: 0
Response to Letter by Dehaene et al. on Synthetic Discovery is not only a Problem of Differentially Private Synthetic Data.
IF 1.3 Zone 4 Medicine Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-01 Epub Date: 2025-04-15 DOI: 10.1055/a-2540-8346
Ileana Montoya Perez, Parisa Movahedi, Valtteri Nieminen, Antti Airola, Tapio Pahikkala
Citations: 0