Accelerating Evidence Synthesis in Observational Studies: Development of a Living Natural Language Processing-Assisted Intelligent Systematic Literature Review System
Frank J Manion, Jingcheng Du, Dong Wang, Long He, Bin Lin, Jingqi Wang, Siwei Wang, David Eckels, Jan Cervenka, Peter C Fiduccia, Nicole Cossrow, Lixia Yao
Background: Systematic literature review (SLR), a robust method to identify and summarize evidence from published sources, is considered to be a complex, time-consuming, labor-intensive, and expensive task.
Objective: This study aimed to present a solution based on natural language processing (NLP) that accelerates and streamlines the SLR process for observational studies using real-world data.
Methods: We followed an agile software development and iterative software engineering methodology to build a customized intelligent end-to-end living NLP-assisted solution for observational SLR tasks. Multiple machine learning-based NLP algorithms were adopted to automate the article screening and data element extraction processes. The NLP prediction results can be further reviewed and verified by domain experts, following a human-in-the-loop design. The system integrates explainable artificial intelligence to provide evidence for the NLP algorithms and add transparency to the extracted literature data elements. The system was developed based on 3 existing SLR projects of observational studies, including epidemiology studies of human papillomavirus-associated diseases, the disease burden of pneumococcal diseases, and cost-effectiveness studies of pneumococcal vaccines.
Results: Our Intelligent SLR Platform covers major SLR steps, including study protocol setting, literature retrieval, abstract screening, full-text screening, data element extraction from full-text articles, results summary, and data visualization. The NLP algorithms achieved accuracy scores of 0.86-0.90 on article screening tasks (framed as text classification tasks) and macroaverage F1 scores of 0.57-0.89 on data element extraction tasks (framed as named entity recognition tasks).
Conclusions: Cutting-edge NLP algorithms expedite SLR for observational studies, freeing scientists to focus on the quality of data and the synthesis of evidence. In line with the living SLR concept, the system has the potential to update literature data continuously, enabling scientists to stay current with the literature on observational studies prospectively and with little effort.
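The screening metrics above come from framing inclusion decisions as text classification. As a minimal illustrative sketch (toy data and a TF-IDF plus logistic regression baseline, not the authors' system), the same framing and the two reported metric types look like this in Python:

```python
# Illustrative sketch only, not the authors' implementation: abstract
# screening framed as binary text classification (include=1 / exclude=0),
# evaluated with the accuracy and macro-F1 metrics reported above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

abstracts = [  # toy abstracts with screening labels
    "Incidence of HPV-associated oropharyngeal cancer in a national cohort",
    "Cost-effectiveness of pneumococcal conjugate vaccination in older adults",
    "A randomized trial of a novel antihypertensive agent",
    "Burden of invasive pneumococcal disease in children under five",
    "Protocol for a phase II oncology trial of drug X",
    "Real-world epidemiology of genital warts in primary care",
]
labels = [1, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    abstracts, labels, test_size=0.33, random_state=42, stratify=labels
)

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

clf = LogisticRegression(max_iter=1000).fit(X_train_vec, y_train)
pred = clf.predict(X_test_vec)

print("screening accuracy:", accuracy_score(y_test, pred))
print("macro-F1:", f1_score(y_test, pred, average="macro"))
```

A production screening model would be trained on thousands of labeled abstracts; the pattern of vectorize, fit, and evaluate is what the framing implies.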
{"title":"Accelerating Evidence Synthesis in Observational Studies: Development of a Living Natural Language Processing-Assisted Intelligent Systematic Literature Review System.","authors":"Frank J Manion, Jingcheng Du, Dong Wang, Long He, Bin Lin, Jingqi Wang, Siwei Wang, David Eckels, Jan Cervenka, Peter C Fiduccia, Nicole Cossrow, Lixia Yao","doi":"10.2196/54653","DOIUrl":"10.2196/54653","url":null,"abstract":"<p><strong>Background: </strong>Systematic literature review (SLR), a robust method to identify and summarize evidence from published sources, is considered to be a complex, time-consuming, labor-intensive, and expensive task.</p><p><strong>Objective: </strong>This study aimed to present a solution based on natural language processing (NLP) that accelerates and streamlines the SLR process for observational studies using real-world data.</p><p><strong>Methods: </strong>We followed an agile software development and iterative software engineering methodology to build a customized intelligent end-to-end living NLP-assisted solution for observational SLR tasks. Multiple machine learning-based NLP algorithms were adopted to automate article screening and data element extraction processes. The NLP prediction results can be further reviewed and verified by domain experts, following the human-in-the-loop design. The system integrates explainable articificial intelligence to provide evidence for NLP algorithms and add transparency to extracted literature data elements. The system was developed based on 3 existing SLR projects of observational studies, including the epidemiology studies of human papillomavirus-associated diseases, the disease burden of pneumococcal diseases, and cost-effectiveness studies on pneumococcal vaccines.</p><p><strong>Results: </strong>Our Intelligent SLR Platform covers major SLR steps, including study protocol setting, literature retrieval, abstract screening, full-text screening, data element extraction from full-text articles, results summary, and data visualization. The NLP algorithms achieved accuracy scores of 0.86-0.90 on article screening tasks (framed as text classification tasks) and macroaverage F1 scores of 0.57-0.89 on data element extraction tasks (framed as named entity recognition tasks).</p><p><strong>Conclusions: </strong>Cutting-edge NLP algorithms expedite SLR for observational studies, thus allowing scientists to have more time to focus on the quality of data and the synthesis of evidence in observational studies. Aligning the living SLR concept, the system has the potential to update literature data and enable scientists to easily stay current with the literature related to observational studies prospectively and continuously.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e54653"},"PeriodicalIF":3.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523763/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142513879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring Health Care Professionals' Perspectives on the Use of a Medication and Care Support System and Recommendations for Designing a Similar Tool for Family Caregivers: Interview Study Among Health Care Professionals
Aimerence Ashimwe, Nadia Davoody
Background: As the population ages, the demand for effective health care solutions to address adverse drug events is becoming increasingly urgent. Telemedicine has emerged as a promising way to strengthen health care delivery in home care settings and mitigate drug errors. Given the indispensable role of family caregivers in daily patient care, integrating digital health tools has the potential to streamline medication management processes and enhance the overall quality of patient care.
Objective: This study aims to explore health care professionals' perspectives on the use of a medication and care support system (MCSS) and collect recommendations for designing a similar tool for family caregivers.
Methods: Fifteen interviews with health care professionals in a home care center were conducted. Thematic analysis was used, and 5 key themes highlighting the importance of using the MCSS tool to improve medication management in home care were identified.
Results: All participants emphasized the necessity of direct communication between health care professionals and family caregivers and stated that family caregivers need comprehensive information about medication administration, patient conditions, and symptoms. Furthermore, the health care professionals recommended features and functions customized for family caregivers.
Conclusions: This study underscored the importance of clear communication between health care professionals and family caregivers and the provision of comprehensive instructions to promote safe medication practices. By equipping family caregivers with essential information via a tool similar to the MCSS, a proactive approach to preventing errors and improving outcomes is advocated.
{"title":"Exploring Health Care Professionals' Perspectives on the Use of a Medication and Care Support System and Recommendations for Designing a Similar Tool for Family Caregivers: Interview Study Among Health Care Professionals.","authors":"Aimerence Ashimwe, Nadia Davoody","doi":"10.2196/63456","DOIUrl":"10.2196/63456","url":null,"abstract":"<p><strong>Background: </strong>With the aging population on the rise, the demand for effective health care solutions to address adverse drug events is becoming increasingly urgent. Telemedicine has emerged as a promising solution for strengthening health care delivery in home care settings and mitigating drug errors. Due to the indispensable role of family caregivers in daily patient care, integrating digital health tools has the potential to streamline medication management processes and enhance the overall quality of patient care.</p><p><strong>Objective: </strong>This study aims to explore health care professionals' perspectives on the use of a medication and care support system (MCSS) and collect recommendations for designing a similar tool for family caregivers.</p><p><strong>Methods: </strong>Fifteen interviews with health care professionals in a home care center were conducted. Thematic analysis was used, and 5 key themes highlighting the importance of using the MCSS tool to improve medication management in home care were identified.</p><p><strong>Results: </strong>All participants emphasized the necessity of direct communication between health care professionals and family caregivers and stated that family caregivers need comprehensive information about medication administration, patient conditions, and symptoms. Furthermore, the health care professionals recommended features and functions customized for family caregivers.</p><p><strong>Conclusions: </strong>This study underscored the importance of clear communication between health care professionals and family caregivers and the provision of comprehensive instructions to promote safe medication practices. By equipping family caregivers with essential information via a tool similar to the MCSS, a proactive approach to preventing errors and improving outcomes is advocated.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e63456"},"PeriodicalIF":3.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11541148/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142513880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Health Care Language Models and Their Fine-Tuning for Information Extraction: Scoping Review
Miguel Nunes, Joao Bone, Joao C Ferreira, Luis B Elvas
<p><strong>Background: </strong>In response to the intricate language, specialized terminology outside everyday life, and the frequent presence of abbreviations and acronyms inherent in health care text data, domain adaptation techniques have emerged as crucial to transformer-based models. This refinement in the knowledge of the language models (LMs) allows for a better understanding of the medical textual data, which results in an improvement in medical downstream tasks, such as information extraction (IE). We have identified a gap in the literature regarding health care LMs. Therefore, this study presents a scoping literature review investigating domain adaptation methods for transformers in health care, differentiating between English and non-English languages, focusing on Portuguese. Most specifically, we investigated the development of health care LMs, with the aim of comparing Portuguese with other more developed languages to guide the path of a non-English-language with fewer resources.</p><p><strong>Objective: </strong>This study aimed to research health care IE models, regardless of language, to understand the efficacy of transformers and what are the medical entities most commonly extracted.</p><p><strong>Methods: </strong>This scoping review was conducted using the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) methodology on Scopus and Web of Science Core Collection databases. Only studies that mentioned the creation of health care LMs or health care IE models were included, while large language models (LLMs) were excluded. The latest were not included since we wanted to research LMs and not LLMs, which are architecturally different and have distinct purposes.</p><p><strong>Results: </strong>Our search query retrieved 137 studies, 60 of which met the inclusion criteria, and none of them were systematic literature reviews. English and Chinese are the languages with the most health care LMs developed. These languages already have disease-specific LMs, while others only have general-health care LMs. European Portuguese does not have any public health care LM and should take examples from other languages to develop, first, general-health care LMs and then, in an advanced phase, disease-specific LMs. Regarding IE models, transformers were the most commonly used method, and named entity recognition was the most popular topic, with only a few studies mentioning Assertion Status or addressing medical lexical problems. The most extracted entities were diagnosis, posology, and symptoms.</p><p><strong>Conclusions: </strong>The findings indicate that domain adaptation is beneficial, achieving better results in downstream tasks. Our analysis allowed us to understand that the use of transformers is more developed for the English and Chinese languages. European Portuguese lacks relevant studies and should draw examples from other non-English languages to develop these models and drive pr
{"title":"Health Care Language Models and Their Fine-Tuning for Information Extraction: Scoping Review.","authors":"Miguel Nunes, Joao Bone, Joao C Ferreira, Luis B Elvas","doi":"10.2196/60164","DOIUrl":"10.2196/60164","url":null,"abstract":"<p><strong>Background: </strong>In response to the intricate language, specialized terminology outside everyday life, and the frequent presence of abbreviations and acronyms inherent in health care text data, domain adaptation techniques have emerged as crucial to transformer-based models. This refinement in the knowledge of the language models (LMs) allows for a better understanding of the medical textual data, which results in an improvement in medical downstream tasks, such as information extraction (IE). We have identified a gap in the literature regarding health care LMs. Therefore, this study presents a scoping literature review investigating domain adaptation methods for transformers in health care, differentiating between English and non-English languages, focusing on Portuguese. Most specifically, we investigated the development of health care LMs, with the aim of comparing Portuguese with other more developed languages to guide the path of a non-English-language with fewer resources.</p><p><strong>Objective: </strong>This study aimed to research health care IE models, regardless of language, to understand the efficacy of transformers and what are the medical entities most commonly extracted.</p><p><strong>Methods: </strong>This scoping review was conducted using the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) methodology on Scopus and Web of Science Core Collection databases. Only studies that mentioned the creation of health care LMs or health care IE models were included, while large language models (LLMs) were excluded. The latest were not included since we wanted to research LMs and not LLMs, which are architecturally different and have distinct purposes.</p><p><strong>Results: </strong>Our search query retrieved 137 studies, 60 of which met the inclusion criteria, and none of them were systematic literature reviews. English and Chinese are the languages with the most health care LMs developed. These languages already have disease-specific LMs, while others only have general-health care LMs. European Portuguese does not have any public health care LM and should take examples from other languages to develop, first, general-health care LMs and then, in an advanced phase, disease-specific LMs. Regarding IE models, transformers were the most commonly used method, and named entity recognition was the most popular topic, with only a few studies mentioning Assertion Status or addressing medical lexical problems. The most extracted entities were diagnosis, posology, and symptoms.</p><p><strong>Conclusions: </strong>The findings indicate that domain adaptation is beneficial, achieving better results in downstream tasks. Our analysis allowed us to understand that the use of transformers is more developed for the English and Chinese languages. 
European Portuguese lacks relevant studies and should draw examples from other non-English languages to develop these models and drive pr","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e60164"},"PeriodicalIF":3.1,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535799/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Generic Transformation Approach for Complex Laboratory Data Using the Fast Healthcare Interoperability Resources Mapping Language: Method Development and Implementation
Jesse Kruse, Joshua Wiedekopf, Ann-Kristin Kock-Schoppenhauer, Andrea Essenwanger, Josef Ingenerf, Hannes Ulrich
<p><strong>Background: </strong>Reaching meaningful interoperability between proprietary health care systems is a ubiquitous task in medical informatics, where communication servers are traditionally used for referring and transforming data from the source to target systems. The Mirth Connect Server, an open-source communication server, offers, in addition to the exchange functionality, functions for simultaneous manipulation of data. The standard Fast Healthcare Interoperability Resources (FHIR) has recently become increasingly prevalent in national health care systems. FHIR specifies its own standardized mechanisms for transforming data structures using StructureMaps and the FHIR mapping language (FML).</p><p><strong>Objective: </strong>In this study, a generic approach is developed, which allows for the application of declarative mapping rules defined using FML in an exchangeable manner. A transformation engine is required to execute the mapping rules.</p><p><strong>Methods: </strong>FHIR natively defines resources to support the conversion of instance data, such as an FHIR StructureMap. This resource encodes all information required to transform data from a source system to a target system. In our approach, this information is defined in an implementation-independent manner using FML. Once the mapping has been defined, executable Mirth channels are automatically generated from the resources containing the mapping in JavaScript format. These channels can then be deployed to the Mirth Connect Server.</p><p><strong>Results: </strong>The resulting tool is called FML2Mirth, a Java-based transformer that derives Mirth channels from detailed declarative mapping rules based on the underlying StructureMaps. Implementation of the translate functionality is provided by the integration of a terminology server, and to achieve conformity with existing profiles, validation via the FHIR validator is built in. The system was evaluated for its practical use by transforming Labordatenträger version 2 (LDTv.2) laboratory results into Medical Information Object (Medizinisches Informationsobjekt) laboratory reports in accordance with the National Association of Statutory Health Insurance Physicians' specifications and into the HL7 (Health Level Seven) Europe Laboratory Report. The system could generate complex structures, but LDTv.2 lacks some information to fully comply with the specification.</p><p><strong>Conclusions: </strong>The tool for the auto-generation of Mirth channels was successfully presented. Our tests reveal the feasibility of using the complex structures of the mapping language in combination with a terminology server to transform instance data. Although the Mirth Server and the FHIR are well established in medical informatics, the combination offers space for more research, especially with regard to FML. Simultaneously, it can be stated that the mapping language still has implementation-related shortcomings that can be compensated by Mirth Connec
{"title":"A Generic Transformation Approach for Complex Laboratory Data Using the Fast Healthcare Interoperability Resources Mapping Language: Method Development and Implementation.","authors":"Jesse Kruse, Joshua Wiedekopf, Ann-Kristin Kock-Schoppenhauer, Andrea Essenwanger, Josef Ingenerf, Hannes Ulrich","doi":"10.2196/57569","DOIUrl":"10.2196/57569","url":null,"abstract":"<p><strong>Background: </strong>Reaching meaningful interoperability between proprietary health care systems is a ubiquitous task in medical informatics, where communication servers are traditionally used for referring and transforming data from the source to target systems. The Mirth Connect Server, an open-source communication server, offers, in addition to the exchange functionality, functions for simultaneous manipulation of data. The standard Fast Healthcare Interoperability Resources (FHIR) has recently become increasingly prevalent in national health care systems. FHIR specifies its own standardized mechanisms for transforming data structures using StructureMaps and the FHIR mapping language (FML).</p><p><strong>Objective: </strong>In this study, a generic approach is developed, which allows for the application of declarative mapping rules defined using FML in an exchangeable manner. A transformation engine is required to execute the mapping rules.</p><p><strong>Methods: </strong>FHIR natively defines resources to support the conversion of instance data, such as an FHIR StructureMap. This resource encodes all information required to transform data from a source system to a target system. In our approach, this information is defined in an implementation-independent manner using FML. Once the mapping has been defined, executable Mirth channels are automatically generated from the resources containing the mapping in JavaScript format. These channels can then be deployed to the Mirth Connect Server.</p><p><strong>Results: </strong>The resulting tool is called FML2Mirth, a Java-based transformer that derives Mirth channels from detailed declarative mapping rules based on the underlying StructureMaps. Implementation of the translate functionality is provided by the integration of a terminology server, and to achieve conformity with existing profiles, validation via the FHIR validator is built in. The system was evaluated for its practical use by transforming Labordatenträger version 2 (LDTv.2) laboratory results into Medical Information Object (Medizinisches Informationsobjekt) laboratory reports in accordance with the National Association of Statutory Health Insurance Physicians' specifications and into the HL7 (Health Level Seven) Europe Laboratory Report. The system could generate complex structures, but LDTv.2 lacks some information to fully comply with the specification.</p><p><strong>Conclusions: </strong>The tool for the auto-generation of Mirth channels was successfully presented. Our tests reveal the feasibility of using the complex structures of the mapping language in combination with a terminology server to transform instance data. Although the Mirth Server and the FHIR are well established in medical informatics, the combination offers space for more research, especially with regard to FML. 
Simultaneously, it can be stated that the mapping language still has implementation-related shortcomings that can be compensated by Mirth Connec","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57569"},"PeriodicalIF":3.1,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11508034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applying the Non-Adoption, Abandonment, Scale-up, Spread, and Sustainability Framework Across Implementation Stages to Identify Key Strategies to Facilitate Clinical Decision Support System Integration Within a Large Metropolitan Health Service: Interview and Focus Group Study
Manasha Fernando, Bridget Abell, Steven M McPhail, Zephanie Tyack, Amina Tariq, Sundresan Naicker
<p><strong>Background: </strong>Computerized clinical decision support systems (CDSSs) enhance patient care through real-time, evidence-based guidance for health care professionals. Despite this, the effective implementation of these systems for health services presents multifaceted challenges, leading to inappropriate use and abandonment over the course of time. Using the Non-Adoption, Abandonment, Scale-Up, Spread, and Sustainability (NASSS) framework, this qualitative study examined CDSS adoption in a metropolitan health service, identifying determinants across implementation stages to optimize CDSS integration into health care practice.</p><p><strong>Objective: </strong>This study aims to identify the theory-informed (NASSS) determinants, which included multiple CDSS interventions across a 2-year period, both at the health-service level and at the individual hospital setting, that either facilitate or hinder the application of CDSSs within a metropolitan health service. In addition, this study aimed to map these determinants onto specific stages of the implementation process, thereby developing a system-level understanding of CDSS application across implementation stages.</p><p><strong>Methods: </strong>Participants involved in various stages of the implementation process were recruited (N=30). Participants took part in interviews and focus groups. We used a hybrid inductive-deductive qualitative content analysis and a framework mapping approach to categorize findings into barriers, enablers, or neutral determinants aligned to NASSS framework domains. These determinants were also mapped to implementation stages using the Active Implementation Framework stages approach.</p><p><strong>Results: </strong>Participants comprised clinical adopters (14/30, 47%), organizational champions (5/30, 16%), and those with roles in organizational clinical informatics (5/30, 16%). Most determinants were mapped to the organization level, technology, and adopter subdomains. However, the study findings also demonstrated a relative lack of long-term implementation planning. Consequently, determinants were not uniformly distributed across the stages of implementation, with 61.1% (77/126) identified in the exploration stage, 30.9% (39/126) in the full implementation stage, and 4.7% (6/126) in the installation stages. Stakeholders engaged in more preimplementation and full-scale implementation activities, with fewer cycles of monitoring and iteration activities identified.</p><p><strong>Conclusions: </strong>These findings addressed a substantial knowledge gap in the literature using systems thinking principles to identify the interdependent dynamics of CDSS implementation. A lack of sustained implementation strategies (ie, training and longer-term, adopter-level championing) weakened the sociotechnical network between developers and adopters, leading to communication barriers. More rigorous implementation planning, encompassing all 4 implementation stages, may, in a
{"title":"Applying the Non-Adoption, Abandonment, Scale-up, Spread, and Sustainability Framework Across Implementation Stages to Identify Key Strategies to Facilitate Clinical Decision Support System Integration Within a Large Metropolitan Health Service: Interview and Focus Group Study.","authors":"Manasha Fernando, Bridget Abell, Steven M McPhail, Zephanie Tyack, Amina Tariq, Sundresan Naicker","doi":"10.2196/60402","DOIUrl":"10.2196/60402","url":null,"abstract":"<p><strong>Background: </strong>Computerized clinical decision support systems (CDSSs) enhance patient care through real-time, evidence-based guidance for health care professionals. Despite this, the effective implementation of these systems for health services presents multifaceted challenges, leading to inappropriate use and abandonment over the course of time. Using the Non-Adoption, Abandonment, Scale-Up, Spread, and Sustainability (NASSS) framework, this qualitative study examined CDSS adoption in a metropolitan health service, identifying determinants across implementation stages to optimize CDSS integration into health care practice.</p><p><strong>Objective: </strong>This study aims to identify the theory-informed (NASSS) determinants, which included multiple CDSS interventions across a 2-year period, both at the health-service level and at the individual hospital setting, that either facilitate or hinder the application of CDSSs within a metropolitan health service. In addition, this study aimed to map these determinants onto specific stages of the implementation process, thereby developing a system-level understanding of CDSS application across implementation stages.</p><p><strong>Methods: </strong>Participants involved in various stages of the implementation process were recruited (N=30). Participants took part in interviews and focus groups. We used a hybrid inductive-deductive qualitative content analysis and a framework mapping approach to categorize findings into barriers, enablers, or neutral determinants aligned to NASSS framework domains. These determinants were also mapped to implementation stages using the Active Implementation Framework stages approach.</p><p><strong>Results: </strong>Participants comprised clinical adopters (14/30, 47%), organizational champions (5/30, 16%), and those with roles in organizational clinical informatics (5/30, 16%). Most determinants were mapped to the organization level, technology, and adopter subdomains. However, the study findings also demonstrated a relative lack of long-term implementation planning. Consequently, determinants were not uniformly distributed across the stages of implementation, with 61.1% (77/126) identified in the exploration stage, 30.9% (39/126) in the full implementation stage, and 4.7% (6/126) in the installation stages. Stakeholders engaged in more preimplementation and full-scale implementation activities, with fewer cycles of monitoring and iteration activities identified.</p><p><strong>Conclusions: </strong>These findings addressed a substantial knowledge gap in the literature using systems thinking principles to identify the interdependent dynamics of CDSS implementation. A lack of sustained implementation strategies (ie, training and longer-term, adopter-level championing) weakened the sociotechnical network between developers and adopters, leading to communication barriers. 
More rigorous implementation planning, encompassing all 4 implementation stages, may, in a","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e60402"},"PeriodicalIF":3.1,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11528173/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Medical Entity Recognition in Health Care: Entity Model Quantitative Study
Shengyu Liu, Anran Wang, Xiaolei Xiu, Ming Zhong, Sizhu Wu
<p><strong>Background: </strong>Named entity recognition (NER) models are essential for extracting structured information from unstructured medical texts by identifying entities such as diseases, treatments, and conditions, enhancing clinical decision-making and research. Innovations in machine learning, particularly those involving Bidirectional Encoder Representations From Transformers (BERT)-based deep learning and large language models, have significantly advanced NER capabilities. However, their performance varies across medical datasets due to the complexity and diversity of medical terminology. Previous studies have often focused on overall performance, neglecting specific challenges in medical contexts and the impact of macrofactors like lexical composition on prediction accuracy. These gaps hinder the development of optimized NER models for medical applications.</p><p><strong>Objective: </strong>This study aims to meticulously evaluate the performance of various NER models in the context of medical text analysis, focusing on how complex medical terminology affects entity recognition accuracy. Additionally, we explored the influence of macrofactors on model performance, seeking to provide insights for refining NER models and enhancing their reliability for medical applications.</p><p><strong>Methods: </strong>This study comprehensively evaluated 7 NER models-hidden Markov models, conditional random fields, BERT for Biomedical Text Mining, Big Transformer Models for Efficient Long-Sequence Attention, Decoding-enhanced BERT with Disentangled Attention, Robustly Optimized BERT Pretraining Approach, and Gemma-across 3 medical datasets: Revised Joint Workshop on Natural Language Processing in Biomedicine and its Applications (JNLPBA), BioCreative V CDR, and Anatomical Entity Mention (AnatEM). The evaluation focused on prediction accuracy, resource use (eg, central processing unit and graphics processing unit use), and the impact of fine-tuning hyperparameters. The macrofactors affecting model performance were also screened using the multilevel factor elimination algorithm.</p><p><strong>Results: </strong>The fine-tuned BERT for Biomedical Text Mining, with balanced resource use, generally achieved the highest prediction accuracy across the Revised JNLPBA and AnatEM datasets, with microaverage (AVG_MICRO) scores of 0.932 and 0.8494, respectively, highlighting its superior proficiency in identifying medical entities. Gemma, fine-tuned using the low-rank adaptation technique, achieved the highest accuracy on the BioCreative V CDR dataset with an AVG_MICRO score of 0.9962 but exhibited variability across the other datasets (AVG_MICRO scores of 0.9088 on the Revised JNLPBA and 0.8029 on AnatEM), indicating a need for further optimization. In addition, our analysis revealed that 2 macrofactors, entity phrase length and the number of entity words in each entity phrase, significantly influenced model performance.</p><p><strong>Conclusions: </strong>Th
{"title":"Evaluating Medical Entity Recognition in Health Care: Entity Model Quantitative Study.","authors":"Shengyu Liu, Anran Wang, Xiaolei Xiu, Ming Zhong, Sizhu Wu","doi":"10.2196/59782","DOIUrl":"10.2196/59782","url":null,"abstract":"<p><strong>Background: </strong>Named entity recognition (NER) models are essential for extracting structured information from unstructured medical texts by identifying entities such as diseases, treatments, and conditions, enhancing clinical decision-making and research. Innovations in machine learning, particularly those involving Bidirectional Encoder Representations From Transformers (BERT)-based deep learning and large language models, have significantly advanced NER capabilities. However, their performance varies across medical datasets due to the complexity and diversity of medical terminology. Previous studies have often focused on overall performance, neglecting specific challenges in medical contexts and the impact of macrofactors like lexical composition on prediction accuracy. These gaps hinder the development of optimized NER models for medical applications.</p><p><strong>Objective: </strong>This study aims to meticulously evaluate the performance of various NER models in the context of medical text analysis, focusing on how complex medical terminology affects entity recognition accuracy. Additionally, we explored the influence of macrofactors on model performance, seeking to provide insights for refining NER models and enhancing their reliability for medical applications.</p><p><strong>Methods: </strong>This study comprehensively evaluated 7 NER models-hidden Markov models, conditional random fields, BERT for Biomedical Text Mining, Big Transformer Models for Efficient Long-Sequence Attention, Decoding-enhanced BERT with Disentangled Attention, Robustly Optimized BERT Pretraining Approach, and Gemma-across 3 medical datasets: Revised Joint Workshop on Natural Language Processing in Biomedicine and its Applications (JNLPBA), BioCreative V CDR, and Anatomical Entity Mention (AnatEM). The evaluation focused on prediction accuracy, resource use (eg, central processing unit and graphics processing unit use), and the impact of fine-tuning hyperparameters. The macrofactors affecting model performance were also screened using the multilevel factor elimination algorithm.</p><p><strong>Results: </strong>The fine-tuned BERT for Biomedical Text Mining, with balanced resource use, generally achieved the highest prediction accuracy across the Revised JNLPBA and AnatEM datasets, with microaverage (AVG_MICRO) scores of 0.932 and 0.8494, respectively, highlighting its superior proficiency in identifying medical entities. Gemma, fine-tuned using the low-rank adaptation technique, achieved the highest accuracy on the BioCreative V CDR dataset with an AVG_MICRO score of 0.9962 but exhibited variability across the other datasets (AVG_MICRO scores of 0.9088 on the Revised JNLPBA and 0.8029 on AnatEM), indicating a need for further optimization. 
In addition, our analysis revealed that 2 macrofactors, entity phrase length and the number of entity words in each entity phrase, significantly influenced model performance.</p><p><strong>Conclusions: </strong>Th","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e59782"},"PeriodicalIF":3.1,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11528166/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semiology Extraction and Machine Learning-Based Classification of Electronic Health Records for Patients With Epilepsy: Retrospective Analysis
Yilin Xia, Mengqiao He, Sijia Basang, Leihao Sha, Zijie Huang, Ling Jin, Yifei Duan, Yusha Tang, Hua Li, Wanlin Lai, Lei Chen
Background: Obtaining and describing semiology efficiently and classifying seizure types correctly are crucial for the diagnosis and treatment of epilepsy. Nevertheless, related informatics resources and decision support tools remain inadequate.
Objective: In this study, we developed a symptom entity extraction tool and an epilepsy semiology ontology (ESO) and used machine learning to achieve automated binary classification of epilepsy.
Methods: Using present history data from electronic health records at the Southwest Epilepsy Center in China, we constructed an ESO and a symptom entity extraction tool to extract seizure duration, seizure symptoms, and seizure frequency from unstructured text by combining manual annotation with natural language processing techniques. In addition, we achieved automatic classification of patients in the study cohort with high accuracy, based on the extracted seizure feature data, using multiple machine learning methods.
Results: Data included present history from 10,925 cases between 2010 and 2020. Six annotators labeled a total of 2500 texts to obtain 5844 words of semiology and construct an ESO with 702 terms. Based on the ontology, the extraction tool achieved an accuracy rate of 85% in symptom extraction. Furthermore, we trained a stacking ensemble learning model combining XGBoost and random forest, achieving an F1-score of 75.03%. The random forest model had the highest area under the curve (0.985).
Conclusions: This work demonstrated the feasibility of natural language processing-assisted structured extraction from epilepsy medical record texts and of the downstream tasks, providing open ontology resources for subsequent related work.
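A minimal sketch of the stacking setup described in the Results, with XGBoost and random forest base learners, follows. The data are synthetic stand-ins for the extracted seizure features (duration, symptoms, frequency), so only the modeling pattern, not the reported scores, mirrors the study; the xgboost package is assumed to be installed:

```python
# Hedged sketch of a stacking ensemble combining XGBoost and random forest,
# as described above, evaluated with F1 and area under the curve.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-in for the extracted seizure feature matrix.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("xgb", XGBClassifier(n_estimators=100, eval_metric="logloss")),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)

proba = stack.predict_proba(X_te)[:, 1]
print("F1:", f1_score(y_te, stack.predict(X_te)))
print("AUC:", roc_auc_score(y_te, proba))
```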
{"title":"Semiology Extraction and Machine Learning-Based Classification of Electronic Health Records for Patients With Epilepsy: Retrospective Analysis.","authors":"Yilin Xia, Mengqiao He, Sijia Basang, Leihao Sha, Zijie Huang, Ling Jin, Yifei Duan, Yusha Tang, Hua Li, Wanlin Lai, Lei Chen","doi":"10.2196/57727","DOIUrl":"https://doi.org/10.2196/57727","url":null,"abstract":"<p><strong>Background: </strong>Obtaining and describing semiology efficiently and classifying seizure types correctly are crucial for the diagnosis and treatment of epilepsy. Nevertheless, there exists an inadequacy in related informatics resources and decision support tools.</p><p><strong>Objective: </strong>We developed a symptom entity extraction tool and an epilepsy semiology ontology (ESO) and used machine learning to achieve an automated binary classification of epilepsy in this study.</p><p><strong>Methods: </strong>Using present history data of electronic health records from the Southwest Epilepsy Center in China, we constructed an ESO and a symptom-entity extraction tool to extract seizure duration, seizure symptoms, and seizure frequency from the unstructured text by combining manual annotation with natural language processing techniques. In addition, we achieved automatic classification of patients in the study cohort with high accuracy based on the extracted seizure feature data using multiple machine learning methods.</p><p><strong>Results: </strong>Data included present history from 10,925 cases between 2010 and 2020. Six annotators labeled a total of 2500 texts to obtain 5844 words of semiology and construct an ESO with 702 terms. Based on the ontology, the extraction tool achieved an accuracy rate of 85% in symptom extraction. Furthermore, we trained a stacking ensemble learning model combining XGBoost and random forest with an F1-score of 75.03%. The random forest model had the highest area under the curve (0.985).</p><p><strong>Conclusions: </strong>This work demonstrated the feasibility of natural language processing-assisted structural extraction of epilepsy medical record texts and downstream tasks, providing open ontology resources for subsequent related work.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57727"},"PeriodicalIF":3.1,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11501417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Effects of Electronic Health Records on Medical Error Reduction: Extension of the DeLone and McLean Information System Success Model
Bester Chimbo, Lovemore Motsi
Background: Medical errors are becoming a major problem for health care providers and those who design health policies. These errors cause patients' illnesses to worsen over time and can make recovery impossible. For the benefit of patients and the welfare of health care providers, a decrease in these errors is required to maintain safe, high-quality patient care.
Objective: This study aimed to improve the ability of health care professionals to diagnose diseases and reduce medical errors.
Methods: Data collection was performed at Dr George Mukhari Academic Hospital using convenience sampling. In total, 300 health care professionals, including doctors, dentists, pharmacists, physiologists, and nurses, were given a self-administered questionnaire. To test the study hypotheses, multiple linear regression was used to evaluate the empirical data.
Results: In the sample of 300 health care professionals, no significant correlation was found between medical error reduction (MER) and knowledge quality (KQ) (β=.043, P=.48). A nonsignificant negative relationship existed between MER and information quality (IQ) (β=-.080, P=.19). However, a significant positive relationship was observed between MER and electronic health records (EHR; β=.125, 95% CI 0.005-0.245, P=.042).
Conclusions: Increasing health care professionals' access to patients' medical records may significantly improve patient health and well-being. The effectiveness of health care organizations' operations can also be increased through better health information systems. To lower medical errors and enhance patient outcomes, policy makers should treat financing and support for EHR adoption as a top priority. Health care administrators should also concentrate on providing staff with the training they need to operate these systems efficiently. Empirical surveys in other public and private hospitals can be used to further test the validated survey instrument.
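The β and P values in the Results follow standard multiple linear regression reporting. A hedged sketch of the same analysis pattern on simulated data (the variable names mirror the study's constructs; the data and the built-in effect size do not come from the study):

```python
# Illustrative multiple linear regression in the style reported above:
# medical error reduction (MER) regressed on knowledge quality (KQ),
# information quality (IQ), and EHR use. Data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300  # same sample size as the study, purely for flavor

KQ = rng.normal(size=n)
IQ = rng.normal(size=n)
EHR = rng.normal(size=n)
# Simulated outcome with an EHR effect built in and no KQ/IQ effect.
MER = 0.125 * EHR + rng.normal(scale=1.0, size=n)

X = sm.add_constant(np.column_stack([KQ, IQ, EHR]))
model = sm.OLS(MER, X).fit()
# The summary reports coefficients, 95% CIs, and P values, matching the
# beta/CI/P reporting style used in the Results section.
print(model.summary(xname=["const", "KQ", "IQ", "EHR"]))
```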
{"title":"The Effects of Electronic Health Records on Medical Error Reduction: Extension of the DeLone and McLean Information System Success Model.","authors":"Bester Chimbo, Lovemore Motsi","doi":"10.2196/54572","DOIUrl":"10.2196/54572","url":null,"abstract":"<p><strong>Background: </strong>Medical errors are becoming a major problem for health care providers and those who design health policies. These errors cause patients' illnesses to worsen over time and can make recovery impossible. For the benefit of patients and the welfare of health care providers, a decrease in these errors is required to maintain safe, high-quality patient care.</p><p><strong>Objective: </strong>This study aimed to improve the ability of health care professionals to diagnose diseases and reduce medical errors.</p><p><strong>Methods: </strong>Data collection was performed at Dr George Mukhari Academic Hospital using convenience sampling. In total, 300 health care professionals were given a self-administered questionnaire, including doctors, dentists, pharmacists, physiologists, and nurses. To test the study hypotheses, multiple linear regression was used to evaluate empirical data.</p><p><strong>Results: </strong>In the sample of 300 health care professionals, no significant correlation was found between medical error reduction (MER) and knowledge quality (KQ) (β=.043, P=.48). A nonsignificant negative relationship existed between MER and information quality (IQ) (β=-.080, P=.19). However, a significant positive relationship was observed between MER and electronic health records (EHR; β=.125, 95% CI 0.005-0.245, P=.042).</p><p><strong>Conclusions: </strong>Increasing patient access to medical records for health care professionals may significantly improve patient health and well-being. The effectiveness of health care organizations' operations can also be increased through better health information systems. To lower medical errors and enhance patient outcomes, policy makers should provide financing and support for EHR adoption as a top priority. Health care administrators should also concentrate on providing staff with the training they need to operate these systems efficiently. Empirical surveys in other public and private hospitals can be used to further test the validated survey instrument.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e54572"},"PeriodicalIF":3.1,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525084/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of Spatial Analysis on Electronic Health Records to Characterize Patient Phenotypes: Systematic Review
Abolfazl Mollalo, Bashir Hamidi, Leslie A Lenert, Alexander V Alekseyenko
Background: Electronic health records (EHRs) commonly contain patient addresses that provide valuable data for geocoding and spatial analysis, enabling more comprehensive descriptions of individual patients for clinical purposes. Despite the widespread use of EHRs in clinical decision support and interventions, no systematic review has examined the extent to which spatial analysis is used to characterize patient phenotypes.
Objective: This study reviews advanced spatial analyses that used individual-level health data from EHRs within the United States to characterize patient phenotypes.
Methods: We systematically evaluated English-language, peer-reviewed studies from the PubMed/MEDLINE, Scopus, Web of Science, and Google Scholar databases from inception to August 20, 2023, without imposing constraints on study design or specific health domains.
Results: A substantial proportion of studies (>85%) were limited to geocoding or basic mapping without implementing advanced spatial statistical analysis, leaving only 49 studies that met the eligibility criteria. These studies used diverse spatial methods, with a predominant focus on clustering techniques, while spatiotemporal analysis (frequentist and Bayesian) and modeling were less common. A noteworthy surge (n=42, 86%) in publications was observed after 2017. The publications investigated a variety of adult and pediatric clinical areas, including infectious disease, endocrinology, and cardiology, using phenotypes defined over a range of data domains such as demographics, diagnoses, and visits. The primary health outcomes investigated were asthma, hypertension, and diabetes. Notably, patient phenotypes involving genomics, imaging, and notes were limited.
Conclusions: This review underscores the growing interest in spatial analysis of EHR-derived data and highlights knowledge gaps in clinical health, phenotype domains, and spatial methodologies. We suggest that future research should focus on addressing these gaps and harnessing spatial analysis to enhance individual patient contexts and clinical decision support.
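Of the spatial methods the review found, clustering techniques were predominant. A minimal sketch of density-based clustering of geocoded patient coordinates follows; the points are synthetic, and DBSCAN is one illustrative choice rather than a method attributed to any reviewed study:

```python
# Hedged sketch of spatial clustering on geocoded coordinates of the kind
# derived from EHR patient addresses. Points are synthetic; a real analysis
# would geocode addresses first and handle privacy constraints.
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic (latitude, longitude) pairs, in degrees, around two centers.
rng = np.random.default_rng(1)
cluster_a = rng.normal(loc=[32.78, -79.93], scale=0.01, size=(40, 2))
cluster_b = rng.normal(loc=[34.00, -81.03], scale=0.01, size=(40, 2))
coords = np.vstack([cluster_a, cluster_b])

# The haversine metric expects radians; eps is a great-circle distance
# expressed as distance / Earth radius.
kms_per_radian = 6371.0
eps_km = 5.0
db = DBSCAN(eps=eps_km / kms_per_radian, min_samples=5, metric="haversine")
labels = db.fit_predict(np.radians(coords))

n_clusters = len(set(labels) - {-1})
print("clusters found:", n_clusters, "| noise points:", int((labels == -1).sum()))
```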
{"title":"Application of Spatial Analysis on Electronic Health Records to Characterize Patient Phenotypes: Systematic Review.","authors":"Abolfazl Mollalo, Bashir Hamidi, Leslie A Lenert, Alexander V Alekseyenko","doi":"10.2196/56343","DOIUrl":"10.2196/56343","url":null,"abstract":"<p><strong>Background: </strong>Electronic health records (EHRs) commonly contain patient addresses that provide valuable data for geocoding and spatial analysis, enabling more comprehensive descriptions of individual patients for clinical purposes. Despite the widespread use of EHRs in clinical decision support and interventions, no systematic review has examined the extent to which spatial analysis is used to characterize patient phenotypes.</p><p><strong>Objective: </strong>This study reviews advanced spatial analyses that used individual-level health data from EHRs within the United States to characterize patient phenotypes.</p><p><strong>Methods: </strong>We systematically evaluated English-language, peer-reviewed studies from the PubMed/MEDLINE, Scopus, Web of Science, and Google Scholar databases from inception to August 20, 2023, without imposing constraints on study design or specific health domains.</p><p><strong>Results: </strong>A substantial proportion of studies (>85%) were limited to geocoding or basic mapping without implementing advanced spatial statistical analysis, leaving only 49 studies that met the eligibility criteria. These studies used diverse spatial methods, with a predominant focus on clustering techniques, while spatiotemporal analysis (frequentist and Bayesian) and modeling were less common. A noteworthy surge (n=42, 86%) in publications was observed after 2017. The publications investigated a variety of adult and pediatric clinical areas, including infectious disease, endocrinology, and cardiology, using phenotypes defined over a range of data domains such as demographics, diagnoses, and visits. The primary health outcomes investigated were asthma, hypertension, and diabetes. Notably, patient phenotypes involving genomics, imaging, and notes were limited.</p><p><strong>Conclusions: </strong>This review underscores the growing interest in spatial analysis of EHR-derived data and highlights knowledge gaps in clinical health, phenotype domains, and spatial methodologies. We suggest that future research should focus on addressing these gaps and harnessing spatial analysis to enhance individual patient contexts and clinical decision support.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e56343"},"PeriodicalIF":3.1,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522649/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bridging Data Models in Health Care With a Novel Intermediate Query Format for Feasibility Queries: Mixed Methods Study
Lorenz Rosenau, Julian Gruendner, Alexander Kiel, Thomas Köhler, Bastian Schaffer, Raphael W Majeed
Background: To advance research with clinical data, it is essential to make access to the available data as fast and easy as possible for researchers, which is especially challenging for data from different source systems within and across institutions. Over the years, many research repositories and data standards have been created. One of these is the Fast Healthcare Interoperability Resources (FHIR) standard, used by the German Medical Informatics Initiative (MII) to harmonize and standardize data across university hospitals in Germany. One of the first steps to make these data available is to allow researchers to create feasibility queries to determine the data availability for a specific research question. Given the heterogeneity of different query languages to access different data across and even within standards such as FHIR (eg, CQL and FHIR Search), creating an intermediate query syntax for feasibility queries reduces the complexity of query translation and improves interoperability across different research repositories and query languages.
Objective: This study describes the creation and implementation of an intermediate query syntax for feasibility queries and how it integrates into the federated German health research portal (Forschungsdatenportal Gesundheit) and the MII.
Methods: We analyzed the requirements for feasibility queries and the feasibility tools that are currently available in research repositories. Based on this analysis, we developed an intermediate query syntax that can be easily translated into different research repository-specific query languages.
Results: The resulting Clinical Cohort Definition Language (CCDL) for feasibility queries combines inclusion criteria in a conjunctive normal form and exclusion criteria in a disjunctive normal form, allowing for additional filters like time or numerical restrictions. The inclusion and exclusion results are combined via an expression to specify feasibility queries. We defined a JSON schema for the CCDL, generated an ontology, and demonstrated the use and translatability of the CCDL across multiple studies and real-world use cases.
Conclusions: We developed and evaluated a structured query syntax for feasibility queries and demonstrated its use in a real-world example as part of a research platform across 39 German university hospitals.
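The CNF/DNF combination at the core of the CCDL can be made concrete with a small evaluator: inclusion criteria form a conjunction of disjunctions, exclusion criteria a disjunction of conjunctions, and a cohort candidate is feasible when included and not excluded. The criterion format below is a hypothetical miniature for illustration, not the actual CCDL JSON schema:

```python
# Hypothetical miniature of the CCDL's logical structure: inclusion in
# conjunctive normal form, exclusion in disjunctive normal form. The real
# CCDL is a richer JSON format with ontology-coded criteria and filters.
from typing import Callable

Criterion = Callable[[dict], bool]

def feasible(patient: dict,
             inclusion_cnf: list[list[Criterion]],
             exclusion_dnf: list[list[Criterion]]) -> bool:
    # Inclusion: every clause must hold, where a clause is an OR of criteria.
    included = all(any(c(patient) for c in clause) for clause in inclusion_cnf)
    # Exclusion: any clause excludes, where a clause is an AND of criteria.
    excluded = any(all(c(patient) for c in clause) for clause in exclusion_dnf)
    return included and not excluded

# Example criteria, including a numeric filter of the kind the CCDL allows.
has_diabetes = lambda p: "E11" in p["conditions"]  # ICD-10 type 2 diabetes
has_ckd = lambda p: "N18" in p["conditions"]       # ICD-10 chronic kidney disease
age_over_65 = lambda p: p["age"] > 65
on_dialysis = lambda p: "dialysis" in p["procedures"]

patient = {"age": 70, "conditions": {"E11"}, "procedures": set()}
print(feasible(
    patient,
    inclusion_cnf=[[has_diabetes, has_ckd], [age_over_65]],  # (E11 OR N18) AND age>65
    exclusion_dnf=[[on_dialysis]],                           # exclude if on dialysis
))  # -> True
```

In the platform itself, such a query would be translated from the intermediate format into CQL or FHIR Search per repository, which is exactly the translation burden the intermediate syntax is designed to contain.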
{"title":"Bridging Data Models in Health Care With a Novel Intermediate Query Format for Feasibility Queries: Mixed Methods Study.","authors":"Lorenz Rosenau, Julian Gruendner, Alexander Kiel, Thomas Köhler, Bastian Schaffer, Raphael W Majeed","doi":"10.2196/58541","DOIUrl":"10.2196/58541","url":null,"abstract":"<p><strong>Background: </strong>To advance research with clinical data, it is essential to make access to the available data as fast and easy as possible for researchers, which is especially challenging for data from different source systems within and across institutions. Over the years, many research repositories and data standards have been created. One of these is the Fast Healthcare Interoperability Resources (FHIR) standard, used by the German Medical Informatics Initiative (MII) to harmonize and standardize data across university hospitals in Germany. One of the first steps to make these data available is to allow researchers to create feasibility queries to determine the data availability for a specific research question. Given the heterogeneity of different query languages to access different data across and even within standards such as FHIR (eg, CQL and FHIR Search), creating an intermediate query syntax for feasibility queries reduces the complexity of query translation and improves interoperability across different research repositories and query languages.</p><p><strong>Objective: </strong>This study describes the creation and implementation of an intermediate query syntax for feasibility queries and how it integrates into the federated German health research portal (Forschungsdatenportal Gesundheit) and the MII.</p><p><strong>Methods: </strong>We analyzed the requirements for feasibility queries and the feasibility tools that are currently available in research repositories. Based on this analysis, we developed an intermediate query syntax that can be easily translated into different research repository-specific query languages.</p><p><strong>Results: </strong>The resulting Clinical Cohort Definition Language (CCDL) for feasibility queries combines inclusion criteria in a conjunctive normal form and exclusion criteria in a disjunctive normal form, allowing for additional filters like time or numerical restrictions. The inclusion and exclusion results are combined via an expression to specify feasibility queries. We defined a JSON schema for the CCDL, generated an ontology, and demonstrated the use and translatability of the CCDL across multiple studies and real-world use cases.</p><p><strong>Conclusions: </strong>We developed and evaluated a structured query syntax for feasibility queries and demonstrated its use in a real-world example as part of a research platform across 39 German university hospitals.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e58541"},"PeriodicalIF":3.1,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493108/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}