Eric Wündisch, Peter Hufnagl, Peter Brunecker, Sophie Meier Zu Ummeln, Sarah Träger, Fabian Prasser, Joachim Weber
{"title":"Authors' Reply: The University Medicine Greifswald's Trusted Third Party Dispatcher: State-of-the-Art Perspective Into Comprehensive Architectures and Complex Research Workflows.","authors":"Eric Wündisch, Peter Hufnagl, Peter Brunecker, Sophie Meier Zu Ummeln, Sarah Träger, Fabian Prasser, Joachim Weber","doi":"10.2196/67429","DOIUrl":"10.2196/67429","url":null,"abstract":"","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e67429"},"PeriodicalIF":3.1,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623781/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142755442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martin Bialke, Dana Stahl, Torsten Leddig, Wolfgang Hoffmann
{"title":"The University Medicine Greifswald's Trusted Third Party Dispatcher: State-of-the-Art Perspective Into Comprehensive Architectures and Complex Research Workflows.","authors":"Martin Bialke, Dana Stahl, Torsten Leddig, Wolfgang Hoffmann","doi":"10.2196/65784","DOIUrl":"10.2196/65784","url":null,"abstract":"","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e65784"},"PeriodicalIF":3.1,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623778/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142755350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Social media platforms allow individuals to openly gather, communicate, and share information about their interactions with health care services, becoming an essential supplemental means of understanding patient experience.
Objective: We aimed to identify common discussion topics related to health care experience from the public's perspective and to determine areas of concern from patients' perspectives that health care providers should act on.
Methods: This study conducted a spatiotemporal analysis of the volume, sentiment, and topic of patient experience-related posts on the Weibo platform developed by Sina Corporation. We applied a supervised machine learning approach including human annotation and machine learning-based models for topic modeling and sentiment analysis of the public discourse. A multiclassifier voting method based on logistic regression, multinomial naïve Bayes, and random forest was used.
Results: A total of 4008 posts were manually classified into patient experience topics. A patient experience theme framework was developed. The accuracy, precision, recall, and F-measure of the method integrating logistic regression, multinomial naïve Bayes, and random forest for patient experience themes were 0.93, 0.95, 0.80, 0.77, and 0.84, respectively, indicating a satisfactory prediction. The sentiment analysis revealed that negative sentiment posts constituted the highest proportion (3319/4008, 82.81%). Twenty patient experience themes were discussed on the social media platform. The majority of the posts described the interpersonal aspects of care (2947/4008, 73.53%); the five most frequently discussed topics were "health care professionals' attitude," "access to care," "communication, information, and education," "technical competence," and "efficacy of treatment."
Conclusions: Hospital administrators and clinicians should consider the value of social media and pay attention to what patients and their family members are communicating on social media. To increase the utility of these data, a machine learning algorithm can be used for topic modeling. The results of this study highlighted the interpersonal and functional aspects of care, especially the interpersonal aspects, which are often the "moment of truth" during a service encounter in which patients make a critical evaluation of hospital services.
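The multiclassifier voting step named in the Methods can be illustrated with a minimal sketch. The abstract does not say whether hard (majority) or soft (probability-averaged) voting was used, so hard voting is assumed here. The function name, the tie-breaking rule, and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties go to the label voted for by the earliest model in the list
    (an arbitrary rule chosen for this sketch).
    """
    n_items = len(predictions_per_model[0])
    if any(len(p) != n_items for p in predictions_per_model):
        raise ValueError("all models must predict the same number of items")

    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in model order, that reaches the top count.
        winner = next(v for v in votes if counts[v] == top)
        combined.append(winner)
    return combined


# Hypothetical outputs of the three classifiers on four posts.
logreg = ["attitude", "access", "attitude", "efficacy"]
naive_bayes = ["attitude", "attitude", "access", "efficacy"]
random_forest = ["access", "access", "attitude", "competence"]

print(majority_vote([logreg, naive_bayes, random_forest]))
# ['attitude', 'access', 'attitude', 'efficacy']
```

With three voters and many theme labels, three-way disagreements are possible, which is why the sketch needs an explicit tie rule; soft voting over predicted probabilities would avoid that at the cost of requiring calibrated scores.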
{"title":"Analyzing Patient Experience on Weibo: Machine Learning Approach to Topic Modeling and Sentiment Analysis.","authors":"Xiao Chen, Zhiyun Shen, Tingyu Guan, Yuchen Tao, Yichen Kang, Yuxia Zhang","doi":"10.2196/59249","DOIUrl":"10.2196/59249","url":null,"abstract":"<p><strong>Background: </strong>Social media platforms allow individuals to openly gather, communicate, and share information about their interactions with health care services, becoming an essential supplemental means of understanding patient experience.</p><p><strong>Objective: </strong>We aimed to identify common discussion topics related to health care experience from the public's perspective and to determine areas of concern from patients' perspectives that health care providers should act on.</p><p><strong>Methods: </strong>This study conducted a spatiotemporal analysis of the volume, sentiment, and topic of patient experience-related posts on the Weibo platform developed by Sina Corporation. We applied a supervised machine learning approach including human annotation and machine learning-based models for topic modeling and sentiment analysis of the public discourse. A multiclassifier voting method based on logistic regression, multinomial naïve Bayes, and random forest was used.</p><p><strong>Results: </strong>A total of 4008 posts were manually classified into patient experience topics. A patient experience theme framework was developed. The accuracy, precision, recall, and F-measure of the method integrating logistic regression, multinomial naïve Bayes, and random forest for patient experience themes were 0.93, 0.95, 0.80, 0.77, and 0.84, respectively, indicating a satisfactory prediction. The sentiment analysis revealed that negative sentiment posts constituted the highest proportion (3319/4008, 82.81%). Twenty patient experience themes were discussed on the social media platform. 
The majority of the posts described the interpersonal aspects of care (2947/4008, 73.53%); the five most frequently discussed topics were \"health care professionals' attitude,\" \"access to care,\" \"communication, information, and education,\" \"technical competence,\" and \"efficacy of treatment.\"</p><p><strong>Conclusions: </strong>Hospital administrators and clinicians should consider the value of social media and pay attention to what patients and their family members are communicating on social media. To increase the utility of these data, a machine learning algorithm can be used for topic modeling. The results of this study highlighted the interpersonal and functional aspects of care, especially the interpersonal aspects, which are often the \"moment of truth\" during a service encounter in which patients make a critical evaluation of hospital services.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e59249"},"PeriodicalIF":3.1,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623958/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142755433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laura Gosselin, Alexandre Maes, Kevin Eyer, Badisse Dahamna, Flavien Disson, Stefan Darmoni, Julien Wils, Julien Grosjean
Background: The cytochrome P450 (CYP450) enzymatic system is a group of enzymes, present in the liver, that are involved in the metabolism of drugs. The literature records instances of underdosing caused by the concurrent administration of another drug that strongly induces the cytochrome for which the first drug is a substrate, and of overdosing caused by strong inhibition. IT solutions have been proposed to raise awareness among prescribers to mitigate these interactions.
Objective: This study aimed to develop a drug interaction dashboard for Cytochrome-mediated drug interactions (DIDC) using a health care data warehouse to display results that are easily readable and interpretable by clinical experts.
Methods: The initial step involved defining requirements with expert pharmacologists. An existing model of interactions involving CYP450 was used. A program for the automatic detection of cytochrome-mediated drug interactions (DIs) was developed. Finally, the development and visualization of the DIDC were carried out by an IT engineer. An evaluation of the tool was carried out.
Results: The development of the DIDC was successfully completed. It automatically compiled cytochrome-mediated DIs in a comprehensive table and provided a dedicated dashboard for each potential DI. The most frequent interaction involved paracetamol and carbamazepine with CYP450 3A4 (n=50 patients). The prescription of tacrolimus with CYP3A5 genotyping pertained to 675 patients. Two experts qualitatively evaluated the tool, resulting in overall satisfaction scores of 6 and 5 out of 7, respectively.
Conclusions: At our hospital, measurements of molecules that could have altered concentrations due to cytochrome-mediated DIs are not systematic. These DIs can lead to serious clinical consequences. The purpose of this DIDC is to provide an overall view and raise awareness among prescribers about the importance of measuring concentrations of specific drugs and metabolites. Ultimately, the tool could lead to an individualized approach and become a prescription support tool if integrated into prescription assistance software.
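The automatic detection step described in the Methods can be sketched as follows. This is not the authors' program: the `Prescription` record, the `CYP_ROLES` table, and the overlap rule are all hypothetical. The roles assigned to paracetamol and carbamazepine are assumptions made so that the sketch reproduces the pairing named in the Results, and each drug is given a single role for simplicity.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Prescription:
    patient_id: str
    drug: str
    start: date
    end: date


# Illustrative knowledge table only: drug -> (cytochrome, role).
# A real interaction model would allow several roles per drug.
CYP_ROLES = {
    "paracetamol": ("CYP3A4", "substrate"),
    "carbamazepine": ("CYP3A4", "inducer"),
}


def overlaps(a, b):
    """True if the two prescription periods share at least one day."""
    return a.start <= b.end and b.start <= a.end


def detect_interactions(prescriptions):
    """Return (patient_id, substrate, modifier, cytochrome, modifier_role) tuples."""
    by_patient = {}
    for p in prescriptions:
        by_patient.setdefault(p.patient_id, []).append(p)

    hits = []
    for patient_id, items in by_patient.items():
        for a, b in combinations(items, 2):
            if not overlaps(a, b):
                continue
            # Check the pair in both directions: either drug may be the substrate.
            for sub, mod in ((a, b), (b, a)):
                sub_info = CYP_ROLES.get(sub.drug)
                mod_info = CYP_ROLES.get(mod.drug)
                if sub_info is None or mod_info is None:
                    continue
                same_cyp = sub_info[0] == mod_info[0]
                if (same_cyp and sub_info[1] == "substrate"
                        and mod_info[1] in ("inducer", "inhibitor")):
                    hits.append((patient_id, sub.drug, mod.drug,
                                 sub_info[0], mod_info[1]))
    return hits


rx = [
    Prescription("p1", "paracetamol", date(2024, 1, 1), date(2024, 1, 10)),
    Prescription("p1", "carbamazepine", date(2024, 1, 5), date(2024, 2, 1)),
    Prescription("p2", "paracetamol", date(2024, 1, 1), date(2024, 1, 3)),
    Prescription("p2", "carbamazepine", date(2024, 3, 1), date(2024, 3, 9)),
]
print(detect_interactions(rx))
```

Only patient `p1` is flagged, because the two prescriptions for `p2` do not overlap in time. Counting flagged patients per drug pair would give the kind of frequency table the dashboard is described as compiling.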
{"title":"Design and Implementation of a Dashboard for Drug Interactions Mediated by Cytochromes Using a Health Care Data Warehouse in a University Hospital Center: Development Study.","authors":"Laura Gosselin, Alexandre Maes, Kevin Eyer, Badisse Dahamna, Flavien Disson, Stefan Darmoni, Julien Wils, Julien Grosjean","doi":"10.2196/57705","DOIUrl":"10.2196/57705","url":null,"abstract":"<p><strong>Background: </strong>The enzymatic system of cytochrome P450 (CYP450) is a group of enzymes involved in the metabolism of drugs present in the liver. Literature records instances of underdosing of drugs due to the concurrent administration of another drug that strongly induces the same cytochrome for which the first drug is a substrate and overdosing due to strong inhibition. IT solutions have been proposed to raise awareness among prescribers to mitigate these interactions.</p><p><strong>Objective: </strong>This study aimed to develop a drug interaction dashboard for Cytochrome-mediated drug interactions (DIDC) using a health care data warehouse to display results that are easily readable and interpretable by clinical experts.</p><p><strong>Methods: </strong>The initial step involved defining requirements with expert pharmacologists. An existing model of interactions involving the (CYP450) was used. A program for the automatic detection of cytochrome-mediated drug interactions (DI) was developed. Finally, the development and visualization of the DIDC were carried out by an IT engineer. An evaluation of the tool was carried out.</p><p><strong>Results: </strong>The development of the DIDC was successfully completed. It automatically compiled cytochrome-mediated DIs in a comprehensive table and provided a dedicated dashboard for each potential DI. The most frequent interaction involved paracetamol and carbamazepine with CYP450 3A4 (n=50 patients). The prescription of tacrolimus with CYP3A5 genotyping pertained to 675 patients. 
Two experts qualitatively evaluated the tool, resulting in overall satisfaction scores of 6 and 5 out of 7, respectively.</p><p><strong>Conclusions: </strong>At our hospital, measurements of molecules that could have altered concentrations due to cytochrome-mediated DIs are not systematic. These DIs can lead to serious clinical consequences. The purpose of this DIDC is to provide an overall view and raise awareness among prescribers about the importance of measuring concentrations of specific drugs and metabolites. Ultimately, the tool could lead to an individualized approach and become a prescription support tool if integrated into prescription assistance software.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57705"},"PeriodicalIF":3.1,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11620019/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142752542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Atrial fibrillation (AF) is a progressive disease, and its clinical type is classified according to the AF duration: paroxysmal AF, persistent AF (PeAF; AF duration of less than 1 year), and long-standing persistent AF (AF duration of more than 1 year). When considering the indication for catheter ablation, having a long AF duration is considered a risk factor for recurrence, and therefore, the duration of AF is an important factor in determining the treatment strategy for PeAF.
Objective: This study aims to improve the accuracy of the cardiologists' diagnosis of the AF duration, and the steps to achieve this goal are to develop a predictive model of the AF duration and validate the support performance of the prediction model.
Methods: The study included 272 patients with PeAF (aged 20-90 years), with data obtained between January 1, 2015, and December 31, 2023. Of those, 189 (69.5%) were included in the study, excluding 83 (30.5%) who met the exclusion criteria. Of the 189 patients included, 145 (76.7%) were used as training data to build the machine learning (ML) model and 44 (23.3%) were used as test data for predictive ability of the ML model. Using a questionnaire, 10 cardiologists (group A) evaluated whether the test data (44 patients) included AF of more than a 1-year duration (phase 1). Next, the same questionnaire was performed again after providing the ML model's answer (phase 2). Subsequently, another 10 cardiologists (group B) were shown the test results of group A, were made aware of the limitations of their own diagnostic abilities, and were then administered the same 2-stage test as group A.
Results: The prediction results with the ML model using the test data provided 81.8% accuracy (72% sensitivity and 89% specificity). The mean percentage of correct answers in group A was 63.9% (SD 9.6%) for phase 1 and improved to 71.6% (SD 9.3%) for phase 2 (P=.01). The mean percentage of correct answers in group B was 59.8% (SD 5.3%) for phase 1 and improved to 68.2% (SD 5.9%) for phase 2 (P=.007). The mean percentage of answers that differed from the ML model's prediction for phase 2 (percentage of answers where cardiologists did not trust the ML model and believed their own determination) was 17.3% (SD 10.3%) in group A and 20.9% (SD 5%) in group B and was not significantly different (P=.85).
Conclusions: ML models predicting AF duration improved the diagnostic ability of cardiologists. However, cardiologists did not entirely rely on the ML model's prediction, even if they were aware of their diagnostic capability limitations.
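The proportions reported above can be rechecked from the counts given in the Methods and Results. The sketch below recomputes them and also states the usual definitions of accuracy, sensitivity, and specificity. The figure of 36 correct predictions is an inference from 81.8% of 44 and is not stated in the abstract. The confusion-matrix counts passed to `diagnostic_metrics` are a toy example, not the study's data.

```python
def pct(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


def diagnostic_metrics(tp, fn, tn, fp):
    """Standard definitions from a 2x2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Counts taken from the abstract.
print(pct(189, 272), pct(83, 272))   # included vs excluded patients
print(pct(145, 189), pct(44, 189))   # training vs test split
# 36 of 44 is the only whole number of correct answers that rounds to 81.8%.
print(pct(36, 44))

# Toy counts for illustration only (hypothetical, not the study's results).
print(diagnostic_metrics(tp=8, fn=2, tn=9, fp=1))
```

With a test set of 44 patients, each additional correct or incorrect prediction moves the accuracy by about 2.3 percentage points, which is worth keeping in mind when comparing the model with the cardiologists' scores.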
{"title":"Using Machine Learning to Predict the Duration of Atrial Fibrillation: Model Development and Validation.","authors":"Satoshi Shimoo, Keitaro Senoo, Taku Okawa, Kohei Kawai, Masahiro Makino, Jun Munakata, Nobunari Tomura, Hibiki Iwakoshi, Tetsuro Nishimura, Hirokazu Shiraishi, Keiji Inoue, Satoaki Matoba","doi":"10.2196/63795","DOIUrl":"10.2196/63795","url":null,"abstract":"<p><strong>Background: </strong>Atrial fibrillation (AF) is a progressive disease, and its clinical type is classified according to the AF duration: paroxysmal AF, persistent AF (PeAF; AF duration of less than 1 year), and long-standing persistent AF (AF duration of more than 1 year). When considering the indication for catheter ablation, having a long AF duration is considered a risk factor for recurrence, and therefore, the duration of AF is an important factor in determining the treatment strategy for PeAF.</p><p><strong>Objective: </strong>This study aims to improve the accuracy of the cardiologists' diagnosis of the AF duration, and the steps to achieve this goal are to develop a predictive model of the AF duration and validate the support performance of the prediction model.</p><p><strong>Methods: </strong>The study included 272 patients with PeAF (aged 20-90 years), with data obtained between January 1, 2015, and December 31, 2023. Of those, 189 (69.5%) were included in the study, excluding 83 (30.5%) who met the exclusion criteria. Of the 189 patients included, 145 (76.7%) were used as training data to build the machine learning (ML) model and 44 (23.3%) were used as test data for predictive ability of the ML model. Using a questionnaire, 10 cardiologists (group A) evaluated whether the test data (44 patients) included AF of more than a 1-year duration (phase 1). Next, the same questionnaire was performed again after providing the ML model's answer (phase 2). 
Subsequently, another 10 cardiologists (group B) were shown the test results of group A, were made aware of the limitations of their own diagnostic abilities, and were then administered the same 2-stage test as group A.</p><p><strong>Results: </strong>The prediction results with the ML model using the test data provided 81.8% accuracy (72% sensitivity and 89% specificity). The mean percentage of correct answers in group A was 63.9% (SD 9.6%) for phase 1 and improved to 71.6% (SD 9.3%) for phase 2 (P=.01). The mean percentage of correct answers in group B was 59.8% (SD 5.3%) for phase 1 and improved to 68.2% (SD 5.9%) for phase 2 (P=.007). The mean percentage of answers that differed from the ML model's prediction for phase 2 (percentage of answers where cardiologists did not trust the ML model and believed their own determination) was 17.3% (SD 10.3%) in group A and 20.9% (SD 5%) in group B and was not significantly different (P=.85).</p><p><strong>Conclusions: </strong>ML models predicting AF duration improved the diagnostic ability of cardiologists. However, cardiologists did not entirely rely on the ML model's prediction, even if they were aware of their diagnostic capability limitations.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e63795"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11624443/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142693920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah Soyeon Oh, Bada Kang, Dahye Hong, Jennifer Ivy Kim, Hyewon Jeong, Jinyeop Song, Minkyu Jeon
Background: Mild cognitive impairment (MCI) poses significant challenges in early diagnosis and timely intervention. Underdiagnosis, coupled with the economic and social burden of dementia, necessitates more precise detection methods. Machine learning (ML) algorithms show promise in managing complex data for MCI and dementia prediction.
Objective: This study assessed the predictive accuracy of ML models in identifying the onset of MCI and dementia using the Korean Longitudinal Study of Aging (KLoSA) dataset.
Methods: This study used data from the KLoSA, a comprehensive biennial survey that tracks the demographic, health, and socioeconomic aspects of middle-aged and older Korean adults from 2018 to 2020. Among the 6171 initial households, 4975 eligible older adult participants aged 60 years or older were selected after excluding individuals based on age and missing data. The identification of MCI and dementia relied on self-reported diagnoses, with sociodemographic and health-related variables serving as key covariates. The dataset was categorized into training and test sets to predict MCI and dementia by using multiple models, including logistic regression, light gradient-boosting machine, XGBoost (extreme gradient boosting), CatBoost, random forest, gradient boosting, AdaBoost, support vector classifier, and k-nearest neighbors, and the training and test sets were used to evaluate predictive performance. The performance was assessed using the area under the receiver operating characteristic curve (AUC). Class imbalances were addressed via weights. Shapley additive explanation values were used to determine the contribution of each feature to the prediction rate.
Results: Among the 4975 participants, the best model for predicting MCI onset was random forest, with a median AUC of 0.6729 (IQR 0.3883-0.8152), followed by k-nearest neighbors with a median AUC of 0.5576 (IQR 0.4555-0.6761) and support vector classifier with a median AUC of 0.5067 (IQR 0.3755-0.6389). For dementia onset prediction, the best model was XGBoost, achieving a median AUC of 0.8185 (IQR 0.8085-0.8285), closely followed by light gradient-boosting machine with a median AUC of 0.8069 (IQR 0.7969-0.8169) and AdaBoost with a median AUC of 0.8007 (IQR 0.7907-0.8107). The Shapley values highlighted pain in everyday life, being widowed, living alone, exercising, and living with a partner as the strongest predictors of MCI. For dementia, the most predictive features were other contributing factors, education at the high school level, education at the middle school level, exercising, and monthly social engagement.
Conclusions: ML algorithms, especially XGBoost, exhibited the potential for predicting MCI onset using KLoSA data. However, no model has demonstrated robust accuracy in predicting MCI and dementia. Sociodemographic and health-related factors are crucial for initiatin
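Two ideas from the Methods above, class weighting and the AUC, can be made concrete with a short sketch. The abstract says only that class imbalances "were addressed via weights", so the weighting formula here is an assumption: it is the common "balanced" heuristic, n_samples / (n_classes × class_count), which is also what scikit-learn uses for `class_weight="balanced"`. The labels and scores are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 8 negatives, 2 positives.
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.1, 0.6, 0.3, 0.5, 0.7]

print(balanced_class_weights(y))  # {0: 0.625, 1: 2.5}
print(roc_auc(y, scores))         # 0.9375
```

The weights would be passed to the model at training time so that errors on the rare class cost more. The AUC itself is computed from ranks and is unaffected by how many cases each class has, which is one reason it is a common metric for imbalanced outcomes. An AUC near 0.5, as reported for some of the MCI models, means the ranking is close to chance.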
{"title":"A Multivariable Prediction Model for Mild Cognitive Impairment and Dementia: Algorithm Development and Validation.","authors":"Sarah Soyeon Oh, Bada Kang, Dahye Hong, Jennifer Ivy Kim, Hyewon Jeong, Jinyeop Song, Minkyu Jeon","doi":"10.2196/59396","DOIUrl":"10.2196/59396","url":null,"abstract":"<p><strong>Background: </strong>Mild cognitive impairment (MCI) poses significant challenges in early diagnosis and timely intervention. Underdiagnosis, coupled with the economic and social burden of dementia, necessitates more precise detection methods. Machine learning (ML) algorithms show promise in managing complex data for MCI and dementia prediction.</p><p><strong>Objective: </strong>This study assessed the predictive accuracy of ML models in identifying the onset of MCI and dementia using the Korean Longitudinal Study of Aging (KLoSA) dataset.</p><p><strong>Methods: </strong>This study used data from the KLoSA, a comprehensive biennial survey that tracks the demographic, health, and socioeconomic aspects of middle-aged and older Korean adults from 2018 to 2020. Among the 6171 initial households, 4975 eligible older adult participants aged 60 years or older were selected after excluding individuals based on age and missing data. The identification of MCI and dementia relied on self-reported diagnoses, with sociodemographic and health-related variables serving as key covariates. The dataset was categorized into training and test sets to predict MCI and dementia by using multiple models, including logistic regression, light gradient-boosting machine, XGBoost (extreme gradient boosting), CatBoost, random forest, gradient boosting, AdaBoost, support vector classifier, and k-nearest neighbors, and the training and test sets were used to evaluate predictive performance. The performance was assessed using the area under the receiver operating characteristic curve (AUC). Class imbalances were addressed via weights. 
Shapley additive explanation values were used to determine the contribution of each feature to the prediction rate.</p><p><strong>Results: </strong>Among the 4975 participants, the best model for predicting MCI onset was random forest, with a median AUC of 0.6729 (IQR 0.3883-0.8152), followed by k-nearest neighbors with a median AUC of 0.5576 (IQR 0.4555-0.6761) and support vector classifier with a median AUC of 0.5067 (IQR 0.3755-0.6389). For dementia onset prediction, the best model was XGBoost, achieving a median AUC of 0.8185 (IQR 0.8085-0.8285), closely followed by light gradient-boosting machine with a median AUC of 0.8069 (IQR 0.7969-0.8169) and AdaBoost with a median AUC of 0.8007 (IQR 0.7907-0.8107). The Shapley values highlighted pain in everyday life, being widowed, living alone, exercising, and living with a partner as the strongest predictors of MCI. For dementia, the most predictive features were other contributing factors, education at the high school level, education at the middle school level, exercising, and monthly social engagement.</p><p><strong>Conclusions: </strong>ML algorithms, especially XGBoost, exhibited the potential for predicting MCI onset using KLoSA data. However, no model has demonstrated robust accuracy in predicting MCI and dementia. 
Sociodemographic and health-related factors are crucial for initiatin","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e59396"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11624448/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142693918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Clinical named entity recognition (CNER) is a fundamental task in natural language processing used to extract named entities from electronic medical record texts. In recent years, with the continuous development of machine learning, deep learning models have replaced traditional machine learning and template-based methods, becoming widely applied in the CNER field. However, due to the complexity of clinical texts, the diversity and large quantity of named entity types, and the unclear boundaries between different entities, existing advanced methods rely to some extent on annotated databases and the scale of embedded dictionaries.
Objective: This study aims to address the issues of data scarcity and labeling difficulties in CNER tasks by proposing a dataset augmentation algorithm based on proximity word calculation.
Methods: We propose a Segmentation Synonym Sentence Synthesis (SSSS) algorithm based on neighboring vocabulary, which leverages existing public knowledge without the need for manual expansion of specialized domain dictionaries. Through lexical segmentation, the algorithm replaces new synonymous vocabulary by recombining from vast natural language data, achieving nearby expansion expressions of the dataset. We applied the SSSS algorithm to the Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach (RoBERTa) + conditional random field (CRF) and RoBERTa + Bidirectional Long Short-Term Memory (BiLSTM) + CRF models and evaluated our models (SSSS + RoBERTa + CRF; SSSS + RoBERTa + BiLSTM + CRF) on the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2017 and 2019 datasets.
Results: Our experiments demonstrated that the models SSSS + RoBERTa + CRF and SSSS + RoBERTa + BiLSTM + CRF achieved F1-scores of 91.30% and 91.35% on the CCKS-2017 dataset, respectively. They also achieved F1-scores of 83.21% and 83.01% on the CCKS-2019 dataset, respectively.
Conclusions: The experimental results indicated that our proposed method successfully expanded the dataset and remarkably improved the performance of the model, effectively addressing the challenges of data acquisition, annotation difficulties, and insufficient model generalization performance.
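The general idea of synonym-based sentence synthesis can be illustrated with a loose sketch. It is not the SSSS algorithm itself, whose details the abstract does not give. The sketch assumes that sentences are already segmented into tokens with BIO labels, that only non-entity tokens (label `O`) are replaced so the labels stay aligned, and that a neighbor-word table is available. The `NEIGHBORS` table, the function name, and the English example are hypothetical; the study worked on Chinese clinical text.

```python
from itertools import product

# Hypothetical neighbor-word table; in practice this might come from
# word-embedding similarity computed on a large general corpus.
NEIGHBORS = {
    "severe": ["intense"],
    "reports": ["describes", "notes"],
}


def synthesize(tokens, labels, neighbors, limit=10):
    """Return up to `limit` new (tokens, labels) pairs made by substituting
    neighbor words for tokens labelled 'O'. Entity tokens are left untouched."""
    options = []
    for tok, lab in zip(tokens, labels):
        if lab == "O" and tok in neighbors:
            options.append([tok] + neighbors[tok])
        else:
            options.append([tok])

    augmented = []
    for combo in product(*options):
        candidate = list(combo)
        if candidate == list(tokens):
            continue  # skip the unchanged original sentence
        augmented.append((candidate, list(labels)))
        if len(augmented) >= limit:
            break
    return augmented


tokens = ["patient", "reports", "severe", "chest", "pain"]
labels = ["O", "O", "O", "B-SYMPTOM", "I-SYMPTOM"]

for new_tokens, new_labels in synthesize(tokens, labels, NEIGHBORS):
    print(" ".join(new_tokens), new_labels)
```

Because every substitution keeps the token count and positions fixed, the original labels can be copied to each synthetic sentence without re-annotation, which is the property that makes this kind of augmentation attractive when labeled clinical data are scarce.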
{"title":"Chinese Clinical Named Entity Recognition With Segmentation Synonym Sentence Synthesis Mechanism: Algorithm Development and Validation.","authors":"Jian Tang, Zikun Huang, Hongzhen Xu, Hao Zhang, Hailing Huang, Minqiong Tang, Pengsheng Luo, Dong Qin","doi":"10.2196/60334","DOIUrl":"10.2196/60334","url":null,"abstract":"<p><strong>Background: </strong>Clinical named entity recognition (CNER) is a fundamental task in natural language processing used to extract named entities from electronic medical record texts. In recent years, with the continuous development of machine learning, deep learning models have replaced traditional machine learning and template-based methods, becoming widely applied in the CNER field. However, due to the complexity of clinical texts, the diversity and large quantity of named entity types, and the unclear boundaries between different entities, existing advanced methods rely to some extent on annotated databases and the scale of embedded dictionaries.</p><p><strong>Objective: </strong>This study aims to address the issues of data scarcity and labeling difficulties in CNER tasks by proposing a dataset augmentation algorithm based on proximity word calculation.</p><p><strong>Methods: </strong>We propose a Segmentation Synonym Sentence Synthesis (SSSS) algorithm based on neighboring vocabulary, which leverages existing public knowledge without the need for manual expansion of specialized domain dictionaries. Through lexical segmentation, the algorithm replaces new synonymous vocabulary by recombining from vast natural language data, achieving nearby expansion expressions of the dataset. 
We applied the SSSS algorithm to the Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach (RoBERTa) + conditional random field (CRF) and RoBERTa + Bidirectional Long Short-Term Memory (BiLSTM) + CRF models and evaluated our models (SSSS + RoBERTa + CRF; SSSS + RoBERTa + BiLSTM + CRF) on the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2017 and 2019 datasets.</p><p><strong>Results: </strong>Our experiments demonstrated that the models SSSS + RoBERTa + CRF and SSSS + RoBERTa + BiLSTM + CRF achieved F1-scores of 91.30% and 91.35% on the CCKS-2017 dataset, respectively. They also achieved F1-scores of 83.21% and 83.01% on the CCKS-2019 dataset, respectively.</p><p><strong>Conclusions: </strong>The experimental results indicated that our proposed method successfully expanded the dataset and remarkably improved the performance of the model, effectively addressing the challenges of data acquisition, annotation difficulties, and insufficient model generalization performance.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e60334"},"PeriodicalIF":3.1,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11612518/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Janna Nadav, Anu-Marja Kaihlanen, Sari Kujala, Ilmo Keskimäki, Johanna Viitanen, Samuel Salovaara, Petra Saukkonen, Jukka Vänskä, Tuulikki Vehko, Tarja Heponiemi
Background: The integration of information systems in health care and social welfare organizations has brought significant changes in patient and client care. This integration is expected to offer numerous benefits, but implementing health information systems and client information systems can also add stress because of the increased time and effort required of professionals.
Objective: This study aimed to examine whether professional groups and the factors that contribute to successful implementation (participation in information systems development and satisfaction with software providers' development work) are associated with the well-being of health care and social welfare professionals.
Methods: Data were obtained from 3 national cross-sectional surveys (n=9240), which were carried out among Finnish health care and social welfare professionals (registered nurses, physicians, and social welfare professionals) in 2020-2021. Self-rated stress and stress related to information systems were used as indicators of well-being. Analyses were conducted using linear and logistic regression.
Results: Registered nurses were more likely to experience self-rated stress than physicians (odds ratio [OR] -0.47; P>.001) and social welfare professionals (OR -0.68; P<.001). They also had a higher likelihood of stress related to information systems than physicians (b=-.11; P<.001). Stress related to information systems was less prevalent among professionals who did not participate in information systems development work (b=-.14; P<.001). Higher satisfaction with software providers' development work was associated with a lower likelihood of self-rated stress (OR -0.23; P<.001) and stress related to information systems (b=-.36; P<.001). When comparing the professional groups, we found that physicians who were satisfied with software providers' development work had a significantly lower likelihood of stress related to information systems (b=-.12; P<.001) compared with registered nurses and social welfare professionals.
Conclusions: Organizations can enhance the well-being of professionals and improve the successful implementation of information systems by actively soliciting and incorporating professional feedback, dedicating time for information systems development, fostering collaboration with software providers, and addressing the unique needs of different professional groups.
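Logistic regression produces coefficients on the log-odds scale, and an odds ratio is the exponential of such a coefficient. The sketch below shows only that conversion. Treating the negative values reported above as log-odds coefficients is an assumption made for illustration, since the abstract labels them as ORs; the function name is hypothetical.

```python
import math


def coefficient_to_odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)


# Values taken from the abstract, interpreted here (as an assumption)
# as log-odds coefficients rather than odds ratios.
for value in (-0.47, -0.68, -0.23):
    print(value, round(coefficient_to_odds_ratio(value), 3))
```

Under this reading, a coefficient below 0 corresponds to an odds ratio below 1, that is, lower odds in the comparison group than in the reference group.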
{"title":"Factors Contributing to Successful Information System Implementation and Employee Well-Being in Health Care and Social Welfare Professionals: Comparative Cross-Sectional Study.","authors":"Janna Nadav, Anu-Marja Kaihlanen, Sari Kujala, Ilmo Keskimäki, Johanna Viitanen, Samuel Salovaara, Petra Saukkonen, Jukka Vänskä, Tuulikki Vehko, Tarja Heponiemi","doi":"10.2196/52817","DOIUrl":"10.2196/52817","url":null,"abstract":"<p><strong>Background: </strong>The integration of information systems in health care and social welfare organizations has brought significant changes in patient and client care. This integration is expected to offer numerous benefits, but simultaneously the implementation of health information systems and client information systems can also introduce added stress due to the increased time and effort required by professionals.</p><p><strong>Objective: </strong>This study aimed to examine whether professional groups and the factors that contribute to successful implementation (participation in information systems development and satisfaction with software providers' development work) are associated with the well-being of health care and social welfare professionals.</p><p><strong>Methods: </strong>Data were obtained from 3 national cross-sectional surveys (n=9240), which were carried out among Finnish health care and social welfare professionals (registered nurses, physicians, and social welfare professionals) in 2020-2021. Self-rated stress and stress related to information systems were used as indicators of well-being. Analyses were conducted using linear and logistic regression analysis.</p><p><strong>Results: </strong>Registered nurses were more likely to experience self-rated stress than physicians (odds ratio [OR] -0.47; P>.001) and social welfare professionals (OR -0.68; P<.001). They also had a higher likelihood of stress related to information systems than physicians (b=-.11; P<.001). 
Stress related to information systems was less prevalent among professionals who did not participate in information systems development work (b=-.14; P<.001). Higher satisfaction with software providers' development work was associated with a lower likelihood of self-rated stress (OR -0.23; P<.001) and stress related to information systems (b=-.36 P<.001). When comparing the professional groups, we found that physicians who were satisfied with software providers' development work had a significantly lower likelihood of stress related to information systems (b=-.12; P<.001) compared with registered nurses and social welfare professionals.</p><p><strong>Conclusions: </strong>Organizations can enhance the well-being of professionals and improve the successful implementation of information systems by actively soliciting and incorporating professional feedback, dedicating time for information systems development, fostering collaboration with software providers, and addressing the unique needs of different professional groups.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e52817"},"PeriodicalIF":3.1,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604090/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142683733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Lv, Yi-Hong Gong, Jun An, Qian Wang, Jing Han, Xiu-Hua Wang, Xiao-Feng Chen
Background: Diagnosis-related group (DRG) payment has become the main method of settling medical expenses, and its application is increasingly widespread.
Objective: This study aimed to explore the correlation between DRG weights and nursing time and to develop a predictive model for nursing time in the cardiology department based on DRG weights and other factors.
Methods: Convenience sampling was used to select patients who were hospitalised in the cardiology ward of our hospital between April 2023 and April 2024 as the study participants. Nursing time was measured as direct and indirect nursing time. The distribution of nursing time was examined by demographic characteristics; Pearson correlation was used to analyse the relationship between DRG weights and nursing time, and multiple linear regression was used to analyse the factors influencing total nursing time.
Results: A total of 103 subjects were included in this study. The DRG weights were positively correlated with ln(direct nursing time), ln(indirect nursing time) and ln(total nursing time) (r = 0.480, r = 0.394, r = 0.448, all P < .001). Moreover, age was positively correlated with the three nursing times (r = 0.235, r = 0.192, r = 0.235, all P < .001); activities of daily living (ADL) on admission was negatively correlated with the three nursing times (r = -0.316, r = -0.252, r = -0.301, all P < .001); and nursing level on the first day of admission was positively correlated with the three nursing times (r = 0.333, r = 0.332, r = 0.352, all P < .001). Furthermore, the multivariate analysis found that nursing levels on the first day of admission, complications or comorbidities, DRG weights and ADL on admission were the influencing factors of the nursing time of patients (R² = 0.328, F = 69.58, P < .001).
Conclusions: Diagnosis-related group weights showed a strong correlation with nursing time and can be used to predict nursing time, which may assist in nursing resource allocation in cardiology departments.
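The analysis above correlates DRG weights with log-transformed nursing times. A minimal sketch of that calculation follows, assuming natural-log transformation before computing Pearson's r. The data values are made up for illustration and are not from the study; `pearson_r` is a hypothetical helper written out so the formula is visible.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up example values (not study data).
drg_weights = [0.6, 0.9, 1.2, 1.8, 2.5]
total_nursing_minutes = [120.0, 150.0, 210.0, 260.0, 400.0]

# Log-transform nursing time, as in ln(total nursing time) above.
log_minutes = [math.log(m) for m in total_nursing_minutes]
r = pearson_r(drg_weights, log_minutes)
print(round(r, 3))
```

Log-transforming a right-skewed quantity such as time is a common way to bring it closer to the linear relationship that Pearson's r measures.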
{"title":"Correlation between Diagnosis-related Group Weights and Nursing Time in the Cardiology Department: A Cross-sectional Study.","authors":"Chen Lv, Yi-Hong Gong, Jun An, Qian Wang, Jing Han, Xiu-Hua Wang, Xiao-Feng Chen","doi":"10.2196/65549","DOIUrl":"https://doi.org/10.2196/65549","url":null,"abstract":"<p><strong>Background: </strong>Diagnosis-related group (DRG) payment has become the main way of medical expenses settlement, and its application is more and more extensive.</p><p><strong>Objective: </strong>This study aimed to explore the correlation between DRG weights and nursing time and to develop a predictive model for nursing time in the cardiology department based on DRG weights and other factors.</p><p><strong>Methods: </strong>The convenience sampling method was used to select patients who were hospitalised in the cardiology ward of our hospital between April 2023 and April 2024 as the study participants. Nursing time was measured by direct and indirect nursing time. For the distribution of nursing time with different demographic characteristics, Pearson correlation was used to analyse the relationship between DRG weights and nursing time and multiple linear regression was used to analyse the influencing factors of total nursing time.</p><p><strong>Results: </strong>A total of 103 subjects were included in this study. The DRG weights were positively correlated with ln(direct nursing time), ln(indirect nursing time) and ln(total nursing time) (r = 0.480, r = 0.394, r = 0.448, all P < .001). Moreover, age was positively correlated with the three nursing times (r = 0.235, r = 0.192, r = 0.235, all P < .001); activities of daily living (ADL) on admission was negatively correlated with the three nursing times (r = -0.316, r = -0.252, r = -0.301, all P < .001); and nursing level on the first day of admission was positively correlated with the three nursing times (r = 0.333, r = 0.332, r = 0.352, all P < .001). 
Furthermore, the multivariate analysis found that nursing levels on the first day of admission, complications or comorbidities, DRG weights and ADL on admission were the influencing factors of the nursing time of patients (R2 = 0.328, F = 69.58, P < .001).</p><p><strong>Conclusions: </strong>Diagnosis-related group weights showed a strong correlation with nursing time and can be used to predict nursing time, which may assist in nursing resource allocation in cardiology departments.</p><p><strong>Clinicaltrial: </strong></p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":" ","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142683740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chung-Chun Lee, Seunghee Lee, Mi-Hwa Song, Jong-Yeup Kim, Suehyun Lee
Background: Social networking services (SNS) closely reflect the lives of individuals in modern society and generate large amounts of data. Previous studies have extracted drug information using relevant SNS data. In particular, it is important to detect adverse drug reactions (ADRs) early using drug surveillance systems. To this end, various deep learning methods have been used to analyze data in multiple languages in addition to English.
Objective: A cautionary drug that can cause ADRs in older patients was selected, and Korean SNS data containing this drug information were collected. Based on this information, we aimed to develop a deep learning model that classifies drug ADR posts based on a recurrent neural network.
Methods: Based on previous studies, ketoprofen was selected as the target drug because it is frequently prescribed and was therefore the drug mentioned most often in posts collected from SNS data. Blog posts, café posts, and NAVER Q&A posts from 2005 to 2020 were collected from NAVER, a portal site containing drug-related information, and natural language processing techniques were applied to analyze data written in Korean. Posts containing highly relevant drug name and ADR word pairs were filtered through association analysis, and training data were generated through manual labeling. Using the training data, a word2vec embedding layer was built, and a Bidirectional Long Short-Term Memory (Bi-LSTM) classification model was generated. We then compared its area under the curve with that of other machine learning models. In addition, the entire process was further verified using the nonsteroidal anti-inflammatory drug aceclofenac.
Results: Among the nonsteroidal anti-inflammatory drugs, Korean SNS posts containing information on ketoprofen and aceclofenac were secured, and the generic name lexicon, ADR lexicon, and Korean stop word lexicon were generated. In addition, to improve the accuracy of the classification model, an embedding layer was created considering the association between the drug name and the ADR word. In the ADR post classification test, ketoprofen and aceclofenac achieved 85% and 80% accuracy, respectively.
Conclusions: Here, we propose a process for developing a model for classifying ADR posts using SNS data. After analyzing drug name-ADR patterns, we filtered high-quality data by extracting posts that included known ADR words identified in the analysis. Based on these data, we developed a model that classifies ADR posts. This confirmed that a model that can leverage social data to monitor ADRs automatically is feasible.
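The filtering step described above keeps posts in which a drug name and an ADR word are strongly associated. The abstract does not say which association measure was used, so the sketch below assumes lift, a standard association-rule measure, purely for illustration. The posts, lexicon, threshold, and function names are all hypothetical.

```python
def lift(posts, drug, adr):
    """Lift of the (drug, adr) pair over a collection of tokenized posts.

    Each post is a set of tokens. Lift > 1 means the two words co-occur
    more often than expected if they were independent.
    """
    n = len(posts)
    with_drug = sum(1 for post in posts if drug in post)
    with_adr = sum(1 for post in posts if adr in post)
    with_both = sum(1 for post in posts if drug in post and adr in post)
    if n == 0 or with_drug == 0 or with_adr == 0:
        return 0.0
    return (with_both / n) / ((with_drug / n) * (with_adr / n))


def filter_posts(posts, drug, adr_lexicon, min_lift):
    """Keep posts mentioning the drug plus an ADR word whose lift meets the threshold."""
    strong_adrs = {adr for adr in adr_lexicon if lift(posts, drug, adr) >= min_lift}
    return [post for post in posts if drug in post and post & strong_adrs]


# Made-up tokenized posts (English stand-ins for Korean SNS text).
posts = [
    {"ketoprofen", "rash"},
    {"ketoprofen", "rash", "patch"},
    {"ketoprofen", "price"},
    {"aspirin", "headache"},
    {"weather", "headache"},
]
adr_lexicon = {"rash", "headache"}  # hypothetical ADR lexicon

candidates = filter_posts(posts, "ketoprofen", adr_lexicon, min_lift=1.0)
print(len(candidates))
```

In the workflow described above, posts selected this way would then be labeled manually before being used to train the classifier.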
{"title":"Bidirectional Long Short-Term Memory-Based Detection of Adverse Drug Reaction Posts Using Korean Social Networking Services Data: Deep Learning Approaches.","authors":"Chung-Chun Lee, Seunghee Lee, Mi-Hwa Song, Jong-Yeup Kim, Suehyun Lee","doi":"10.2196/45289","DOIUrl":"10.2196/45289","url":null,"abstract":"<p><strong>Background: </strong>Social networking services (SNS) closely reflect the lives of individuals in modern society and generate large amounts of data. Previous studies have extracted drug information using relevant SNS data. In particular, it is important to detect adverse drug reactions (ADRs) early using drug surveillance systems. To this end, various deep learning methods have been used to analyze data in multiple languages in addition to English.</p><p><strong>Objective: </strong>A cautionary drug that can cause ADRs in older patients was selected, and Korean SNS data containing this drug information were collected. Based on this information, we aimed to develop a deep learning model that classifies drug ADR posts based on a recurrent neural network.</p><p><strong>Methods: </strong>In previous studies, ketoprofen, which has a high prescription frequency and, thus, was referred to the most in posts secured from SNS data, was selected as the target drug. Blog posts, café posts, and NAVER Q&A posts from 2005 to 2020 were collected from NAVER, a portal site containing drug-related information, and natural language processing techniques were applied to analyze data written in Korean. Posts containing highly relevant drug names and ADR word pairs were filtered through association analysis, and training data were generated through manual labeling tasks. Using the training data, an embedded layer of word2vec was formed, and a Bidirectional Long Short-Term Memory (Bi-LSTM) classification model was generated. Then, we evaluated the area under the curve with other machine learning models. 
In addition, the entire process was further verified using the nonsteroidal anti-inflammatory drug aceclofenac.</p><p><strong>Results: </strong>Among the nonsteroidal anti-inflammatory drugs, Korean SNS posts containing information on ketoprofen and aceclofenac were secured, and the generic name lexicon, ADR lexicon, and Korean stop word lexicon were generated. In addition, to improve the accuracy of the classification model, an embedding layer was created considering the association between the drug name and the ADR word. In the ADR post classification test, ketoprofen and aceclofenac achieved 85% and 80% accuracy, respectively.</p><p><strong>Conclusions: </strong>Here, we propose a process for developing a model for classifying ADR posts using SNS data. After analyzing drug name-ADR patterns, we filtered high-quality data by extracting posts, including known ADR words based on the analysis. Based on these data, we developed a model that classifies ADR posts. This confirmed that a model that can leverage social data to monitor ADRs automatically is feasible.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e45289"},"PeriodicalIF":3.1,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601139/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142683724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}