
JMIR Medical Informatics: Latest Articles

CT Radiomics-Based Machine Learning Model for Predicting Capsular and Neural Invasion in Thyroid Carcinoma: Diagnostic Accuracy Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-03-12 | DOI: 10.2196/77349
Fang-Fang Cong, Ke Tian, Qian Gao, Fulin Wang, Peng Sun, Nan Xu
Background: Thyroid carcinoma is the most prevalent endocrine malignancy, with a worldwide increasing incidence. Capsular invasion and neural invasion (NI) are pivotal prognostic factors for recurrence and survival; however, their preoperative noninvasive assessment remains challenging.

Objective: We aimed to identify computed tomography (CT) radiomic biomarkers associated with capsular invasion in thyroid carcinoma, construct machine learning models integrating radiomic and clinical data, and evaluate their utility for NI risk stratification.

Methods: In this retrospective cohort, 111 patients with thyroid carcinoma were divided into capsular invasion-positive (n=63) and capsular invasion-negative (n=48) groups, with 37 (33.3%) cases presenting concurrent NI. Radiomic features were extracted from arterial and venous phase CT images at original resolution, including 111 gray-level co-occurrence matrix features. Nine key radiomic features (A1-A9) were selected via least absolute shrinkage and selection operator regression (λ=0.017). To preserve the physical meaning of texture features (eg, spatial correlation and contrast reflecting tumor microstructural heterogeneity), no resampling or scaling was performed on the regions of interest during radiomic feature extraction. Nomogram models and random forest (RF) models were constructed based on clinical indicators (galectin-3, etc) and radiomic features, respectively. Additionally, a neural network (NN) model integrating multimodal data was developed. Model stability was verified using 5-fold cross-validation and 1000-time bootstrap resampling, while performance was evaluated via receiver operating characteristic curves, calibration curves, and decision curve analysis.

Results: Model performance analysis revealed that among the nomogram models, the clinical indicator-based nomogram achieved an internally estimated area under the curve (AUC) of 0.9418 (95% CI 0.892-0.976) in the capsular invasion prediction task. The radiomic-based nomogram had an internally estimated AUC of 0.9334 (95% CI 0.881-0.968) in the capsular invasion prediction task and 0.8001 (95% CI 0.663-0.898) in the cross-label association analysis task. In RF models, clinical indicator-based and radiomic-based RFs exhibited an AUC of 0.7646 (95% CI 0.651-0.857) and 0.8102 (95% CI 0.703-0.892) in the cross-label association analysis task, respectively. The NN model performed promisingly, with an AUC of 0.775 (95% CI 0.621-0.903) in the cross-label association analysis task and a mean absolute error of <0.05 on the calibration curve.

Conclusions: Capsular invasion is a strong predictor of NI risk in thyroid carcinoma. Radiomic models based solely on preoperative CT images show potential for preoperative NI risk stratification. Models incorporating clinical parameters (obtained from postoperative tissue), including the integrated
Conclusions (continued; translated from the Chinese abstract): Models incorporating clinical parameters (obtained from postoperative tissue), including the integrated multimodal model, are more accurately characterized as postoperative risk stratification tools. The NN model integrating raw CT images with clinical data achieved an AUC of 0.775 (95% CI 0.621-0.903), underscoring the potential of such multimodal analysis to capture complex relationships between imaging phenotypes and tissue-level biomarkers for enhanced postoperative assessment. The radiomic component of the framework points toward the development of purely image-based preoperative assessment tools.
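The abstract reports AUCs with 95% CIs and mentions 1000-time bootstrap resampling, but it does not say how the intervals were derived. As a rough illustration only, the sketch below computes a percentile bootstrap interval for AUC in plain Python. The helper names (`auc`, `bootstrap_auc_ci`), the percentile method, and the synthetic scores are all assumptions; only the group sizes (63 and 48) and the number of resamples (1000) come from the abstract.

```python
import random


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC; cases are resampled with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # AUC is undefined unless both classes are present
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(labels, scores), lo, hi


# Synthetic data only: 63 positives and 48 negatives mirror the group sizes
# in the abstract, but the scores are random draws, not study data.
rng = random.Random(42)
labels = [1] * 63 + [0] * 48
scores = [rng.gauss(1.0, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]
point, lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC={point:.3f}, 95% CI {lo:.3f}-{hi:.3f}")
```

With only 111 cases, intervals of this kind tend to be wide, which is in line with the spread of the intervals reported for the cross-label task.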
JMIR Medical Informatics, vol 14, e77349. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12981638/pdf/
Citations: 0
Application of a Large Visual Language Model on Tongue Image Description Generation and Physical Constitution Reasoning in Traditional Chinese Medicine (TongueVLM): Model Development and Validation Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-03-12 | DOI: 10.2196/87237
Chengdong Peng, Jun Gao, Nuo Yang, Yong Wang, Renming Chen, Changwu Dong

Background: In traditional Chinese medicine (TCM), identifying a patient's physical constitution from tongue images involves collecting clinical information, reasoning, and combining tongue image features with inquiry (questioning the patient). Simulating how TCM practitioners recognize pathological information in tongue images and conduct professional dialogue based on tongue image features would help develop an intelligent interactive system for TCM diagnosis.

Objective: This study aimed to develop and validate a TCM domain-specific (vertical) model capable of understanding and reasoning about tongue images.

Methods: The TongueVLM multimodal large model comprises a visual encoder module, a modal fusion module, and a language decoder module. First, a visual encoder based on the pretrained CLIP-ViT (Contrastive Language-Image Pre-Training With Vision Transformer) model performs image patching, dimensionality reduction, and transfer learning, mapping high-dimensional tongue features into low-dimensional language encoding vectors. Next, a modal fusion module with a residual architecture maps visual features into the natural language word embedding space, aligning visual encodings with TCM terminology. Finally, visual instruction fine-tuning is performed on LLaMA (Large Language Model Meta AI), yielding a TCM-domain large language model with 7B parameters.
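The abstract does not give the internals of the modal fusion module, so the following is only a minimal sketch of what "a residual architecture that maps visual features into the word embedding space" could look like: a linear projection to the embedding width followed by a small two-layer block added back through a skip connection. The function names, layer shapes, and toy dimensions are hypothetical and are not taken from the TongueVLM paper.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def residual_projector(visual, W_in, W1, W2):
    """Map one visual patch feature into the word-embedding space.

    h   = W_in @ visual            (linear projection to the embedding width)
    out = h + W2 @ relu(W1 @ h)    (residual refinement; the skip path keeps h)
    """
    h = matvec(W_in, visual)
    delta = matvec(W2, relu(matvec(W1, h)))
    return [a + b for a, b in zip(h, delta)]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy sizes; real visual features and language-model embeddings are far wider.
VISUAL_DIM, EMBED_DIM, HIDDEN_DIM = 8, 4, 6
rng = random.Random(0)
W_in = rand_matrix(EMBED_DIM, VISUAL_DIM, rng)
W1 = rand_matrix(HIDDEN_DIM, EMBED_DIM, rng)
W2 = rand_matrix(EMBED_DIM, HIDDEN_DIM, rng)

patch = [rng.uniform(-1, 1) for _ in range(VISUAL_DIM)]
token = residual_projector(patch, W_in, W1, W2)
print(len(token))  # one pseudo-token with the embedding width
```

The point of the skip connection in this sketch is that the projected feature passes through unchanged when the refinement block contributes nothing, so the block only has to learn a correction on top of the linear mapping.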

Results: The constructed multimodal dataset includes 3 test datasets, and experiments used 3000 samples from each. TongueVLM outperformed general-purpose large models on all 3 tasks. On the multimodal test dataset, it achieved accuracies of 79.8%, 78.6%, and 60.7% on the 3 evaluation tasks, respectively; these were 9.1%, 8.4%, and 1.1% higher than LLaVA-OneVision and 7.5%, 7%, and 5.9% higher than Qwen2.5-VL-7B. Text generation speed was about 24 tokens per second.

Conclusions: The TongueVLM model generates tongue image descriptions and performs physical constitution reasoning in TCM, making it suitable for use in an intelligent TCM diagnosis system.

JMIR Medical Informatics, vol 14, e87237. Journal Article. https://doi.org/10.2196/87237
Citations: 0
Enhancing Predictive Accuracy of Mood Symptoms Using Wearable Devices and Machine Learning in Bipolar Disorder.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-03-12 | DOI: 10.2196/92172
P Keerthana Nair, Priyanka Renita D'Souza
JMIR Medical Informatics, vol 14, e92172. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12981629/pdf/
Citations: 0
Development of a Practical Nomogram for Depression Risk Stratification in Older Adults With Hypertension and Diabetes: Retrospective Analysis of Data From the China Health and Retirement Longitudinal Study.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-03-12 | DOI: 10.2196/81529
Ting Peng, Ying Zhang, Rujia Miao, Jiangang Wang
Background: Depression affects over 40% of middle-aged and older Chinese adults living with both hypertension and diabetes, amplifying cardiovascular risk, functional decline, and mortality. Existing screening instruments, such as the 10-item Center for Epidemiologic Studies Depression Scale, focus narrowly on mood symptoms and are rarely feasible in busy primary care consultations. They also omit routine functional, cognitive, and social data that may jointly drive depressive states in cardiometabolic populations.

Objective: This study aimed to develop and validate a concise, clinically actionable nomogram that quantifies individual depression risk using readily available information in Chinese adults aged ≥45 years who have diagnosed hypertension and type 2 diabetes.

Methods: We analyzed anonymized wave 5 China Health and Retirement Longitudinal Study data collected between July 2020 and August 2020. Of 1504 eligible participants, 635 (42.2%) met the Center for Epidemiologic Studies Depression Scale cutoff score of >10 for probable depression. A total of 42 candidate predictors spanning demographics, laboratory values, comorbidities, functional status, and socioenvironmental factors were screened. Least absolute shrinkage and selection operator regression with 10-fold cross-validation identified the most parsimonious set. A multivariable logistic model was built on a 70% training set (n=1052) and evaluated on a 30% testing set (n=452). Performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration plots, decision curve analysis, and Shapley additive explanations for interpretability. Multiple imputation was used to handle <20% missingness.

Results: Nine nonredundant predictors entered the final nomogram: activity of daily living score, memory impairment, number of pain sites, sleep duration, life satisfaction score, self-rated health score, social activity engagement score, retirement status, and memory test score. The model achieved excellent discrimination (training AUC=0.819; testing AUC=0.825) and calibration (mean absolute error ≤0.018). Decision curves demonstrated positive net clinical benefit across clinically relevant threshold probabilities. Shapley additive explanations analysis revealed a 3-fold increase in depression odds per 1-point increase in activity of daily living score, whereas retirement conferred substantial protection (prevalence of depression: 103/635, 16.2% in the retired group vs 269/869, 31.0% in the nonretired group; P<.001), mediated by greater social participation.

Conclusions: The 9-item nomogram enables <3-minute depression risk stratification in resource-limited primary care settings for adults with hypertension and diabetes. Functional decline, affective-cognitive burden, and socioeconomic disengagement constitute the dominant causal pathway. Prospective tri
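The retirement comparison in the Results (103/635, 16.2% vs 269/869, 31.0%; P<.001) can be checked arithmetically. The sketch below recomputes the two proportions and applies a pooled two-proportion z test. The choice of test is an assumption, since the abstract does not name the one used, and `two_proportion_z` is a hypothetical helper. Note also that 635 and 869 sum to 1504 and that 635 is the number of participants with probable depression, so the denominators may correspond to the depression and nondepression groups rather than to retired and nonretired participants; the abstract alone does not settle this.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns (p1, p2, z, two-sided P value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return p1, p2, z, p_value


# Counts exactly as reported in the abstract.
p1, p2, z, p_value = two_proportion_z(103, 635, 269, 869)
print(f"{p1:.1%} vs {p2:.1%}, z={z:.2f}, P<.001: {p_value < 0.001}")

# The reported 70/30 split is also consistent with the cohort size.
assert 1052 + 452 == 1504
```

Under these assumptions the recomputed percentages match the reported ones and the difference is significant well beyond the stated threshold.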
JMIR Medical Informatics, vol 14, e81529. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12981542/pdf/
Citations: 0
Enhanced Prediction of Atrial Fibrillation in Patients With Ischemic Stroke Through Electronic Medical Records and Text Mining: Algorithm Development and Validation.
IF 3.8 | Zone 3 (Medicine) | Q2 Medical Informatics | Pub Date: 2026-03-10 | DOI: 10.2196/78117
Yu-Wei Chen, Sheng-Feng Sung, Ya-Han Hu, Yu-Hsuan Yang
Background: Stroke remains one of the leading causes of mortality and long-term disability worldwide. Atrial fibrillation (AF) is a major and often underdiagnosed risk factor for ischemic stroke as it is frequently asymptomatic and may remain undetected until a catastrophic cerebrovascular event occurs. The lack of timely identification and preventive treatment for AF substantially increases stroke risk. Although previous studies have proposed various predictive models for AF detection, many rely primarily on structured clinical variables and are developed using data from a single institution, which limits their generalizability and real-world applicability across different health care settings.

Objective: The objective of this study was to develop a robust and generalizable AF risk prediction model for patients with stroke using electronic medical records. By integrating structured clinical variables with features derived from unstructured clinical text, this study aimed to construct a more comprehensive representation of patient health status. Furthermore, this study emphasized systematic internal and external validation, along with calibration assessment, to evaluate model stability and generalizability across multiple hospital datasets, thereby supporting its potential use in routine clinical practice.

Methods: This study analyzed datasets from 2 hospitals in Taiwan: Landseed International Hospital (LIH), with 3988 patients, and Chia-Yi Christian Hospital (CYCH), with 5821 patients. We applied 5 feature engineering techniques to extract features from unstructured electronic medical record data, addressed data imbalance using 6 distinct resampling methods, and used 9 classification algorithms to compare model performance across both internal and external validation sets. This study identified the top 20 most important features from the best-performing models for both the LIH and CYCH datasets.

Results: The optimal predictive model for LIH was based solely on structured variables, whereas the model for CYCH achieved superior results by integrating structured variables with text-derived variables obtained from unstructured clinical notes using term frequency-inverse document frequency. Notably, feature importance analysis consistently identified the ratio of E- to A-wave velocities, left atrial size, and age as the top 3 predictive factors across both datasets, underscoring their critical role in AF risk assessment among patients with stroke.

Conclusions: This study demonstrated the development of predictive models for AF in patients with ischemic stroke. Notably, the integration of structured variables with variables derived from unstructured clinical text improved predictive performance in selected model configurations. Rigorous internal and external validation processes confirmed the superior performance of ensemble learning-based m
Conclusions (continued; translated from the Chinese abstract): Rigorous internal and external validation confirmed the superior performance of ensemble learning-based machine learning models compared with other algorithms, underscoring the potential of this approach for AF risk prediction.
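The Results describe combining structured variables with text-derived variables obtained via term frequency-inverse document frequency (TF-IDF). The sketch below shows one simple way such a combination could be assembled, using the plain weighting tf * log(N / df). It is an illustration under stated assumptions, not the study's pipeline: the note snippets, the `age` values, the `tfidf` helper, and the `tfidf_` feature prefix are all hypothetical, and many library implementations smooth the idf term.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical note snippets, written for illustration only (not study data).
notes = [
    "patient reports palpitations and irregular pulse",
    "patient denies palpitations",
    "patient has left atrial enlargement on echo",
]
structured = [{"age": 78}, {"age": 61}, {"age": 83}]  # hypothetical structured variables

# Merge structured variables and text-derived weights into one feature dict per patient.
features = [
    {**s, **{f"tfidf_{term}": weight for term, weight in vec.items()}}
    for s, vec in zip(structured, tfidf(notes))
]
print(sorted(features[0]))
```

A term that occurs in every note (here "patient") receives zero weight, while a term unique to one note is weighted most heavily, which is why TF-IDF features can add information that the structured variables do not carry.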
JMIR Medical Informatics, vol 14, e78117. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12975001/pdf/
Citations: 0
Artificial Intelligence Models for Predicting Triage in Emergency Departments: Seven-Month Retrospective Comparative Study of Natural Language Processing, Large Language Model, and Joint Embedding Predictive Architectures.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-03-10 DOI: 10.2196/83318
Edouard Lansiaux, Ramy Azzouz, Emmanuel Chazard, Amélie Vromant, Eric Wiel
<p><strong>Background: </strong>Triage errors in emergency departments (EDs), including undertriage and overtriage, pose significant risks to patient safety and resource allocation. With increasing patient volumes and staffing challenges, artificial intelligence (AI) integration into triage protocols has gained attention as a potential solution.</p><p><strong>Objective: </strong>This study aims to develop and compare 3 AI models (natural language processing [NLP], large language model [LLM], and Joint Embedding Predictive Architecture [JEPA]) for predicting triage outcomes according to the French Emergency Nurses Classification in Hospital (FRENCH) scale and to assess their performance relative to nurse triage and clinical expert consensus.</p><p><strong>Methods: </strong>We conducted a retrospective analysis of prospectively collected data from adult patients triaged at Roger Salengro Hospital ED (Lille, France) over 7 months (June-December 2024). Three AI models were developed: TRIAGEMASTER (NLP with Doc2Vec + MLP), URGENTIAPARSE (LLM with FlauBERT + Extreme Gradient Boosting [XGBoost]), and EMERGINET (JEPA with variance-invariance-covariance regularization). Of 73,236 ED visits, 657 (0.90%) had complete audio recordings and structured data. Data were split 80:20 into training and validation sets with stratification. Gold-standard labels were established by senior clinician consensus (minimum 5 years of ED experience). The primary outcome was concordance with the gold-standard FRENCH triage level, assessed using weighted κ, Spearman correlation, F1-score, area under the receiver operating characteristic curve (AUC-ROC), mean absolute error (MAE), and root mean square error (RMSE). 
Secondary analyses evaluated Groupes d'Etude Multicentrique des Services d'Accueil (GEMSA) prediction and performance by input data type.</p><p><strong>Results: </strong>URGENTIAPARSE demonstrated superior performance, with a composite z score of 2.514 compared with EMERGINET (0.438), TRIAGEMASTER (-3.511), and nurse triage (-4.343). URGENTIAPARSE achieved an F1-score of 0.900 (95% CI 0.876-0.924), an AUC-ROC of 0.879 (95% CI 0.851-0.907), a weighted κ of 0.800 (P<.001), a Spearman correlation of 0.802 (P<.001), an MAE of 0.228, and an RMSE of 0.790. Exact agreement was 90.0%, with near-agreement (+1 or -1 level) of 92.8%. However, training showed perfect accuracy (1.0) with poor validation performance (~0.5), indicating overfitting. EMERGINET achieved moderate performance (F1-score=0.731, AUC 0.686), while TRIAGEMASTER and nurse triage performed poorly (F1-score=0.618 and 0.303, respectively). For GEMSA prediction, URGENTIAPARSE maintained superiority (κ=0.863, Spearman=0.864, P<.001). Class 1 (highest acuity) was underrepresented (4/657, 0.61%), limiting undertriage risk assessment.</p><p><strong>Conclusions: </strong>The LLM-based architecture (URGENTIAPARSE) demonstrated the highest accuracy for ED triage prediction among the tested models, outperforming traditional
NLP, JEPA, and current nurse triage practice. However, severe overfitting, extreme selection bias (657/73,236, 0.90% included), a single-center design, and sparse high-acuity representation limit clinical applicability. Before deployment, the model requires regularization, external validation across diverse EDs, prospective testing, and comprehensive safety evaluation, particularly for triage detection. The integration of AI triage support systems shows promise but requires rigorous validation, bias mitigation, and transparent uncertainty quantification to ensure patient safety.</p>
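The agreement metrics named in the Methods (weighted κ, MAE, and RMSE) can be illustrated with a minimal Python sketch. It assumes triage levels coded as the integers 1 to 5 and quadratic disagreement weights; neither the coding nor the weighting scheme is stated in the abstract, and the toy labels are invented purely for illustration.

```python
from collections import Counter
from math import sqrt

# Assumption: ordinal triage levels coded 1..5 (placeholder coding, not the study's).
LEVELS = [1, 2, 3, 4, 5]


def weighted_kappa(y_true, y_pred, levels, power=2):
    """Cohen's weighted kappa for ordinal labels (power=1 linear, power=2 quadratic)."""
    index = {level: i for i, level in enumerate(levels)}
    k, n = len(levels), len(y_true)
    observed = Counter((index[t], index[p]) for t, p in zip(y_true, y_pred))
    true_counts = Counter(index[t] for t in y_true)
    pred_counts = Counter(index[p] for p in y_pred)
    disagree_observed = 0.0
    disagree_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            disagree_observed += weight * observed[(i, j)] / n
            disagree_expected += weight * true_counts[i] * pred_counts[j] / (n * n)
    # Undefined (division by zero) if both raters only ever use one identical level.
    return 1.0 - disagree_observed / disagree_expected


def mae(y_true, y_pred):
    """Mean absolute error between ordinal levels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between ordinal levels."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Invented toy labels: expert consensus versus model output.
gold = [1, 2, 3, 3, 4, 5, 2, 4]
pred = [1, 2, 3, 4, 4, 5, 3, 4]

print("weighted kappa:", weighted_kappa(gold, pred, LEVELS))
print("MAE:", mae(gold, pred))
print("RMSE:", rmse(gold, pred))
```

Quadratic weights penalize a two-level miss four times as heavily as a one-level miss, which is one common choice when large triage errors matter more than near misses.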
Citations: 0
Bridging Population Patterns and Individual Prediction: Framework for Prospective Multimorbidity Study.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-03-10 DOI: 10.2196/84261
Qianyao Zhang, Runtong Zhang, Weiguang Ma, Butian Zhao, Xiaomin Zhu

Background: Multimorbidity has become a major global public health challenge. However, existing research primarily emphasizes the identification of disease patterns at the population level and lacks the capacity to provide predictive insights into individual future pattern membership. Bridging this gap is crucial for personalized prevention and management.

Objective: This study aims to propose an innovative framework that integrates population-level multimorbidity pattern recognition with individual-level predictive modeling, thus advancing multimorbidity research from descriptive analysis to prospective multimorbidity pattern prediction.

Methods: Using longitudinal health follow-up data, we first applied latent transition analysis (LTA) to identify temporally stable multimorbidity patterns. These patterns were subsequently transformed into predictive labels to construct a novel deep learning model, CLA-Net (Cross-Lag Attention Network). CLA-Net is designed to predict individual future multimorbidity patterns by leveraging the complementary strengths of Gated Recurrent Units (GRU) and transformer architectures. It introduces a bitemporal directed cross-attention mechanism to simultaneously capture temporal dependencies and complex feature interactions. We compared CLA-Net against several advanced baselines and conducted ablation studies to validate its architectural components.

Results: In terms of pattern recognition, the LTA identified 5 clinically meaningful multimorbidity patterns: Cardiometabolic-Multisystem, Hypertension-Arthritis, Respiratory-Musculoskeletal, Metabolic Syndrome, and Gastritis-Arthritis. In terms of prediction, experimental results demonstrated that CLA-Net significantly outperformed all baseline models. CLA-Net achieved an accuracy of 0.8352 (SD 0.0048), a precision of 0.8326 (SD 0.0053), a recall of 0.8312 (SD 0.0056), and an F1-score of 0.8319 (SD 0.0051). Notably, it achieved an area under the curve of 0.9293, surpassing baseline models. Ablation studies confirmed the necessity of the dual-branch architecture and the directed cross-attention mechanism, as removing these components resulted in performance declines ranging from 0.93% to 2.50%.

Conclusions: This study extends the scope of LTA beyond descriptive statistical modeling and establishes the scientific value of multimorbidity pattern prediction as an independent research task. By bridging population-level insights with individual-level prediction, the proposed framework provides a data-driven tool for the prospective prediction of future multimorbidity pattern membership conditional on survival, thereby supporting stratified disease management and care planning, rather than general risk stratification for acute or end-stage deterioration. This offers new methodological and practical value for precision medicine and public health policymaking.
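The abstract does not specify how the directed cross-attention in CLA-Net is computed. As a rough illustration of the general idea only, the sketch below implements generic scaled dot-product cross-attention in plain Python, where feature vectors from a later time point attend over those from an earlier one. The names `later_wave` and `earlier_wave`, the absence of learned projections, and the toy numbers are all assumptions, not the authors' architecture.

```python
from math import exp, sqrt


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    Each query row attends over all key rows and returns a weighted
    average of the value rows. Returns (outputs, attention_weights).
    """
    scale = sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        width = len(values[0])
        outputs.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)])
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy features: two records at the later wave, three at the earlier wave.
later_wave = [[1.0, 0.0], [0.0, 1.0]]
earlier_wave = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# "Directed" here means one direction only: later-wave queries read earlier-wave keys/values.
attended, attention = cross_attention(later_wave, earlier_wave, earlier_wave)
for row in attention:
    print([round(w, 3) for w in row])
```

A trained model would apply learned query, key, and value projections and typically several heads; the sketch keeps only the attention arithmetic.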

Citations: 0
Development and Evaluation of SNOMED CT Automated Mapping Tool: Advancing Terminology Standardization and Semantic Interoperability.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-03-09 DOI: 10.2196/82670
Youngsun Park, Hannah Kang, Jiwon Kim, Soo-Yong Shin, Dosang Cho, Sang Youl Rhee, Hong Seok Park, Kyung-Jae Lee, Sungchul Bae

Background: Effective secondary use of healthcare data is hindered by fragmentation and a lack of semantic interoperability due to heterogeneous local terminologies. Standardizing clinical terms using SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) is essential but remains a manual, labor-intensive, and inconsistent process, especially across multiple institutions. Automated, scalable solutions are needed to support reliable mapping and new concept authoring for large-scale research.

Objective: We aimed to develop a large language model (LLM)-assisted tool that streamlines SNOMED CT terminology mapping and concept authoring, which enables seamless, standardized data integration across multi-institutional clinical datasets.

Methods: The mapping pipeline included preprocessing local terms, syntactic and LLM-based vector similarity mapping, and iterative enrichment based on validated results. Translation and semantic representation used GPT-4o (OpenAI). New concepts were authored through a structured postcoordination process, and both the efficiency and quality of authoring (including duplicate rate and Machine Readable Concept Model validation violations) were quantitatively evaluated. Performance was evaluated using diagnostic and surgical procedural terms from 4 major hospital networks (9 university hospitals) in South Korea, with additional usability feedback gathered from clinical terminologists.

Results: Using reference terms, top-5 accuracy for diagnostic mapping reached 98.7%, 89.7%, 98.5%, and 92.8% across the 4 institutions and 99.2%, 82.6%, 98.7%, and 84.7% for surgical procedural mapping. Implementation of the tool reduced manual mapping rates by 30% and overall manual workload by up to 90%. The proposed tool reduced average mapping and new concept creation time by approximately 75%, while decreasing the final mapping table processing time by 90%. New concept authoring errors also decreased, with duplicate concepts reduced by 83% and modeling rule violations by 72%.

Conclusions: This study developed and validated an automated, LLM-assisted SNOMED CT mapping tool that significantly improved efficiency, mapping accuracy, and new concept quality. Limitations include technical integration challenges and dependency on translation quality. Future directions involve leveraging SNOMED CT's ontology structure and knowledge graphs, enhancing sustainability through ongoing maintenance and quality assurance, and further advancing new concept authoring with automated Machine Readable Concept Model rule enforcement and inactivation processes to achieve robust and scalable terminology standardization.
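The vector similarity mapping and top-5 accuracy described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two-dimensional vectors stand in for LLM embeddings, the concept identifiers `C1` to `C3` are placeholders rather than real SNOMED CT codes, and cosine similarity is assumed as the similarity measure, which the abstract does not confirm.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_candidates(term_vector, concept_vectors, k=5):
    """Return the k concept ids most similar to the local term."""
    ranked = sorted(
        concept_vectors,
        key=lambda cid: cosine(term_vector, concept_vectors[cid]),
        reverse=True,
    )
    return ranked[:k]


def top_k_accuracy(term_vectors, reference, concept_vectors, k=5):
    """Share of local terms whose reference concept appears in the top-k list."""
    hits = sum(
        reference[term] in top_k_candidates(vector, concept_vectors, k)
        for term, vector in term_vectors.items()
    )
    return hits / len(term_vectors)


# Placeholder embeddings and ids (hypothetical, for illustration only).
concept_vectors = {"C1": [1.0, 0.0], "C2": [0.0, 1.0], "C3": [1.0, 1.0]}
term_vectors = {"term_a": [0.9, 0.1], "term_b": [0.2, 0.8]}
reference = {"term_a": "C1", "term_b": "C3"}

print(top_k_accuracy(term_vectors, reference, concept_vectors, k=1))
print(top_k_accuracy(term_vectors, reference, concept_vectors, k=2))
```

Reporting top-k rather than top-1 accuracy fits a workflow in which a terminologist picks from a short candidate list instead of accepting a single automatic match.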

Citations: 0
Technological Strategies for the Patient Experience in Emergency Departments: Scoping Review.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-03-09 DOI: 10.2196/79782
Andrés Santiago Santafé, Juan Manuel Aranda, Wilmer Jair Beltrán, William Javier Guerrero, Maryory Guevara Lozano, Ingrid Xiomara Bustos

Background: Technology has improved patient care in hospitals, enhancing the overall patient experience. However, digitalization raises questions on effectively integrating technological strategies to ensure assertive communication of information during emergency department (ED) journeys. Keeping patients well-informed boosts their service perception and satisfaction, a factor often neglected by institutions in EDs.

Objective: This paper analyzes relevant studies on technological strategies designed for EDs aimed at improving patient experience, focusing on communication and information access. We analyze the technologies, outcomes, impacts, and challenges of the strategies.

Methods: A scoping review was conducted using the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines and CADIMA tool. Searches were performed in Scopus, PubMed, IEEE Xplore, and CINAHL databases. Articles published from January 2018 to December 2024 were included. Quality appraisal was performed using the Crowe Critical Appraisal Tool version 1.4. Three reviewers independently examined the title and abstract for eligibility based on the inclusion and exclusion criteria.

Results: Sixteen eligible studies were included. Four technological strategy categories were identified: artificial intelligence-based, simulation-based, infrastructure and hardware technologies, and interfaces and information systems. Mobile and web applications were the main technologies adopted in the studies.

Conclusions: Technological strategies hold significant potential to enhance patient experiences in EDs by providing real-time updates on medical status and care progress. However, their effectiveness depends on usability, literacy, and system design. Existing literature highlights the impact and challenges of deploying and using these strategies in EDs. However, no studies have systematically evaluated long-term outcomes or cost-effectiveness across diverse ED settings.

Citations: 0
Developing a Thai User Interface Terminology for Systematized Nomenclature of Medicine Clinical Terms Implementation in Primary Care: Cross-Sectional Content Coverage Analysis.
IF 3.8 Zone 3 (Medicine) Q2 MEDICAL INFORMATICS Pub Date: 2026-03-09 DOI: 10.2196/80039
Nat Tangchitnob, Wanchana Ponthongmak, Boonchai Kijsanayotin, Oraluck Pattanaprateep, Sithakom Phusanti, Pongsakorn Atiksawedparit, Kamonporn Suwanthaweemeesuk, Jirayus Siangfu, Gareth J McKay, John Attia, Ammarin Thakkinstian

Background: Primary care in Thailand often uses mixed Thai-English free-text documentation for diagnoses and clinical problems, limiting standardization, interoperability, and secondary data use. Clinical terminologies like Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), a comprehensive reference terminology, can bridge this gap through the use of structured clinical data. Developing and mapping a local user interface terminology (UIT) is one of the key strategies for implementing SNOMED CT in real-world clinical settings.

Objective: This study aimed to develop a Thai UIT derived from frequently used terms in real-world primary care practice, map these terms to SNOMED CT concepts, and evaluate the extent of concept coverage.

Methods: Frequently used clinical terms were extracted from outpatient medical records from the family, emergency, and internal medicine departments using a customized tokenization method, N-gram analysis, and expert review. This process yielded 2054 Thai-specific terms. All terms were normalized and mapped to SNOMED CT through manual expert-driven and semiautomated tools. Unmapped terms were subsequently analyzed to identify mapping barriers and solutions.
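
The N-gram step described in the Methods can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the abstract does not specify the customized tokenizer, so whitespace splitting stands in for it here, and the function names, the `min_count` threshold, and the sample records are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-token tuple from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_terms(records, max_n=3, min_count=2):
    """Count all 1..max_n-grams across records; keep those seen at least min_count times."""
    counts = Counter()
    for text in records:
        # Placeholder tokenizer (assumption): the study used a customized
        # method suited to mixed Thai-English text, which is not described.
        tokens = text.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


# Hypothetical free-text entries, invented for illustration only.
records = [
    "acute low back pain",
    "chronic low back pain",
    "low back pain with sciatica",
]
candidates = frequent_terms(records)
```

In a workflow like the one the abstract outlines, a candidate list of this kind would then go to expert review before any mapping to SNOMED CT.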

Results: Of the 2054 Thai-specific terms, 2012 were successfully mapped, yielding 2041 (97.98%) mappings to SNOMED CT concepts: 1781 (85.50%) full, 123 (5.90%) broader, 56 (2.69%) narrower, and 81 (3.89%) inexact. The remaining 42 (2.02%) terms were unmapped. Most mapped terms were one-to-one (1984), while 28 terms were mapped to multiple concepts (57 mappings); together the mappings covered 1486 unique SNOMED CT concepts. The 42 unmapped terms mostly reflected culturally specific expressions or concepts not yet represented in SNOMED CT. These were categorized for potential postcoordination, exclusion, or national extension development.
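
The counts reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below rests on one assumption that the abstract does not state: the percentages appear to use 2083 as the denominator, that is, the 2041 mappings plus the 42 unmapped terms. The variable names are illustrative.

```python
# Counts taken from the Results paragraph.
terms_total = 2054
terms_mapped = 2012
terms_unmapped = 42
mappings = {"full": 1781, "broader": 123, "narrower": 56, "inexact": 81}
one_to_one_terms = 1984
multi_terms = 28      # terms mapped to more than one concept
multi_mappings = 57   # mappings contributed by those 28 terms

total_mappings = sum(mappings.values())

# Internal consistency of the reported counts.
assert terms_mapped + terms_unmapped == terms_total
assert one_to_one_terms + multi_terms == terms_mapped
assert one_to_one_terms + multi_mappings == total_mappings

# Assumed denominator (inferred, not stated in the abstract):
# all mappings plus the unmapped terms.
denominator = total_mappings + terms_unmapped

percentages = {k: round(100 * v / denominator, 2) for k, v in mappings.items()}
percentages["mapped"] = round(100 * total_mappings / denominator, 2)
percentages["unmapped"] = round(100 * terms_unmapped / denominator, 2)
print(percentages)
```

Under that assumption, the rounded values agree with every percentage quoted in the abstract.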

Conclusions: This study demonstrates the feasibility of developing a Thai UIT mapped to SNOMED CT and describes mapping challenges. The resulting UIT enhances semantic clarity in clinical documentation and supports better interoperability, clinical decision-making, and health data analytics within Thailand's health care system.

JMIR Medical Informatics, volume 14, e80039. Publication date: 2026-03-09. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12978892/pdf/
Citations: 0