Kevin Krause,Sharon Davis,Zhijun Yin,Katherine Schafer,Trent Rosenbloom,Colin Walsh
{"title":"利用时态临床笔记特征增强自杀风险预测模型。","authors":"Kevin Krause,Sharon Davis,Zhijun Yin,Katherine Schafer,Trent Rosenbloom,Colin Walsh","doi":"10.1055/a-2411-5796","DOIUrl":null,"url":null,"abstract":"OBJECTIVE\r\nThe objective of this study was to investigate the impact of enhancing a structured-data-based suicide attempt risk prediction model with temporal Concept Unique Identifiers (CUIs) derived from clinical notes. We aimed to examine how different temporal schemes, model types, and prediction ranges influenced the model's predictive performance. This research sought to improve our understanding of how the integration of temporal information and clinical variable transformation could enhance model predictions.\r\n\r\nMATERIALS AND METHODS\r\nWe identified modeling targets using diagnostic codes for suicide attempts within 30, 90, or 365 days following a temporally grouped visit cluster. Structured data included medications, diagnoses, procedures, and demographics, while unstructured data consisted of terms extracted with regular expressions from clinical notes. We compared models trained only on structured data (controls) to hybrid models trained on both structured and unstructured data. We used two temporalization schemes for clinical notes: fixed 90-day windows and flexible epochs. We trained and assessed random forests and hybrid LSTM neural networks using AUPRC and AUROC, with additional evaluation of sensitivity and PPV at 95% specificity.\r\n\r\nRESULTS\r\nThe training set included 2,364,183 visit clusters with 2,009 30-day suicide attempts, and the testing set contained 471,936 visit clusters with 480 suicide attempts. Models trained with temporal CUIs outperformed those trained with only structured data. The window-temporalized LSTM model achieved the highest AUPRC (0.056 ± 0.013) for the 30-day prediction range. Hybrid models generally showed better performance compared to controls across most metrics.\r\n\r\nDISCUSSION AND CONCLUSION\r\nThis study demonstrated that incorporating EHR-derived clinical note features enhanced suicide attempt risk prediction models, particularly with window-temporalized LSTM models. Our results underscored the critical value of unstructured data in suicidality prediction, aligning with previous findings. Future research should focus on integrating more sophisticated methods to continue improving prediction accuracy, which will enhance the effectiveness of future intervention.","PeriodicalId":48956,"journal":{"name":"Applied Clinical Informatics","volume":"4 1","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Suicide Risk Prediction Models with Temporal Clinical Note Features.\",\"authors\":\"Kevin Krause,Sharon Davis,Zhijun Yin,Katherine Schafer,Trent Rosenbloom,Colin Walsh\",\"doi\":\"10.1055/a-2411-5796\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"OBJECTIVE\\r\\nThe objective of this study was to investigate the impact of enhancing a structured-data-based suicide attempt risk prediction model with temporal Concept Unique Identifiers (CUIs) derived from clinical notes. We aimed to examine how different temporal schemes, model types, and prediction ranges influenced the model's predictive performance. This research sought to improve our understanding of how the integration of temporal information and clinical variable transformation could enhance model predictions.\\r\\n\\r\\nMATERIALS AND METHODS\\r\\nWe identified modeling targets using diagnostic codes for suicide attempts within 30, 90, or 365 days following a temporally grouped visit cluster. Structured data included medications, diagnoses, procedures, and demographics, while unstructured data consisted of terms extracted with regular expressions from clinical notes. We compared models trained only on structured data (controls) to hybrid models trained on both structured and unstructured data. We used two temporalization schemes for clinical notes: fixed 90-day windows and flexible epochs. We trained and assessed random forests and hybrid LSTM neural networks using AUPRC and AUROC, with additional evaluation of sensitivity and PPV at 95% specificity.\\r\\n\\r\\nRESULTS\\r\\nThe training set included 2,364,183 visit clusters with 2,009 30-day suicide attempts, and the testing set contained 471,936 visit clusters with 480 suicide attempts. Models trained with temporal CUIs outperformed those trained with only structured data. The window-temporalized LSTM model achieved the highest AUPRC (0.056 ± 0.013) for the 30-day prediction range. Hybrid models generally showed better performance compared to controls across most metrics.\\r\\n\\r\\nDISCUSSION AND CONCLUSION\\r\\nThis study demonstrated that incorporating EHR-derived clinical note features enhanced suicide attempt risk prediction models, particularly with window-temporalized LSTM models. Our results underscored the critical value of unstructured data in suicidality prediction, aligning with previous findings. Future research should focus on integrating more sophisticated methods to continue improving prediction accuracy, which will enhance the effectiveness of future intervention.\",\"PeriodicalId\":48956,\"journal\":{\"name\":\"Applied Clinical Informatics\",\"volume\":\"4 1\",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2024-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Clinical Informatics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1055/a-2411-5796\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"MEDICAL INFORMATICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Clinical Informatics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1055/a-2411-5796","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
Enhancing Suicide Risk Prediction Models with Temporal Clinical Note Features.
OBJECTIVE
The objective of this study was to investigate the impact of enhancing a structured-data-based suicide attempt risk prediction model with temporal Concept Unique Identifiers (CUIs) derived from clinical notes. We aimed to examine how different temporal schemes, model types, and prediction ranges influenced the model's predictive performance. This research sought to improve our understanding of how the integration of temporal information and clinical variable transformation could enhance model predictions.
MATERIALS AND METHODS
We identified modeling targets using diagnostic codes for suicide attempts within 30, 90, or 365 days following a temporally grouped visit cluster. Structured data included medications, diagnoses, procedures, and demographics, while unstructured data consisted of terms extracted with regular expressions from clinical notes. We compared models trained only on structured data (controls) to hybrid models trained on both structured and unstructured data. We used two temporalization schemes for clinical notes: fixed 90-day windows and flexible epochs. We trained and assessed random forests and hybrid LSTM neural networks using AUPRC and AUROC, with additional evaluation of sensitivity and PPV at 95% specificity.
RESULTS
The training set included 2,364,183 visit clusters with 2,009 30-day suicide attempts, and the testing set contained 471,936 visit clusters with 480 suicide attempts. Models trained with temporal CUIs outperformed those trained with only structured data. The window-temporalized LSTM model achieved the highest AUPRC (0.056 ± 0.013) for the 30-day prediction range. Hybrid models generally showed better performance compared to controls across most metrics.
DISCUSSION AND CONCLUSION
This study demonstrated that incorporating EHR-derived clinical note features enhanced suicide attempt risk prediction models, particularly with window-temporalized LSTM models. Our results underscored the critical value of unstructured data in suicidality prediction, aligning with previous findings. Future research should focus on integrating more sophisticated methods to continue improving prediction accuracy, which will enhance the effectiveness of future intervention.
期刊介绍:
ACI is the third Schattauer journal dealing with biomedical and health informatics. It perfectly complements our other journals Öffnet internen Link im aktuellen FensterMethods of Information in Medicine and the Öffnet internen Link im aktuellen FensterYearbook of Medical Informatics. The Yearbook of Medical Informatics being the “Milestone” or state-of-the-art journal and Methods of Information in Medicine being the “Science and Research” journal of IMIA, ACI intends to be the “Practical” journal of IMIA.