Emily Nguyen, Zijun Cui, Georgia Kokaraki, Joseph Carlson, Yan Liu
Ovarian cancer, a potentially life-threatening disease, is often difficult to treat. There is a critical need for innovations that can assist in improved therapy selection. Although deep learning models are showing promising results, they are often employed as "black boxes" and require enormous amounts of data. Therefore, we explore transferable and interpretable prediction of treatment effectiveness for ovarian cancer patients. Unlike existing work that focuses on histopathology images alone, we propose a multimodal deep learning framework that takes into account not only large histopathology images but also clinical variables to increase the scope of the data. The results demonstrate that the proposed models achieve high prediction accuracy and interpretability, and can also be transferred to other cancer datasets without significant loss of performance.
{"title":"Transferable and Interpretable Treatment Effectiveness Prediction for Ovarian Cancer via Multimodal Deep Learning.","authors":"Emily Nguyen, Zijun Cui, Georgia Kokaraki, Joseph Carlson, Yan Liu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Ovarian cancer, a potentially life-threatening disease, is often difficult to treat. There is a critical need for innovations that can assist in improved therapy selection. Although deep learning models are showing promising results, they are employed as a \"black-box\" and require enormous amounts of data. Therefore, we explore the transferable and interpretable prediction of treatment effectiveness for ovarian cancer patients. Unlike existing works focusing on histopathology images, we propose a multimodal deep learning framework which takes into account not only large histopathology images, but also clinical variables to increase the scope of the data. The results demonstrate that the proposed models achieve high prediction accuracy and interpretability, and can also be transferred to other cancer datasets without significant loss of performance.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"550-558"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785847/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139465616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer caregivers are often informal family members who may not be prepared to adequately meet the needs of patients and often experience high stress along with significant physical, emotional, and financial burdens. Accurate prediction of caregivers' burden level is highly valuable for early intervention and support. In this study, we used several machine learning approaches to build prediction models from the National Alliance for Caregiving/AARP dataset. We performed data cleansing and imputation on the raw data to produce a working dataset of cancer caregivers. A series of feature selection methods was then used to identify predictive risk factors for burden level. Using supervised machine learning classifiers, we achieved reasonably good prediction performance (accuracy ∼ 0.94; AUC ∼ 0.97; F1 ∼ 0.93). We identified a small set of 15 features that are strong predictors of burden and can be used to build clinical decision support systems.
{"title":"Understanding Cancer Caregiving and Predicting Burden: An Analytics and Machine Learning Approach.","authors":"Armin Abazari, Samir Chatterjee, Md Moniruzzaman","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Cancer caregivers are often informal family members who may not be prepared to adequately meet the needs of patients and often experience high stress along with significant physical, emotional, and financial burdens. Accurate prediction of caregiver's burden level is highly valuable for early intervention and support. In this study, we used several machine learning approaches to build prediction models from the National Alliance for Caregiving/AARP dataset. We performed data cleansing and imputation on the raw data to give us a working dataset of cancer caregivers. Then a series of feature selection methods were used to identify predictive risk factors for burden level. Using supervised machine learning classifiers, we achieved reasonably good prediction performance (Accuracy ∼ 0.94; AUC ∼ 0.97; F1∼ 0.93). We identify a small set of 15 features that are strong predictors of burden and can be used to build Clinical Decision Support Systems.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"243-252"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785947/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139465713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anastasios Lamproudis, Therese Olsen Svenning, Torbjørn Torsvik, Taridzo Chomutare, Andrius Budrionis, Phuong Dinh Ngo, Thomas Vakili, Hercules Dalianis
With the recent advances in natural language processing and deep learning, the development of tools that can assist medical coders in ICD-10 diagnosis coding and increase their efficiency in coding discharge summaries is significantly more viable than before. To that end, one important component in the development of these models is the datasets used to train them. In this study, such datasets are presented, and it is shown that one of them can be used to develop a BERT-based language model that can consistently perform well in assigning ICD-10 codes to discharge summaries written in Swedish. Most importantly, it can be used in a coding support setup where a tool can recommend potential codes to the coders. This reduces the range of potential codes to consider and, in turn, reduces the workload of the coder. Moreover, the de-identified and pseudonymised dataset is open to use for academic users.
{"title":"Using a Large Open Clinical Corpus for Improved ICD-10 Diagnosis Coding.","authors":"Anastasios Lamproudis, Therese Olsen Svenning, Torbjørn Torsvik, Taridzo Chomutare, Andrius Budrionis, Phuong Dinh Ngo, Thomas Vakili, Hercules Dalianis","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>With the recent advances in natural language processing and deep learning, the development of tools that can assist medical coders in ICD-10 diagnosis coding and increase their efficiency in coding discharge summaries is significantly more viable than before. To that end, one important component in the development of these models is the datasets used to train them. In this study, such datasets are presented, and it is shown that one of them can be used to develop a BERT-based language model that can consistently perform well in assigning ICD-10 codes to discharge summaries written in Swedish. Most importantly, it can be used in a coding support setup where a tool can recommend potential codes to the coders. This reduces the range of potential codes to consider and, in turn, reduces the workload of the coder. Moreover, the de-identified and pseudonymised dataset is open to use for academic users.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"465-473"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785868/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139466093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bradley Carlson, Michael Watkins, Mei Li, Brian Furner, Ellen Cohen, Samuel L Volchenboum
The Pediatric Cancer Data Commons (PCDC) comprises an international community whose ironclad commitment to data sharing is combatting pediatric cancer in an unprecedented way. The byproduct of their data sharing efforts is a gold-standard consensus data model covering many types of pediatric cancer. This article describes an effort to utilize SSSOM, an emerging specification for semantically-rich data mappings, to provide a "hub and spoke" model of mappings from several common data models (CDMs) to the PCDC data model. This provides important contributions to the research community, including: 1) a clear view of the current coverage of these CDMs in the domain of pediatric oncology, and 2) a demonstration of creating standardized mappings. These mappings can allow downstream crosswalk for data transformation and enhance data sharing. This can guide those who currently create and maintain brittle ad hoc data mappings in order to utilize the growing volume of viable research data.
{"title":"Using A Standardized Nomenclature to Semantically Map Oncology-Related Concepts from Common Data Models to a Pediatric Cancer Data Model.","authors":"Bradley Carlson, Michael Watkins, Mei Li, Brian Furner, Ellen Cohen, Samuel L Volchenboum","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The Pediatric Cancer Data Commons (PCDC) comprises an international community whose ironclad commitment to data sharing is combatting pediatric cancer in an unprecedented way. The byproduct of their data sharing efforts is a gold-standard consensus data model covering many types of pediatric cancer. This article describes an effort to utilize SSSOM, an emerging specification for semantically-rich data mappings, to provide a \"hub and spoke\" model of mappings from several common data models (CDMs) to the PCDC data model. This provides important contributions to the research community, including: 1) a clear view of the current coverage of these CDMs in the domain of pediatric oncology, and 2) a demonstration of creating standardized mappings. These mappings can allow downstream crosswalk for data transformation and enhance data sharing. This can guide those who currently create and maintain brittle ad hoc data mappings in order to utilize the growing volume of viable research data.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"874-883"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785885/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139466094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Translating prediction models into practice and supporting clinicians' decision-making demand demonstration of clinical value. Existing approaches to evaluating machine learning models emphasize discriminatory power, which is only a part of the medical decision problem. We propose the Applicability Area (ApAr), a decision-analytic, utility-based approach to evaluating predictive models that communicates the range of prior probabilities and test cutoffs for which the model has positive utility; larger ApArs suggest a broader potential use of the model. We assess ApAr with simulated datasets and with three published medical datasets. ApAr adds value beyond the typical area under the receiver operating characteristic curve (AUROC) metric analysis. As an example, in the diabetes dataset, the top model by ApAr was ranked as the 23rd best model by AUROC. Decision makers looking to adopt and implement models can leverage ApArs to assess whether the local range of priors and utilities is within the respective ApArs.
{"title":"Applicability Area: A novel utility-based approach for evaluating predictive models, beyond discrimination.","authors":"Star Liu, Shixiong Wei, Harold P Lehmann","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Translating prediction models into practice and supporting clinicians' decision-making demand demonstration of clinical value. Existing approaches to evaluating machine learning models emphasize discriminatory power, which is only a part of the medical decision problem. We propose the Applicability Area (ApAr), a decision-analytic utility-based approach to evaluating predictive models that communicate the range of prior probability and test cutoffs for which the model has positive utility; larger ApArs suggest a broader potential use of the model. We assess ApAr with simulated datasets and with three published medical datasets. ApAr adds value beyond the typical area under the receiver operating characteristic curve (AUROC) metric analysis. As an example, in the diabetes dataset, the top model by ApAr was ranked as the 23<sup>rd</sup> best model by AUROC. Decision makers looking to adopt and implement models can leverage ApArs to assess if the local range of priors and utilities is within the respective ApArs.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"494-503"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785877/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alberto Purpura, Joao Bettencourt-Silva, Natasha Mulligan, Tesfaye Yadete, Kingsley Njoku, Julia Liu, Thaddeus Stappenbeck
Biomedical ontologies are a key component in many systems for the analysis of textual clinical data. They are employed to organize information about a certain domain relying on a hierarchy of different classes. Each class maps a concept to items in a terminology developed by domain experts. These mappings are then leveraged to organize the information extracted by Natural Language Processing (NLP) models to build knowledge graphs for inferences. The creation of these associations, however, requires extensive manual review. In this paper, we present an automated approach and repeatable framework to learn a mapping between ontology classes and terminology terms derived from vocabularies in the Unified Medical Language System (UMLS) metathesaurus. According to our evaluation, the proposed system achieves a performance close to humans and provides a substantial improvement over existing systems developed by the National Library of Medicine to assist researchers through this process.
{"title":"Automatic Mapping of Terminology Items with Transformers.","authors":"Alberto Purpura, Joao Bettencourt-Silva, Natasha Mulligan, Tesfaye Yadete, Kingsley Njoku, Julia Liu, Thaddeus Stappenbeck","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Biomedical ontologies are a key component in many systems for the analysis of textual clinical data. They are employed to organize information about a certain domain relying on a hierarchy of different classes. Each class maps a concept to items in a terminology developed by domain experts. These mappings are then leveraged to organize the information extracted by Natural Language Processing (NLP) models to build knowledge graphs for inferences. The creation of these associations, however, requires extensive manual review. In this paper, we present an automated approach and repeatable framework to learn a mapping between ontology classes and terminology terms derived from vocabularies in the Unified Medical Language System (UMLS) metathesaurus. According to our evaluation, the proposed system achieves a performance close to humans and provides a substantial improvement over existing systems developed by the National Library of Medicine to assist researchers through this process.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"599-607"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785948/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
William R Kearns, Jessica Bertram, Myra Divina, Lauren Kemp, Yinzhou Wang, Alex Marin, Trevor Cohen, Weichao Yuwen
Despite the high prevalence and burden of mental health conditions, there is a global shortage of mental health providers. Artificial Intelligence (AI) methods have been proposed as a way to address this shortage, by supporting providers with less extensive training as they deliver care. To this end, we developed the AI-Assisted Provider Platform (A2P2), a text-based virtual therapy interface that includes a response suggestion feature, which supports providers in delivering protocolized therapies empathetically. We studied providers with and without expertise in mental health treatment delivering a therapy session using the platform with (intervention) and without (control) AI-assistance features. Upon evaluation, the AI-assisted system significantly decreased response times by 29.34% (p=0.002), tripled empathic response accuracy (p=0.0001), and increased goal recommendation accuracy by 66.67% (p=0.001) across both user groups compared to the control. Both groups rated the system as having excellent usability.
{"title":"Bridging the Skills Gap: Evaluating an AI-Assisted Provider Platform to Support Care Providers with Empathetic Delivery of Protocolized Therapy.","authors":"William R Kearns, Jessica Bertram, Myra Divina, Lauren Kemp, Yinzhou Wang, Alex Marin, Trevor Cohen, Weichao Yuwen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Despite the high prevalence and burden of mental health conditions, there is a global shortage of mental health providers. Artificial Intelligence (AI) methods have been proposed as a way to address this shortage, by supporting providers with less extensive training as they deliver care. To this end, we developed the AI-Assisted Provider Platform (A2P2), a text-based virtual therapy interface that includes a response suggestion feature, which supports providers in delivering protocolized therapies empathetically. We studied providers with and without expertise in mental health treatment delivering a therapy session using the platform with (intervention) and without (control) AI-assistance features. Upon evaluation, the AI-assisted system significantly decreased response times by 29.34% (p=0.002), tripled empathic response accuracy (p=0.0001), and increased goal recommendation accuracy by 66.67% (p=0.001) across both user groups compared to the control. Both groups rated the system as having excellent usability.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"436-445"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785887/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michal K Grzeszczyk, Paulina Adamczyk, Sylwia Marek, Ryszard Pręcikowski, Maciej Kuś, M Patrycja Lelujko, Rosmary Blanco, Tomasz Trzciński, Arkadiusz Sitek, Maciej Malawski, Aneta Lisowska
The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications; however, this can be overwhelming and cause disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. The data from 11 participants are used to train a machine learning model to detect CL. Subsequently, we create two versions of surveys: a gamified and a traditional one. We estimate the CL experienced by a second group of 13 participants while they complete the surveys. We find that CL detector performance can be enhanced via pre-training on stress detection tasks. For 10 out of 13 participants, a personalized CL detector can achieve an F1 score above 0.7. We find no difference between the gamified and non-gamified surveys in terms of CL, but participants prefer the gamified version.
{"title":"Can gamification reduce the burden of self-reporting in mHealth applications? A feasibility study using machine learning from smartwatch data to estimate cognitive load.","authors":"Michal K Grzeszczyk, Paulina Adamczyk, Sylwia Marek, Ryszard Pręcikowski, Maciej Kuś, M Patrycja Lelujko, Rosmary Blanco, Tomasz Trzciński, Arkadiusz Sitek, Maciej Malawski, Aneta Lisowska","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications, however, it can be overwhelming and causes disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. The data from 11 participants is utilized to train a machine learning model to detect CL. Subsequently, we create two versions of surveys: a gamified and a traditional one. We estimate the CL experienced by other participants (13) while completing surveys. We find that CL detector performance can be enhanced via pre-training on stress detection tasks. For 10 out of 13 participants, a personalized CL detector can achieve an F1 score above 0.7. We find no difference between the gamified and non-gamified surveys in terms of CL but participants prefer the gamified version.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"389-396"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785949/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The diversity of patient information recorded in electronic medical records presents a challenge for converting it into fixed-length vectors that align with clinical characteristics. To address this issue, this study aimed to use an unsupervised graph representation learning method to transform unstructured inpatient information from electronic medical records into fixed-length vectors. InfoGraph, one of the unsupervised graph representation learning algorithms, was applied to the graphed inpatient information, resulting in embedded vectors of fixed length. The embedded vectors were then evaluated to determine whether the clinical information was preserved in them. The results indicated that the embedded representation contained information that could predict readmission within 30 days, demonstrating the feasibility of using unsupervised graph representation learning to transform patient information into fixed-length vectors that retain clinical characteristics.
{"title":"Clinical Feature Vector Generation using Unsupervised Graph Representation Learning from Heterogeneous Medical Records.","authors":"Tomohisa Seki, Yoshimasa Kawazoe, Kazuhiko Ohe","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The diversity of patient information recorded on electronic medical records generally, presents a challenge for converting it into fixed-length vectors that align with clinical characteristics. To address this issue, this study aimed to utilize an unsupervised graph representation learning method to transform the unstructured inpatient information from electronic medical records into a fixed-length vector. Infograph, one of the unsupervised graph representation learning algorithms was applied to the graphed inpatient information, resulting in embedded vectors of fixed length. The embedded vectors were then evaluated for whether the clinical information was preserved in it. The results indicated that the embedded representation contained information that could predict readmission within 30 days, demonstrating the feasibility of using unsupervised graph representation learning to transform patient information into fixed-length vectors that retain clinical characteristics.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"618-623"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah Pungitore, Toluwanimi Olorunnisola, Jarrod Mosier, Vignesh Subbian
Post-acute sequelae of SARS-CoV-2 (PASC) is an increasingly recognized yet incompletely understood public health concern. Several studies have examined various ways to phenotype PASC to better characterize this heterogeneous condition. However, many gaps in PASC phenotyping research exist, including a lack of the following: 1) standardized definitions for PASC based on symptomatology; 2) generalizable and reproducible phenotyping heuristics and meta-heuristics; and 3) phenotypes based on both COVID-19 severity and symptom duration. In this study, we defined computable phenotypes (or heuristics) and meta-heuristics for PASC phenotypes based on COVID-19 severity and symptom duration. We also developed a symptom profile for PASC based on a common data standard. We identified four phenotypes based on COVID-19 severity (mild vs. moderate/severe) and duration of PASC symptoms (subacute vs. chronic). The symptom groups with the highest frequency among phenotypes were cardiovascular and neuropsychiatric, with each phenotype characterized by a different set of symptoms.
{"title":"Computable Phenotypes for Post-acute sequelae of SARS-CoV-2: A National COVID Cohort Collaborative Analysis.","authors":"Sarah Pungitore, Toluwanimi Olorunnisola, Jarrod Mosier, Vignesh Subbian","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Post-acute sequelae of SARS-CoV-2 (PASC) is an increasingly recognized yet incompletely understood public health concern. Several studies have examined various ways to phenotype PASC to better characterize this heterogeneous condition. However, many gaps in PASC phenotyping research exist, including a lack of the following: 1) standardized definitions for PASC based on symptomatology; 2) generalizable and reproducible phenotyping heuristics and meta-heuristics; and 3) phenotypes based on both COVID-19 severity and symptom duration. In this study, we defined computable phenotypes (or heuristics) and meta-heuristics for PASC phenotypes based on COVID-19 severity and symptom duration. We also developed a symptom profile for PASC based on a common data standard. We identified four phenotypes based on COVID-19 severity (mild vs. moderate/severe) and duration of PASC symptoms (subacute vs. chronic). The symptoms groups with the highest frequency among phenotypes were cardiovascular and neuropsychiatric with each phenotype characterized by a different set of symptoms.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2023 ","pages":"589-598"},"PeriodicalIF":0.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785914/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139467399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}