Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00030
Ibna Kowsar, Shourav B Rabbani, Manar D Samad
The imputation of missing values (IMV) in electronic health records tabular data is crucial to enable machine learning for patient-specific predictive modeling. While IMV methods are developed in biostatistics and recently in machine learning, deep learning-based solutions have shown limited success in learning tabular data. This paper proposes a novel attention-based missing value imputation framework that learns to reconstruct data with missing values leveraging between-feature (self-attention) or between-sample attentions. We adopt data manipulation methods used in contrastive learning to improve the generalization of the trained imputation model. The proposed self-attention imputation method outperforms state-of-the-art statistical and machine learning-based (decision-tree) imputation methods, reducing the normalized root mean squared error by 18.4% to 74.7% on five tabular data sets and 52.6% to 82.6% on two electronic health records data sets. The proposed attention-based missing value imputation method shows superior performance across a wide range of missingness (10% to 50%) when the values are missing completely at random.
Title: Attention-based Imputation of Missing Values in Electronic Health Records Tabular Data. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 177-182. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11463999/pdf/
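The evaluation setup described in this abstract (values missing completely at random at rates of 10% to 50%, scored by normalized root mean squared error) can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, column-mean imputation stands in as a simple baseline rather than the attention model, and normalizing the RMSE by the standard deviation of the held-out true values is an assumption, since the abstract does not state which normalization is used.

```python
import numpy as np

def mask_mcar(X, missing_rate, rng):
    """Hide a fraction of entries completely at random (MCAR).

    Returns a float copy of X with NaN at the hidden positions, plus the boolean mask.
    """
    mask = rng.random(X.shape) < missing_rate
    X_missing = X.astype(float).copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def mean_impute(X_missing):
    """Baseline imputer (assumption, not the paper's method): fill NaN with column means."""
    col_means = np.nanmean(X_missing, axis=0)
    return np.where(np.isnan(X_missing), col_means, X_missing)

def nrmse(X_true, X_imputed, mask):
    """RMSE over the hidden entries, divided by the std of the true hidden values.

    The choice of normalizer is an assumption; other conventions (range, mean) exist.
    """
    diff = X_true[mask] - X_imputed[mask]
    rmse = np.sqrt(np.mean(diff ** 2))
    return rmse / np.std(X_true[mask])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))          # synthetic stand-in for a tabular data set
    for rate in (0.1, 0.3, 0.5):
        X_missing, mask = mask_mcar(X, rate, rng)
        print(rate, nrmse(X, mean_impute(X_missing), mask))
```

Any imputer that returns a completed matrix can be dropped in place of `mean_impute` and scored the same way.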
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00058
Qingqing Zhu, Xiuying Chen, Qiao Jin, Benjamin Hou, Tejas Sudharshan Mathai, Pritam Mukherjee, Xin Gao, Ronald M Summers, Zhiyong Lu
In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as Conventional Natural Language Generation (NLG) and Clinical Efficacy (CE), often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method combines the expertise of professional radiologists with Large Language Models (LLMs) such as GPT-3.5 and GPT-4. Using In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human and AI-generated reports. This is further enhanced by a regression model that aggregates sentence evaluation scores. Experimental results show that our "Detailed GPT-4 (5-shot)" model achieves a correlation of 0.48 with expert evaluations, outperforming the METEOR metric by 0.19, while our "Regressed GPT-4" model shows even greater alignment (0.64), exceeding the best existing metric by a margin of 0.35. Moreover, the robustness of our explanations has been validated through a thorough iterative strategy. We plan to publicly release annotations from radiology experts, setting a new standard for accuracy in future assessments. This underscores the potential of our approach in enhancing the quality assessment of AI-driven medical reports.
Title: Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for AI-generated Radiology Reports. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 402-411. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11651630/pdf/
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00020
Mattia Prosperi, Simone Marini, Christina Boucher
An extension of the longest common substring (LCS) problem between two texts is the enumeration of all LCSs given a minimum length k (ALCS-k), along with their positions in each text. In bioinformatics, an efficient solution to the ALCS-k problem for very long texts (genomes or metagenomes) can provide useful insights to discover genetic signatures responsible for biological mechanisms. The ALCS-k problem has two additional requirements compared to the LCS problem: one is the minimum length k, and the other is that all common strings longer than k must be reported. We present an efficient, two-stage ALCS-k algorithm exploiting the spectrum of text substrings of length k (k-mers). Our approach yields a worst-case time complexity loglinear in the number of k-mers for the first stage, and an average-case time complexity loglinear in the number of common k-mers for the second stage (several orders of magnitude smaller than the total k-mer spectrum). The space complexity is linear in the first phase (disk-based), and on average linear in the second phase (disk- and memory-based). Tests performed on genomes for different organisms (including viruses, bacteria and animal chromosomes) show that run times are consistent with our theoretical estimates; further, comparisons with MUMmer4 show an asymptotic advantage with divergent genomes.
Title: An average-case efficient two-stage algorithm for enumerating all longest common substrings of minimum length k between genome pairs. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 93-102. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11412151/pdf/
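The k-mer idea in this abstract can be made concrete with a small in-memory sketch. It is an illustration under stated assumptions, not the authors' algorithm: the paper describes a sort-based, partly disk-based two-stage procedure, whereas this version uses a hash index; the function name is hypothetical; and "common strings" is interpreted here as maximal exact matches of length at least k.

```python
from collections import defaultdict

def common_substrings_min_k(a, b, k):
    """Enumerate maximal common substrings of length >= k between texts a and b.

    Returns a sorted list of (start_in_a, start_in_b, length) tuples.
    """
    # Stage 1 (simplified): index every k-mer of `a` by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    # Stage 2 (simplified): scan k-mers of `b`, keep only shared ones, and extend.
    results = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # Skip seeds that can be extended to the left: the same match is
            # reported from its leftmost seed, so each match appears once.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            length = k
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            results.append((i, j, length))
    return sorted(results)

if __name__ == "__main__":
    for i, j, n in common_substrings_min_k("ACGTACGT", "TTACGTAA", 4):
        print(i, j, n, "ACGTACGT"[i:i + n])
```

Only seeds that appear in both texts trigger any extension work, which mirrors the abstract's point that the second stage depends on the number of common k-mers rather than on the full k-mer spectrum.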
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00012
Eloisa Nguyen, Rebecca Z Lin, Yang Gong, Cui Tao, Muhammad Tuan Amith
Many studies have examined the impact of exercise and other physical activities on the health outcomes of individuals. These physical activities entail an intricate sequence and series of physical anatomy, physiological movement, and movement of the anatomy. To better understand how these components interact with one another and their downstream impact on health outcomes, there needs to be an information model that conceptualizes all entities involved. In this study, we introduce our early development of an ontology model to computationally describe human physical activities and the various entities that compose each activity. We developed an open-source biomedical ontology called the Kinetic Human Movement Ontology that reuses OBO Foundry terminologies and is encoded in OWL2. We applied this ontology in modeling and linking a specific Tai Chi movement. The contribution of this work could enable modeling of information relating to human physical activity, like exercise, and lead towards information standardization of human movement for analysis. Future work will include expanding our ontology to include more expressive information and completely modeling entire sets of movement from human physical activity.
Title: Developing a computational representation of human physical activity and exercise using open ontology-based approach: a Tai Chi use case. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 31-39. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11503552/pdf/
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00032
Richard Li Xu, Song Wang, Zewei Wang, Yuhan Zhang, Yunyu Xiao, Jyotishman Pathak, David Hodge, Yan Leng, S Craig Watkins, Ying Ding, Yifan Peng
Social factors like family background, education level, financial status, and stress can impact public health outcomes, such as suicidal ideation. However, the analysis of social factors for suicide prevention has been limited by the lack of up-to-date suicide reporting data, variations in reporting practices, and small sample sizes. In this study, we analyzed 172,629 suicide incidents from 2014 to 2020 utilizing the National Violent Death Reporting System Restricted Access Database (NVDRS-RAD). Logistic regression models were developed to examine the relationships between demographics and suicide-related circumstances. Trends over time were assessed, and Latent Dirichlet Allocation (LDA) was used to identify common suicide-related social factors. Mental health, interpersonal relationships, mental health treatment and disclosure, and school/work-related stressors were identified as the main themes of suicide-related social factors. This study also identified systemic disparities across various population groups, particularly concerning Black individuals, young people aged under 24, healthcare practitioners, and those with limited educational backgrounds, which shed light on potential directions for demographic-specific suicide prevention interventions.
Title: Analyzing Social Factors to Enhance Suicide Prevention Across Population Groups. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 189-199. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11450796/pdf/
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00084
Liyue Fan, Ashley Bang, Luca Bonomi
Data synthesis can address important data availability challenges in biomedical informatics. Quantitative evaluation of generative models may help understand their applications to synthesizing biomedical data. This poster paper examines state-of-the-art generative models used in medical imaging, such as StyleGAN and DDPM models, and evaluates their performance in learning data manifolds and in the visible features of generated samples. Results show that existing generative models have much to improve based on the studied measures.
Title: Evaluating Generative Models in Medical Imaging. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 553-555. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11508590/pdf/
Pub Date: 2024-06-01; Epub Date: 2024-08-22; DOI: 10.1109/ichi61247.2024.00009
Yuxi Liu, Zhenhao Zhang, Shaowen Qin, Flora D Salim, Jiang Bian, Antonio Jimeno Yepes
Predictive analytics using Electronic Health Records (EHRs) have become an active research area in recent years, especially with the development of deep learning techniques. A popular EHR data analysis paradigm in deep learning is patient representation learning, which aims to learn a condensed mathematical representation of individual patients. However, EHR data are often inherently irregular, i.e., data entries were captured at different times as well as with different contents due to the individualized needs of each patient. Most of the work focused on the provision of deep neural networks with attention mechanisms that generate complete patient representations that can be readily used for downstream prediction tasks. However, such approaches fail to take patient similarity into account, which is generally used in clinical reasoning scenarios. This study presents a new Contrastive Graph Similarity Network for similarity calculation among patients in large EHR datasets. Particularly, we apply graph-based similarity analysis that explicitly extracts the clinical characteristics of each patient and aggregates the information of similar patients to generate rich patient representations. Experimental results on real-world EHR databases demonstrate the effectiveness and superiority of our method for the task of vital signs imputation and ICU patient deterioration prediction.
Title: Fine-grained Patient Similarity Measuring using Contrastive Graph Similarity Networks. Published in: IEEE International Conference on Healthcare Informatics, vol. 2024, pp. 1-10. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11654828/pdf/
Pub Date: 2023-06-01; Epub Date: 2023-12-11; DOI: 10.1109/ichi57859.2023.00022
Liyue Fan, Luca Bonomi
Deep neural networks have been increasingly integrated in healthcare applications to enable accurate predictive analyses. Sharing trained deep models not only facilitates knowledge integration in collaborative research efforts but also enables equitable access to computational intelligence. However, recent studies have shown that an adversary may leverage a shared model to learn the participation of a target individual in the training set. In this work, we investigate privacy-protecting model sharing for survival studies. Specifically, we pose three research questions. (1) Do deep survival models leak membership information? (2) How effective is differential privacy in defending against membership inference in deep survival analyses? (3) Are there other effects of differential privacy on deep survival analyses? Our study assesses the membership leakage in emerging deep survival models and develops differentially private training procedures to provide rigorous privacy protection. The experimental results show that deep survival models leak membership information and that our approach effectively reduces membership inference risks. The results also show that differential privacy introduces a limited performance loss and may improve model robustness in the presence of noisy data, compared to non-private models.
Title: Mitigating Membership Inference in Deep Survival Analyses with Differential Privacy. Published in: IEEE International Conference on Healthcare Informatics, vol. 2023, pp. 81-90. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10751041/pdf/
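The abstract does not say which differentially private training procedure is used. As a hedged illustration of the general idea, the sketch below shows one gradient-aggregation step in the style of DP-SGD (per-example gradient clipping followed by Gaussian noise), a widely used approach that may or may not match the authors' procedure. The function name and parameters are hypothetical, and no privacy accounting is included.

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Aggregate per-example gradients with clipping and Gaussian noise.

    per_example_grads: array of shape (batch_size, n_params).
    clip_norm: maximum L2 norm allowed for each example's gradient.
    noise_multiplier: noise std is noise_multiplier * clip_norm.
    """
    g = np.asarray(per_example_grads, dtype=float)
    # Scale each example's gradient down so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = g * scale
    # Add noise to the sum, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=g.shape[1])
    return (clipped.sum(axis=0) + noise) / g.shape[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = np.array([[3.0, 4.0], [0.3, 0.4]])
    print(dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single training record can move the update, which is what limits the signal available to a membership inference attack; the noise multiplier trades that protection against model accuracy.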
Pub Date: 2023-06-01; Epub Date: 2023-12-11; DOI: 10.1109/ichi57859.2023.00062
Riyad Bin Rafiq, Syed Araib Karim, Mark V Albert
Fast and flexible communication options are limited for speech-impaired people. Hand gestures coupled with fast, generated speech can enable a more natural social dynamic for those individuals - particularly individuals without the fine motor skills to type on a keyboard or tablet reliably. We created a mobile phone application prototype that generates audible responses associated with trained hand movements and collects and organizes the accelerometer data for rapid training to allow tailored models for individuals who may not be able to perform standard movements such as sign language. Six participants performed 11 distinct gestures to produce the dataset. A mobile application was developed that integrated a bidirectional LSTM network architecture which was trained from this data. After evaluation using nested subject-wise cross-validation, our integrated bidirectional LSTM model demonstrates an overall recall of 91.8% in recognition of these pre-selected 11 hand gestures, with recall at 95.8% when two commonly confused gestures were not assessed. This prototype is a step in creating a mobile phone system capable of capturing new gestures and developing tailored gesture recognition models for individuals in speech-impaired populations. Further refinement of this prototype can enable fast and efficient communication with the goal of further improving social interaction for individuals unable to speak.
An LSTM-based Gesture-to-Speech Recognition System. Riyad Bin Rafiq, Syed Araib Karim, Mark V Albert. IEEE International Conference on Healthcare Informatics, vol. 2023, pp. 430-438. Published 2023-06-01. DOI: 10.1109/ichi57859.2023.00062. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10894657/pdf/
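The nested subject-wise cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a leave-one-subject-out scheme at both levels, which the abstract does not specify; the function names and participant IDs are hypothetical and are not taken from the paper.

```python
from typing import Iterator, List, Sequence, Tuple


def subject_wise_folds(
    subjects: Sequence[str],
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx), holding out all samples of one subject at a time."""
    for held_out in sorted(set(subjects)):
        test = [i for i, s in enumerate(subjects) if s == held_out]
        train = [i for i, s in enumerate(subjects) if s != held_out]
        yield train, test


def nested_subject_wise_folds(subjects: Sequence[str]):
    """Outer folds are for evaluation; inner folds (drawn only from the
    outer training subjects) are for model selection."""
    for outer_train, outer_test in subject_wise_folds(subjects):
        inner_subjects = [subjects[i] for i in outer_train]
        # Map inner indices back to positions in the full dataset.
        inner = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in subject_wise_folds(inner_subjects)
        ]
        yield outer_train, outer_test, inner


# Assumption: six hypothetical participants with two recordings each.
subjects = [f"P{k}" for k in range(1, 7) for _ in range(2)]
folds = list(nested_subject_wise_folds(subjects))
```

Because every split is made by subject rather than by sample, no participant contributes data to both the training and the test side of a fold. This is the property that lets the evaluation estimate performance on an unseen user.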
Pub Date : 2023-06-01 Epub Date: 2023-12-11 DOI: 10.1109/ichi57859.2023.00102
Xiaoyu Wang, Dipankar Gupta, Michael Killian, Zhe He
Electronic health records (EHR) have been widely used to build machine learning models for health outcome prediction. However, many EHR-based models are inherently biased because they lack risk factors related to social determinants of health (SDoH), which are responsible for up to 40% of preventable deaths. Because SDoH information is often captured in clinical notes, recent efforts have extracted such information from notes with natural language processing and appended it to other structured data. In this work, we benchmark 7 pre-trained transformer-based models (BERT, ALBERT, BioBERT, BioClinicalBERT, RoBERTa, ELECTRA, and RoBERTa-MIMIC-Trial) for recognizing SDoH terms using a previously annotated corpus of MIMIC-III clinical notes. Our study shows that the BioClinicalBERT model achieves the best F1 scores under the strict and relaxed criteria (0.911 and 0.923, respectively). This work shows the promise of transformer-based models for recognizing SDoH information in clinical notes.
Benchmarking Transformer-Based Models for Identifying Social Determinants of Health in Clinical Notes. Xiaoyu Wang, Dipankar Gupta, Michael Killian, Zhe He. IEEE International Conference on Healthcare Informatics, vol. 2023, pp. 570-574. Published 2023-06-01. DOI: 10.1109/ichi57859.2023.00102. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10795706/pdf/
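The difference between strict and relaxed F1 scoring, as reported in the abstract above, can be illustrated with a small sketch. It assumes the common convention that a strict match requires identical span boundaries and label, while a relaxed match requires only overlapping spans with the same label. The abstract does not define the criteria, so this may differ from the paper's exact rules. The labels and character offsets are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def _match(pred: Span, gold: Span, relaxed: bool) -> bool:
    """Return True if a predicted span counts as matching a gold span."""
    if pred[2] != gold[2]:
        return False
    if relaxed:
        # Any character overlap with the same label is accepted.
        return pred[0] < gold[1] and gold[0] < pred[1]
    # Strict: boundaries must be identical.
    return pred[0] == gold[0] and pred[1] == gold[1]


def f1_score(gold: List[Span], pred: List[Span], relaxed: bool = False) -> float:
    """Entity-level F1 under the strict or relaxed matching rule."""
    matched_pred = sum(any(_match(p, g, relaxed) for g in gold) for p in pred)
    matched_gold = sum(any(_match(p, g, relaxed) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: the second prediction has a shifted start offset.
gold = [(0, 8, "HOUSING"), (20, 27, "TOBACCO"), (40, 48, "EMPLOYMENT")]
pred = [(0, 8, "HOUSING"), (22, 27, "TOBACCO"), (60, 65, "ALCOHOL")]

strict_f1 = f1_score(gold, pred, relaxed=False)
relaxed_f1 = f1_score(gold, pred, relaxed=True)
```

Under these definitions the relaxed score can never be lower than the strict score, since every exact match is also an overlapping match.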