Melanoma is the most common form of cancer in the world. Early diagnosis of the disease and an accurate estimation of its size and shape are crucial in preventing its spread to other body parts. Manual segmentation of these lesions by a radiologist, however, is time-consuming and error-prone. It is clinically desirable to have an automatic tool to detect malignant skin lesions from dermoscopic skin images. We propose a novel end-to-end convolutional neural network (CNN) for precise and robust skin lesion localization and segmentation. The proposed network has three sub-encoders branching out from the main encoder. The three sub-encoders are inspired by Coordinate Convolution, Hourglass, and Octave Convolutional blocks: each sub-encoder summarizes different patterns, yet collectively they aim to achieve a precise segmentation. We trained our segmentation model only on the ISIC 2018 dataset. To demonstrate the generalizability of our model, we evaluated it on ISIC 2018 and on unseen datasets, including ISIC 2017 and PH2. Our approach showed an average 5% improvement in performance across datasets while having fewer than half the parameters of other state-of-the-art segmentation models.
{"title":"B-SegNet: branched-SegMentor network for skin lesion segmentation","authors":"Shreshth Saini, Y. Jeon, Mengling Feng","doi":"10.1145/3450439.3451873","DOIUrl":"https://doi.org/10.1145/3450439.3451873","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
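The B-SegNet abstract names Coordinate Convolution as the inspiration for one sub-encoder. The sketch below illustrates only the general idea usually associated with that block: appending channels that hold normalized row and column coordinates to a feature map before it is convolved. The function name, the nested-list layout (channels x height x width), and the [-1, 1] scaling are assumptions made for illustration; this is not the authors' implementation.

```python
def add_coord_channels(feature_map):
    """Append normalized row/column coordinate channels to a C x H x W nested list."""
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def scale(index, size):
        # Map 0..size-1 onto [-1, 1]; a single row or column gets 0.0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # Row channel: every pixel stores its (scaled) row index.
    row_channel = [[scale(r, height) for _ in range(width)] for r in range(height)]
    # Column channel: every pixel stores its (scaled) column index.
    col_channel = [[scale(c, width) for c in range(width)] for _ in range(height)]
    return feature_map + [row_channel, col_channel]


# Hypothetical one-channel, 2 x 3 input.
image = [[[0.5, 0.1, 0.9], [0.3, 0.7, 0.2]]]
augmented = add_coord_channels(image)
```

A convolution applied to `augmented` can then condition on position as well as appearance, which is the property that makes the idea plausible for lesion localization.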
Matthew Saponaro, Ajith Vemuri, G. Dominick, Keith S. Decker
Wearable technology opens opportunities to reduce sedentary behavior; however, commercially available devices do not provide tailored coaching strategies. Just-In-Time Adaptive Interventions (JITAIs) provide such a framework; however, most JITAIs are conceptual to date. We conduct a study to evaluate just-in-time nudges in free-living conditions in terms of receptiveness and nudge impact. We first quantify baseline behavioral patterns in context using features such as location and step count, and assess differences in individual responses. We show there is a strong inverse relationship between average daily step count and time spent sedentary, indicating that steps are taken steadily throughout the day rather than in large bursts. Interestingly, the effect of nudges delivered at the workplace is larger in terms of step count than that of nudges delivered at home. We develop Random Forest models to learn nudge receptiveness using both individualized and contextualized data. We show that step count is the least important predictor of nudge receptiveness, while location is the most important. Furthermore, we compare the developed models with a commercially available smart coach using post-hoc analysis. The results show that using contextualized and individualized information significantly outperforms non-JITAI approaches in determining nudge receptiveness.
{"title":"Contextualization and individualization for just-in-time adaptive interventions to reduce sedentary behavior","authors":"Matthew Saponaro, Ajith Vemuri, G. Dominick, Keith S. Decker","doi":"10.1145/3450439.3451874","DOIUrl":"https://doi.org/10.1145/3450439.3451874","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
Konstantin D. Pandl, Fabian Feiland, Scott Thiebes, A. Sunyaev
Collecting data from many sources is an essential approach to generate large data sets required for the training of machine learning models. Trustworthy machine learning requires incentives, guarantees of data quality, and information privacy. Applying recent advancements in data valuation methods for machine learning can help to enable these. In this work, we analyze the suitability of three different data valuation methods for medical image classification tasks, specifically pleural effusion, on an extensive data set of chest X-ray scans. Our results reveal that a heuristic for calculating the Shapley valuation scheme based on a k-nearest neighbor classifier can successfully value large quantities of data instances. We also demonstrate possible applications for incentivizing data sharing, the efficient detection of mislabeled data, and summarizing data sets to exclude private information. Thereby, this work contributes to developing modern data infrastructures for trustworthy machine learning in health care.
{"title":"Trustworthy machine learning for health care: scalable data valuation with the shapley value","authors":"Konstantin D. Pandl, Fabian Feiland, Scott Thiebes, A. Sunyaev","doi":"10.1145/3450439.3451861","DOIUrl":"https://doi.org/10.1145/3450439.3451861","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
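The data valuation abstract above refers to computing Shapley values with a k-nearest neighbor classifier. As a rough illustration, the sketch below uses the closed-form recursion commonly stated for unweighted k-NN classification with a single test point: training points are sorted by distance, the farthest point receives `match / n`, and each nearer point is derived from its successor. Whether the paper's heuristic matches this recursion exactly is an assumption; the function name and the toy data are hypothetical, and at least one training point is assumed.

```python
def knn_shapley(train_points, train_labels, test_point, test_label, k):
    """Shapley value of each training point for one test point under unweighted k-NN."""
    n = len(train_points)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Training indices ordered from nearest to farthest from the test point.
    order = sorted(range(n), key=lambda i: squared_distance(train_points[i], test_point))
    # 1.0 where the training label agrees with the test label, else 0.0.
    match = [1.0 if train_labels[i] == test_label else 0.0 for i in order]

    values = [0.0] * n
    # Start at the farthest point, then walk back toward the nearest one.
    values[order[n - 1]] = match[n - 1] / n
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based rank of the point at this position
        step = (match[pos] - match[pos + 1]) / k * min(k, rank) / rank
        values[order[pos]] = values[order[pos + 1]] + step
    return values


# Hypothetical 1-D example: the nearest point carries the correct label.
values = knn_shapley([[0.0], [1.0]], [1, 0], [0.1], 1, k=1)
```

Because the cost is dominated by one sort per test point, this kind of recursion scales to large training sets, which is consistent with the abstract's emphasis on valuing large quantities of data instances. Points with low or negative value are natural candidates when looking for mislabeled data.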
Machine learning algorithms in healthcare have the potential to continually learn from real-world data generated during healthcare delivery and adapt to dataset shifts. As such, regulatory bodies like the US FDA have begun discussions on how to autonomously approve modifications to algorithms. Current proposals evaluate algorithmic modifications via hypothesis testing and control a definition of online approval error that only applies if the data is stationary over time, which is unlikely in practice. To this end, we investigate designing approval policies for modifications to ML algorithms in the presence of distributional shifts. Our key observation is that the approval policy most efficient at identifying and approving beneficial modifications varies across problem settings. So, rather than selecting a fixed approval policy a priori, we propose learning the best approval policy by searching over a family of approval strategies. We define a family of strategies that range in their level of optimism when approving modifications. To protect against settings where no version of the ML algorithm performs well, this family includes a pessimistic strategy that rescinds approval. We use the exponentially weighted averaging forecaster (EWAF) to learn the most appropriate strategy and derive tighter regret bounds assuming the distributional shifts are bounded. In simulation studies and empirical analyses, we find that wrapping approval strategies within EWAF is a simple yet effective approach to protect against distributional shifts without significantly slowing down approval of beneficial modifications.
{"title":"Learning to safely approve updates to machine learning algorithms","authors":"Jean Feng","doi":"10.1145/3450439.3451864","DOIUrl":"https://doi.org/10.1145/3450439.3451864","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
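The abstract above wraps a family of approval strategies in the exponentially weighted averaging forecaster (EWAF). A minimal sketch of the standard EWAF weighting rule follows: each strategy's weight is proportional to `exp(-eta * cumulative_loss)`. The loss values, the learning rate, and the function names are illustrative assumptions; the paper's actual loss definition and strategy family are not reproduced here.

```python
import math


def ewaf_weights(cumulative_losses, learning_rate):
    """Exponentially weighted averaging: weight_i is proportional to exp(-eta * loss_i)."""
    best = min(cumulative_losses)
    # Subtracting the smallest loss keeps the exponentials numerically stable.
    raw = [math.exp(-learning_rate * (loss - best)) for loss in cumulative_losses]
    total = sum(raw)
    return [r / total for r in raw]


def run_ewaf(loss_rounds, learning_rate):
    """Return the weights used in each round, given per-round losses per strategy."""
    cumulative = [0.0] * len(loss_rounds[0])
    history = []
    for losses in loss_rounds:
        # Weights are chosen before the current round's losses are revealed.
        history.append(ewaf_weights(cumulative, learning_rate))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return history


# Hypothetical losses for two strategies over two rounds.
history = run_ewaf([[0.0, 1.0], [0.0, 1.0]], learning_rate=1.0)
```

The forecaster starts from uniform weights and shifts mass toward whichever strategy has accumulated less loss, so a pessimistic strategy gains weight only when the optimistic ones perform poorly.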
Raouf Kerkouche, G. Ács, C. Castelluccia, P. Genevès
Machine Learning, and in particular Federated Machine Learning, opens new perspectives in terms of medical research and patient care. Although Federated Machine Learning improves over centralized Machine Learning in terms of privacy, it does not provide provable privacy guarantees. Furthermore, Federated Machine Learning is quite expensive in terms of bandwidth consumption, as it requires participant nodes to regularly exchange large updates. This paper proposes a bandwidth-efficient, privacy-preserving Federated Learning approach that provides theoretical privacy guarantees based on Differential Privacy. We experimentally evaluate our proposal for in-hospital mortality prediction using a real dataset containing Electronic Health Records of about one million patients. Our results suggest that strong and provable patient-level privacy can be enforced at the expense of only a moderate loss of prediction accuracy.
{"title":"Privacy-preserving and bandwidth-efficient federated learning: an application to in-hospital mortality prediction","authors":"Raouf Kerkouche, G. Ács, C. Castelluccia, P. Genevès","doi":"10.1145/3450439.3451859","DOIUrl":"https://doi.org/10.1145/3450439.3451859","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
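The federated learning abstract above bases its guarantees on Differential Privacy. The sketch below shows the generic building block usually used for that purpose in federated averaging: clip each participant's update to a fixed L2 norm, sum, add Gaussian noise scaled to the clipping norm, and average. It is an assumption that the paper follows this pattern; the function names and parameters are hypothetical, the mapping from `noise_multiplier` to a privacy budget is omitted, and the paper's bandwidth reduction is not modeled.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in update]


def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped updates after adding Gaussian noise to their sum."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    # Noise standard deviation is tied to the per-participant sensitivity.
    sigma = noise_multiplier * clip_norm
    noisy_sum = [sum(u[d] for u in clipped) + rng.gauss(0.0, sigma) for d in range(dim)]
    return [s / len(updates) for s in noisy_sum]


# Hypothetical updates from two participants; seeded generator for repeatability.
rng = random.Random(0)
aggregate = private_average([[3.0, 4.0], [0.0, 0.5]], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single participant can move the aggregate, which is what lets the added noise translate into a formal guarantee; a larger `noise_multiplier` strengthens privacy at the cost of accuracy, matching the trade-off the abstract reports.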
Guimin Dong, Lihua Cai, Debajyoti Datta, Shashwat Kumar, Laura E. Barnes, M. Boukhechba
Early detection of influenza-like symptoms can help prevent the spread of flu viruses and enable timely treatment, particularly in the post-pandemic era. Mobile sensing leverages an increasingly diverse set of embedded sensors to capture fine-grained information about human behaviors and ambient contexts, and can serve as a promising solution for influenza-like symptom recognition. Traditionally, handcrafted and high-level features of mobile sensing data are extracted by manual feature engineering and convolutional/recurrent neural networks, respectively. In this work, we apply graph representation to encode the dynamics of state transitions and internal dependencies in human behaviors, leverage graph embeddings to automatically extract the topological and spatial features from graph inputs, and propose an end-to-end graph neural network (GNN) model with multi-channel mobile sensing input for influenza-like symptom recognition based on people's daily mobility, social interactions, and physical activities. Using data generated from 448 participants, we show that a GNN with GraphSAGE convolutional layers significantly outperforms baseline models with handcrafted features. Furthermore, we use a GNN interpretability method to generate insights (e.g., important nodes and graph structures) about the importance of mobile sensing for recognizing influenza-like symptoms. To the best of our knowledge, this is the first work that applies graph representation and graph neural networks to mobile sensing data for graph-based human behavior modeling and health symptom prediction.
{"title":"Influenza-like symptom recognition using mobile sensing and graph neural networks","authors":"Guimin Dong, Lihua Cai, Debajyoti Datta, Shashwat Kumar, Laura E. Barnes, M. Boukhechba","doi":"10.1145/3450439.3451880","DOIUrl":"https://doi.org/10.1145/3450439.3451880","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-04-08"}
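The mobile sensing abstract above relies on GraphSAGE convolutional layers. A minimal sketch of one such layer with a mean aggregator is given below: each node combines a linear map of its own features with a linear map of the mean of its neighbors' features, followed by a ReLU. The node names, the identity weight matrices, and the function name are hypothetical; the paper's graph construction and trained weights are not reproduced.

```python
def sage_mean_layer(features, neighbors, weight_self, weight_neigh):
    """One GraphSAGE-style layer: ReLU(W_self h_v + W_neigh mean(h_u for u in N(v)))."""

    def matvec(matrix, vector):
        return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

    dim = len(next(iter(features.values())))
    output = {}
    for node, h_self in features.items():
        neigh = neighbors.get(node, [])
        if neigh:
            mean = [sum(features[u][d] for u in neigh) / len(neigh) for d in range(dim)]
        else:
            mean = [0.0] * dim  # isolated node: no neighbor signal
        combined = [
            a + b
            for a, b in zip(matvec(weight_self, h_self), matvec(weight_neigh, mean))
        ]
        output[node] = [max(0.0, value) for value in combined]  # ReLU
    return output


# Hypothetical behavior graph: places as nodes, transitions as edges.
features = {"home": [1.0, 0.0], "work": [0.0, 1.0], "gym": [1.0, 1.0]}
neighbors = {"home": ["work", "gym"], "work": ["home"], "gym": ["home"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
embeddings = sage_mean_layer(features, neighbors, identity, identity)
```

Stacking several such layers lets each node's embedding reflect progressively larger neighborhoods, which is how a model of this kind can pick up the graph structures the abstract mentions.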
Haoran Zhang, Natalie Dullerud, L. Seyyed-Kalantari, Q. Morris, Shalmali Joshi, M. Ghassemi
Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances across environments. In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data. We introduce a framework to induce synthetic but realistic domain shifts and sampling bias to stress-test these methods over existing non-healthcare benchmarks. We find that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data, in line with prior work on general imaging datasets. However, a subset of realistic induced-shift scenarios in clinical time series data exhibit limited performance gains. We characterize these scenarios in detail, and recommend best practices for domain generalization in the clinical setting.
{"title":"An empirical framework for domain generalization in clinical settings","authors":"Haoran Zhang, Natalie Dullerud, L. Seyyed-Kalantari, Q. Morris, Shalmali Joshi, M. Ghassemi","doi":"10.1145/3450439.3451878","DOIUrl":"https://doi.org/10.1145/3450439.3451878","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-03-20"}
B. Maag, S. Feuerriegel, Mathias Kraus, M. Saar-Tsechansky, Thomas Züger
In medicine, comorbidities refer to the presence of multiple, co-occurring diseases. Due to their co-occurring nature, the course of one comorbidity is often highly dependent on the course of the other disease and, hence, treatments can have significant spill-over effects. Despite the prevalence of comorbidities among patients, a comprehensive statistical framework for modeling the longitudinal dynamics of comorbidities is missing. In this paper, we propose a probabilistic model for analyzing comorbidity dynamics over time in patients. Specifically, we develop a coupled hidden Markov model with a personalized, non-homogeneous transition mechanism, named Comorbidity-HMM. The specification of our Comorbidity-HMM is informed by clinical research: (1) It accounts for different disease states (i.e., acute, stable) in the disease progression by introducing latent states that are clinically meaningful. (2) It models a coupling among the trajectories of comorbidities to capture co-evolution dynamics. (3) It considers between-patient heterogeneity (e.g., risk factors, treatments) in the transition mechanism. Based on our model, we define a spill-over effect that measures the indirect effect of treatments on patient trajectories through coupling (i.e., through comorbidity co-evolution). We evaluate our proposed Comorbidity-HMM on 675 health trajectories, in which we investigate the joint progression of diabetes mellitus and chronic liver disease. Compared to alternative models without coupling, we find that our Comorbidity-HMM achieves a superior fit. Further, we quantify the spill-over effect, that is, to what extent diabetes treatments are associated with a change in chronic liver disease from an acute to a stable disease state. Thus, our model is of direct relevance for both treatment planning and clinical research in the context of comorbidities.
{"title":"Modeling longitudinal dynamics of comorbidities","authors":"B. Maag, S. Feuerriegel, Mathias Kraus, M. Saar-Tsechansky, Thomas Züger","doi":"10.1145/3450439.3451871","DOIUrl":"https://doi.org/10.1145/3450439.3451871","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-03-14"}
Alvin Chan, A. Korsakova, Y. Ong, F. Winnerdy, K. W. Lim, A. Phan
A single gene can encode different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose the discrete compositional energy network (DCEN), which leverages the hierarchical relationships between splice sites, junctions, and transcripts to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through the energy values of their constituent splice junctions. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.
{"title":"RNA alternative splicing prediction with discrete compositional energy network","authors":"Alvin Chan, A. Korsakova, Y. Ong, F. Winnerdy, K. W. Lim, A. Phan","doi":"10.1145/3450439.3451857","DOIUrl":"https://doi.org/10.1145/3450439.3451857","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-03-07"}
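The splicing abstract above states that transcript probabilities are modeled through the energy values of constituent splice junctions. One plausible reading, sketched below, is an energy-based model in which a transcript's energy is the sum of its junction energies and probabilities come from a softmax over negative energies. Both the additive composition and the softmax form are assumptions made for illustration, as are the junction names and energy values; the network that would produce the energies is not shown.

```python
import math


def transcript_probabilities(transcripts, junction_energy):
    """Softmax over negative transcript energies; each energy sums junction energies."""
    energies = {
        name: sum(junction_energy[j] for j in junctions)
        for name, junctions in transcripts.items()
    }
    lowest = min(energies.values())
    # Shift by the lowest energy for numerical stability before exponentiating.
    unnormalized = {name: math.exp(-(e - lowest)) for name, e in energies.items()}
    total = sum(unnormalized.values())
    return {name: u / total for name, u in unnormalized.items()}


# Hypothetical gene with three exons: one full transcript, one exon-skipping transcript.
junction_energy = {"e1-e2": 0.5, "e2-e3": 0.5, "e1-e3": 1.0}
transcripts = {"full": ["e1-e2", "e2-e3"], "skip": ["e1-e3"]}
probabilities = transcript_probabilities(transcripts, junction_energy)
```

Under this reading, lowering the energy of a single junction raises the probability of every transcript that uses it, which reflects the junction-to-transcript hierarchy the abstract describes.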
Aniruddh Raghu, J. Guttag, K. Young, E. Pomerantsev, Adrian V. Dalca, Collin M. Stultz
The impact of machine learning models on healthcare will depend on the degree of trust that healthcare professionals place in the predictions made by these models. In this paper, we present a method to provide individuals with clinical expertise with domain-relevant evidence about why a prediction should be trusted. We first design a probabilistic model that relates meaningful latent concepts to prediction targets and observed data. Inference of latent variables in this model corresponds to both making a prediction and providing supporting evidence for that prediction. We present a two-step process to efficiently approximate inference: (i) estimating model parameters using variational learning, and (ii) approximating maximum a posteriori estimation of latent variables in the model using a neural network, trained with an objective derived from the probabilistic model. We demonstrate the method on the task of predicting mortality risk for patients with cardiovascular disease. Specifically, using electrocardiogram and tabular data as input, we show that our approach provides appropriate domain-relevant supporting evidence for accurate predictions.
{"title":"Learning to predict with supporting evidence: applications to clinical risk prediction","authors":"Aniruddh Raghu, J. Guttag, K. Young, E. Pomerantsev, Adrian V. Dalca, Collin M. Stultz","doi":"10.1145/3450439.3451869","DOIUrl":"https://doi.org/10.1145/3450439.3451869","journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning"},"publicationDate":"2021-03-04"}