Pub Date: 2024-05-13 DOI: 10.1016/j.patter.2024.100989
Eivind Heggernes Ask, Astrid Tschan-Plessl, Hanna Julie Hoel, Arne Kolstad, Harald Holte, Karl-Johan Malmberg
Flow cytometry is a powerful technology for high-throughput protein quantification at the single-cell level. Technical advances have substantially increased data complexity, but novel bioinformatical tools often show limitations in statistical testing, data sharing, cross-experiment comparability, or clinical data integration. We developed MetaGate as a platform for interactive statistical analysis and visualization of manually gated high-dimensional cytometry data with integration of metadata. MetaGate provides a data reduction algorithm based on a combinatorial gating system that produces a small, portable, and standardized data file. This is subsequently used to produce figures and statistical analyses through a fast web-based user interface. We demonstrate the utility of MetaGate through a comprehensive mass cytometry analysis of peripheral blood immune cells from 28 patients with diffuse large B cell lymphoma along with 17 healthy controls. Through MetaGate analysis, our study identifies key immune cell population changes associated with disease progression.
Title: MetaGate: Interactive analysis of high-dimensional cytometry data with metadata integration
Journal: Patterns
Pub Date: 2024-05-13 DOI: 10.1016/j.patter.2024.100990
Samar Samir Khalil, Noha S. Tawfik, Marco Spruit
The incidences of mental health illnesses, such as suicidal ideation and depression, are increasing, which highlights the urgent need for early detection methods. There is a growing interest in using natural language processing (NLP) models to analyze textual data from patients, but accessing patients’ data for research purposes can be challenging due to privacy concerns. Federated learning (FL) is a promising approach that can balance the need for centralized learning with data ownership sensitivity. In this study, we examine the effectiveness of FL models in detecting depression by using a simulated multilingual dataset. We analyzed social media posts in five different languages with varying sample sizes. Our findings indicate that FL achieves strong performance in most cases while maintaining clients’ privacy for both independent and non-independent client partitioning.
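The abstract does not say how the client models are combined. As a rough illustration of the general idea, the sketch below uses sample-size-weighted parameter averaging (FedAvg-style), which is a common choice when clients hold different amounts of data. The function name, the parameter layout, and the two toy clients are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical clients (for example, one per language) with unequal data sizes.
clients = [
    {"bias": [0.0], "kernel": [1.0, 2.0]},
    {"bias": [1.0], "kernel": [3.0, 4.0]},
]
sizes = [100, 300]

global_weights = federated_average(clients, sizes)
print(global_weights)
```

Only the parameters and sample counts leave each client in this scheme; the raw posts stay local, which is the privacy property the abstract refers to.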
Title: Federated learning for privacy-preserving depression detection with multilingual language models in social media posts
Journal: Patterns
Pub Date: 2024-05-10 DOI: 10.1016/j.patter.2024.100988
Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen, Dan Hendrycks
This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta’s CICERO) and general-purpose AI systems (including large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI. Finally, we outline several potential solutions: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.
Title: AI deception: A survey of examples, risks, and potential solutions
Journal: Patterns
Pub Date: 2024-05-10 DOI: 10.1016/j.patter.2024.100972
Fabio Crameri, Sari Hason
Color is crucial in scientific visualization, yet it is often misused. Addressing this, we think accessible and accurate techniques, such as color-blind friendly palettes and perceptually even gradients, are vital. Accountability and basic knowledge in data visualization are key in fostering a culture of color integrity, ensuring accurate and inclusive data representation.
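One simple check in the spirit of "perceptually even gradients" is to test whether luminance changes monotonically along a color ramp. The sketch below uses the standard sRGB relative-luminance formula. The two example ramps are made up for illustration, and monotonic luminance is only a coarse proxy for perceptual uniformity, not a full test.

```python
def srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer function for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an 8-bit sRGB color (0 = black, 1 = white)."""
    r, g, b = (srgb_to_linear(v / 255.0) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_monotonic(values) -> bool:
    """True if the sequence never changes direction."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Hypothetical ramps: a rainbow-like sequence and a plain gray ramp.
rainbow = [(0, 0, 255), (0, 255, 255), (0, 255, 0), (255, 255, 0), (255, 0, 0)]
gray_ramp = [(0, 0, 0), (64, 64, 64), (128, 128, 128), (192, 192, 192), (255, 255, 255)]

for name, ramp in (("rainbow", rainbow), ("gray", gray_ramp)):
    lum = [relative_luminance(c) for c in ramp]
    print(name, [round(v, 3) for v in lum], "monotonic:", is_monotonic(lum))
```

The rainbow ramp rises and falls in luminance several times, so equal steps in data do not look like equal steps on screen, while the gray ramp increases steadily.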
Title: Navigating color integrity in data visualization
Journal: Patterns
Pub Date: 2024-05-08 DOI: 10.1016/j.patter.2024.100991
Zhehuan Fan, Jie Yu, Xiang Zhang, Yijie Chen, Shihui Sun, Yuanyuan Zhang, Mingan Chen, Fu Xiao, Wenyong Wu, Xutong Li, Mingyue Zheng, Xiaomin Luo, Dingyan Wang
Deep-learning-based classification models are increasingly used for predicting molecular properties in drug development. However, traditional classification models using the Softmax function often give overconfident mispredictions for out-of-distribution samples, highlighting a critical lack of accurate uncertainty estimation. Such limitations can result in substantial costs and should be avoided during drug development. Inspired by advances in evidential deep learning and Posterior Network, we replaced the Softmax function with a normalizing flow to enhance the uncertainty estimation ability of the model in molecular property classification. The proposed strategy was evaluated across diverse scenarios, including simulated experiments based on a synthetic dataset, ADMET predictions, and ligand-based virtual screening. The results demonstrate that compared with the vanilla model, the proposed strategy effectively alleviates the problem of giving overconfident but incorrect predictions. Our findings support the promising application of evidential deep learning in drug development and offer a valuable framework for further research.
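The contrast described above can be illustrated with a toy one-dimensional example. This is a sketch under stated assumptions, not the authors' model. A softmax over linear logits becomes more confident the further an input lies from the training data. A density-based head, as we understand the Posterior Network idea, adds class evidence to a uniform Dirichlet prior in proportion to how likely the input is under each class. Here a fixed Gaussian stands in for the learned normalizing flow, and the class means and counts are hypothetical.

```python
import math

CLASS_MEANS = [-2.0, 2.0]   # hypothetical class centers in a 1-D feature space
CLASS_COUNTS = [100, 100]   # hypothetical number of training samples per class


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(x, mean, std=1.0):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def softmax_confidence(x):
    """Top-class probability from linear logits (equal-variance discriminant)."""
    logits = [m * x - 0.5 * m * m for m in CLASS_MEANS]
    return max(softmax(logits))


def dirichlet_prediction(x):
    """Top-class probability and total evidence from density-based pseudo-counts."""
    # alpha_c = prior (1) + class count * class-conditional density at x
    alphas = [1.0 + n * gaussian_pdf(x, m) for n, m in zip(CLASS_COUNTS, CLASS_MEANS)]
    total = sum(alphas)
    confidence = max(a / total for a in alphas)
    evidence = total - len(alphas)
    return confidence, evidence


for x in (2.0, 20.0):  # in-distribution point vs. far out-of-distribution point
    conf, evidence = dirichlet_prediction(x)
    print(f"x={x}: softmax={softmax_confidence(x):.4f}, "
          f"dirichlet={conf:.4f}, evidence={evidence:.4f}")
```

For the far-away input the softmax head stays almost fully confident, while the density-based head falls back to the uniform prior with near-zero evidence, which can then be used to flag the prediction as unreliable.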
Title: Reducing overconfident errors in molecular property classification using Posterior Network
Journal: Patterns
Pub Date: 2024-05-03 DOI: 10.1016/j.patter.2024.100983
Abdelrahman Sharafeldin, Nabil Imam, Hannah Choi
We present an end-to-end architecture for embodied exploration inspired by two biological computations: predictive coding and uncertainty minimization. The architecture can be applied to any exploration setting in a task-independent and intrinsically driven manner. We first demonstrate our approach in a maze navigation task and show that it can discover the underlying transition distributions and spatial features of the environment. Second, we apply our model to a more complex active vision task, whereby an agent actively samples its visual environment to gather information. We show that our model builds unsupervised representations through exploration that allow it to efficiently categorize visual scenes. We further show that using these representations for downstream classification leads to superior data efficiency and learning speed compared to other baselines while maintaining lower parameter complexity. Finally, the modular structure of our model facilitates interpretability, allowing us to probe its internal mechanisms and representations during exploration.
Title: Active sensing with predictive coding and uncertainty minimization
Journal: Patterns
Pub Date: 2024-05-02 DOI: 10.1016/j.patter.2024.100987
Kelly Rootes-Murdy, Sandeep Panta, Ross Kelly, Javier Romero, Yann Quidé, Murray J. Cairns, Carmel Loughland, Vaughan J. Carr, Stanley V. Catts, Assen Jablensky, Melissa J. Green, Frans Henskens, Dylan Kiltschewskij, Patricia T. Michie, Bryan Mowry, Christos Pantelis, Paul E. Rasser, William R. Reay, Ulrich Schall, Rodney J. Scott, Vince D. Calhoun
Structural neuroimaging studies have identified a combination of shared and disorder-specific patterns of gray matter (GM) deficits across psychiatric disorders. Pooling large data allows for examination of a possible common neuroanatomical basis that may identify a certain vulnerability for mental illness. Large-scale collaborative research is already facilitated by data repositories, institutionally supported databases, and data archives. However, these data-sharing methodologies can suffer from significant barriers. Federated approaches augment these approaches by enabling access or more sophisticated, shareable and scaled-up analyses of large-scale data. We examined GM alterations using Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation, an open-source, decentralized analysis application. Through federated analysis of eight sites, we identified significant overlap in the GM patterns (n = 4,102) of individuals with schizophrenia, major depressive disorder, and autism spectrum disorder. These results show cortical and subcortical regions that may indicate a shared vulnerability to psychiatric disorders.
Title: Cortical similarities in psychiatric and mood disorders identified in federated VBM analysis via COINSTAC
Journal: Patterns
Spatially resolved transcriptomics has revolutionized genome-scale transcriptomic profiling by providing high-resolution characterization of transcriptional patterns. Here, we present our spatial transcriptomics analysis framework, MUSTANG (MUlti-sample Spatial Transcriptomics data ANalysis with cross-sample transcriptional similarity Guidance), which is capable of performing multi-sample spatial transcriptomics spot cellular deconvolution by allowing both cross-sample expression-based similarity information sharing as well as spatial correlation in gene expression patterns within samples. Experiments on a semi-synthetic spatial transcriptomics dataset and three real-world spatial transcriptomics datasets demonstrate the effectiveness of MUSTANG in revealing biological insights inherent in the cellular characterization of tissue samples under study.
Title: MUSTANG: Multi-sample spatial transcriptomics data analysis with cross-sample transcriptional similarity guidance
Authors: Seyednami Niyakan, Jianting Sheng, Yuliang Cao, Xiang Zhang, Zhan Xu, Ling Wu, Stephen T.C. Wong, Xiaoning Qian
Pub Date: 2024-05-02 DOI: 10.1016/j.patter.2024.100986
Journal: Patterns
Pub Date: 2024-05-02 DOI: 10.1016/j.patter.2024.100985
Guangyu Wang, Kai Wang, Yuanxu Gao, Longbin Chen, Tianrun Gao, Yuanlin Ma, Zeyu Jiang, Guoxing Yang, Fajin Feng, Shuoping Zhang, Yifan Gu, Guangdong Liu, Lei Chen, Li-Shuang Ma, Ye Sang, Yanwen Xu, Ge Lin, Xiaohong Liu
In vitro fertilization (IVF) has revolutionized infertility treatment, benefiting millions of couples worldwide. However, current clinical practices for embryo selection rely heavily on visual inspection of morphology, which is highly variable and experience dependent. Here, we propose a comprehensive artificial intelligence (AI) system that can interpret embryo-developmental knowledge encoded in vast unlabeled multi-modal datasets and provide personalized embryo selection. This AI platform consists of a transformer-based network backbone named IVFormer and a self-supervised learning framework, VTCLR (visual-temporal contrastive learning of representations), for training multi-modal embryo representations pre-trained on large and unlabeled data. When evaluated on clinical scenarios covering the entire IVF cycle, our pre-trained AI model demonstrates accurate and reliable performance on euploidy ranking and live-birth occurrence prediction. For AI vs. physician for euploidy ranking, our model achieved superior performance across all score categories. The results demonstrate the potential of the AI system as a non-invasive, efficient, and cost-effective tool to improve embryo selection and IVF outcomes.
Title: A generalized AI system for human embryo selection covering the entire IVF cycle via multi-modal contrastive learning
Journal: Patterns
Pub Date: 2024-05-01 DOI: 10.1016/j.patter.2024.100973
Ruoqi Liu, Pin-Yu Chen, Ping Zhang
Treatment effect estimation (TEE) aims to identify the causal effects of treatments on important outcomes. Current machine-learning-based methods, mainly trained on labeled data for specific treatments or outcomes, can be sub-optimal with limited labeled data. In this article, we propose a new pre-training and fine-tuning framework, CURE (causal treatment effect estimation), for TEE from observational data. CURE is pre-trained on large-scale unlabeled patient data to learn representative contextual patient representations and fine-tuned on labeled patient data for TEE. We present a new sequence encoding approach for longitudinal patient data embedding both structure and time. Evaluated on four downstream TEE tasks, CURE outperforms the state-of-the-art methods, marking a 7% increase in area under the precision-recall curve and an 8% rise in the influence-function-based precision of estimating heterogeneous effects. Validation with four randomized clinical trials confirms its efficacy in producing trial conclusions, highlighting CURE’s capacity to supplement traditional clinical trials.
Title: CURE: A deep learning framework pre-trained on large-scale patient data for treatment effect estimation
Journal: Patterns