Constrained Pseudo-Time Ordering for Clinical Transcriptomics Data
Sachin Mathur; Hamid Mattoo; Ziv Bar-Joseph
IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 21, no. 6, pp. 2076-2088. Published 2024-08-13.
DOI: 10.1109/TCBB.2024.3442669
https://ieeexplore.ieee.org/document/10634780/
Citations: 0
Abstract
Time series RNASeq studies can reveal the dynamics of disease progression and treatment response in patients. They also provide information on biomarkers, activated and repressed pathways, and more. Data from multiple patients is nonetheless difficult to integrate, because treatment response is heterogeneous across patients and only a small number of time points is usually profiled. Given this heterogeneity, relying on the sampled time points to align data across individuals does not lead to a correct reconstruction of the response patterns. To address these challenges, we developed a new constraint-based pseudo-time ordering method for analyzing transcriptomics data in clinical and response studies. Our method assigns each sample to its placement on the response curve while respecting the order of samples within each patient. We use polynomials to represent gene expression over the duration of the study and an EM algorithm to determine the polynomial parameters and sample locations. Application to four treatment response datasets shows that our method improves on prior methods and leads to accurate orderings that provide new biological insight into the disease and response. Code for the method is available at https://github.com/Sanofi-Public/ RDCS-bulkRNASeq-pseudo ordering.
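The abstract names the ingredients (per-gene polynomials, an EM-style alternation, and a within-patient order constraint) but not the details. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes hard assignments on a fixed pseudo-time grid, a dynamic program to enforce strictly increasing placement within each patient, and rows that are already in sampling order per patient. All function names and parameters (`fit_polynomials`, `place_patient`, `constrained_pseudotime`, `n_grid`, and so on) are hypothetical.

```python
import numpy as np

def fit_polynomials(t, expr, degree):
    """Least-squares polynomial per gene; returns (degree + 1, n_genes) coefficients."""
    design = np.vander(t, degree + 1)  # columns are t^degree ... t^0
    coefs, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return coefs

def place_patient(cost):
    """Choose strictly increasing grid indices that minimize the total cost.

    cost[k, g] is the squared error of placing the patient's k-th sample
    (in sampling order) at grid point g.
    """
    n, n_grid = cost.shape
    best = np.full((n, n_grid), np.inf)
    back = np.zeros((n, n_grid), dtype=int)
    best[0] = cost[0]
    for k in range(1, n):
        run_min, run_arg = np.inf, 0  # running minimum of best[k - 1, :g]
        for g in range(1, n_grid):
            if best[k - 1, g - 1] < run_min:
                run_min, run_arg = best[k - 1, g - 1], g - 1
            best[k, g] = cost[k, g] + run_min
            back[k, g] = run_arg
    idx = [int(np.argmin(best[-1]))]
    for k in range(n - 1, 0, -1):  # backtrack from the last sample to the first
        idx.append(int(back[k, idx[-1]]))
    return np.array(idx[::-1])

def constrained_pseudotime(expr, patient, degree=2, n_grid=50, n_iter=20):
    """Alternate curve fitting and order-constrained sample placement.

    expr: (n_samples, n_genes) array; patient: (n_samples,) array of labels.
    Assumes each patient's rows appear in sampling order and n_grid is at
    least the largest number of samples per patient.
    """
    grid = np.linspace(0.0, 1.0, n_grid)
    groups = [np.flatnonzero(patient == p) for p in np.unique(patient)]
    idx = np.zeros(len(expr), dtype=int)
    for rows in groups:  # start with each patient's samples spread evenly over the grid
        idx[rows] = np.round(np.linspace(0, n_grid - 1, len(rows))).astype(int)
    history = []
    for _ in range(n_iter):
        coefs = fit_polynomials(grid[idx], expr, degree)
        curve = np.vander(grid, degree + 1) @ coefs  # (n_grid, n_genes)
        history.append(float(((expr - curve[idx]) ** 2).sum()))
        new_idx = idx.copy()
        for rows in groups:
            cost = ((expr[rows, None, :] - curve[None, :, :]) ** 2).sum(axis=2)
            new_idx[rows] = place_patient(cost)
        if np.array_equal(new_idx, idx):
            break
        idx = new_idx
    coefs = fit_polynomials(grid[idx], expr, degree)  # refit at the final placement
    return grid[idx], coefs, history

# Toy data: two patients with three visits each, progressing at different paces.
rng = np.random.default_rng(0)
true_t = np.array([0.0, 0.1, 0.3, 0.2, 0.6, 1.0])
patient = np.array([0, 0, 0, 1, 1, 1])
expr = np.column_stack([true_t, 1 - true_t ** 2]) + rng.normal(0, 0.01, (6, 2))
pt, coefs, history = constrained_pseudotime(expr, patient)
```

In this sketch the order constraint is enforced exactly by the dynamic program, so a patient's later sample can never be placed before an earlier one, while samples from different patients are free to interleave. A soft (probabilistic) assignment step, as the term EM suggests, would replace the hard placement used here.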
About the journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical, and computational methods that are central to bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results obtained from the use of these methods, programs, and databases; and the emerging field of systems biology, where many forms of data are used to create a computer-based model of a complex biological system.