Suhyun Hwangbo, Sungyoung Lee, Md Mozaffar Hosain, Taewan Goo, Seungyeoun Lee, Inyoung Kim, Taesung Park
Genes & Genomics, pp. 1415-1421. Journal Article. Published 2024-12-01 (Epub 2024-09-26). DOI: 10.1007/s13258-024-01569-9 (https://doi.org/10.1007/s13258-024-01569-9)
Kernel-based hierarchical structural component models for pathway analysis on survival phenotype.
Background: High-throughput sequencing, particularly RNA-sequencing (RNA-seq), has advanced differential gene expression analysis, revealing pathways involved in various biological conditions. Traditional pathway-based methods generally consider pathways independently, overlooking the correlations among them and ignoring the many biomarkers that are shared between pathways. In addition, most pathway-based approaches assume that biomarkers have linear effects on the phenotype of interest.
Objective: This study aims to develop the HisCoM-KernelS model to identify survival phenotype-related pathways by accommodating complex, nonlinear relationships between genes and survival outcomes, while accounting for inter-pathway correlations.
Methods: We applied the HisCoM-KernelS model to the TCGA pancreatic ductal adenocarcinoma (PDAC) RNA-seq dataset, comprising 4,498 protein-coding genes mapped to 186 KEGG pathways from 148 PDAC samples. Kernel machine regression was used to model pathway effects on survival outcomes, incorporating hierarchical gene-pathway structures. Model parameters were estimated using the alternating least squares algorithm, and the significance of pathways was assessed through a permutation test.
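The permutation test named in the Methods can be illustrated with a minimal, generic sketch. This is not the HisCoM-KernelS procedure: the statistic used here (absolute Pearson correlation) is a stand-in chosen only so the example runs, and the names `abs_correlation`, `permutation_p_value`, `n_perm`, and `seed` are hypothetical. For survival data, one would presumably permute each sample's (time, event) pair jointly and refit the model on every permutation.

```python
import random


def abs_correlation(x, y):
    """Toy statistic: absolute Pearson correlation.

    Assumption: a stand-in for illustration, not the paper's pathway statistic.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant input carries no association
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_p_value(stat_fn, x, y, n_perm=999, seed=0):
    """Estimate a p-value by shuffling the outcome y against the predictor x."""
    rng = random.Random(seed)
    observed = stat_fn(x, y)
    y_perm = list(y)  # copy so the caller's data is not modified
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between x and y
        if stat_fn(x, y_perm) >= observed:
            exceed += 1
    # Add-one correction: the observed labelling counts as one permutation,
    # so the estimate can never be exactly zero.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = list(range(20))
    y = [2 * v + 1 for v in x]
    print(permutation_p_value(abs_correlation, x, y, n_perm=199, seed=1))
```

The smallest p-value this estimator can return is 1 / (n_perm + 1), so the number of permutations limits the attainable resolution.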
Results: HisCoM-KernelS identified several pathways significantly associated with pancreatic cancer survival, including those corroborated by previous studies. HisCoM-KernelS, especially with the Gaussian kernel, showed a better balance of detection rate and number of significant pathways compared to four other existing pathway-based methods: HisCoM-PAGE, Global Test, GSEA, and CoxKM.
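For readers unfamiliar with the Gaussian kernel mentioned in the Results, the following is a minimal sketch of how a Gaussian kernel matrix over samples might be computed for the genes of one pathway. The parametrization exp(-||x_i - x_j||^2 / rho), the bandwidth name `rho`, and the function name `gaussian_kernel_matrix` are assumptions for illustration; the abstract does not state which form or bandwidth the authors used.

```python
import math


def gaussian_kernel_matrix(samples, rho):
    """Return the n x n matrix K with K[i][j] = exp(-||x_i - x_j||^2 / rho).

    samples: list of equal-length numeric vectors (one per sample), e.g. the
             expression values of the genes in a single pathway.
    rho:     positive bandwidth (hypothetical name; larger values make
             distant samples look more similar).
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    n = len(samples)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            value = math.exp(-sq_dist / rho)
            kernel[i][j] = value
            kernel[j][i] = value  # the kernel is symmetric
    return kernel


if __name__ == "__main__":
    toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # three samples, two genes
    for row in gaussian_kernel_matrix(toy, rho=2.0):
        print([round(v, 4) for v in row])
```

Because the kernel depends on the genes only through pairwise distances between samples, it can capture nonlinear gene effects without specifying them explicitly, which is the property the abstract attributes to the kernel-based approach.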
Conclusion: HisCoM-KernelS successfully extends pathway-based analysis to survival outcomes, capturing complex nonlinear gene effects and inter-pathway correlations. Its application to the TCGA PDAC dataset emphasizes its utility in identifying biologically relevant pathways, offering a robust tool for survival phenotype research in high-throughput sequencing data.
About the journal:
Genes & Genomics is an official journal of the Genetics Society of Korea (http://kgenetics.or.kr/). Membership of the Society is not required of contributors. It is a peer-reviewed international journal published in print (ISSN 1976-9571) and online (E-ISSN 2092-9293). It covers all disciplines of genetics and genomics, from prokaryotes to eukaryotes and from fundamental heredity to molecular aspects. Articles may be reviews, research articles, or short communications.