General Kernel Machine Methods for Multi-Omics Integration and Genome-Wide Association Testing With Related Individuals
Amarise Little, Ni Zhao, Anna Mikhaylova, Angela Zhang, Wodan Ling, Florian Thibord, Andrew D Johnson, Laura M Raffield, Joanne E Curran, John Blangero, Jeffrey R O'Connell, Huichun Xu, Jerome I Rotter, Stephen S Rich, Kenneth M Rice, Ming-Huei Chen, Alexander Reiner, Charles Kooperberg, Thao Vu, Lifang Hou, Myriam Fornage, Ruth J F Loos, Eimear Kenny, Rasika Mathias, Lewis Becker, Albert V Smith, Eric Boerwinkle, Bing Yu, Timothy Thornton, Michael C Wu
Genetic Epidemiology, 49(1), e22610. Journal Article. Published 2025-01-01. DOI: 10.1002/gepi.22610 (https://doi.org/10.1002/gepi.22610)
Abstract
Integrating multi-omics data may help researchers understand the genetic underpinnings of complex traits and diseases. However, how best to integrate multi-omics data and use them to address pressing scientific questions remains a challenge. One important and topical problem is how to assess the aggregate effect of multiple genomic data types (e.g., genotypes and gene expression levels) on a phenotype, particularly while accommodating routine issues such as the inclusion of related subjects in analyses. In this paper, we extend an existing composite kernel machine regression model to integrate two multi-omics data types while accommodating general correlation structures among outcomes. Because they build on the kernel machine regression framework, our methods allow for the integration of high-dimensional omics data with small, nonlinear, and interactive effects, and they accommodate general study designs. Here, we focus on scientific questions that aim to assess the association between a functional grouping (such as a gene or a pathway) and a quantitative trait of interest. We use kernel machine regression to integrate the two multi-omics data types, as they may relate to the trait, and perform a global test of association. We demonstrate the advantage of this approach over single-data-type association tests via simulation. Finally, we apply this method to a large, multi-ethnic data set to investigate how predicted gene expression and rare genetic variation may be related to two platelet traits.
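To make the idea of a composite kernel test with related individuals more concrete, the following is a minimal sketch and not the authors' implementation. It assumes linear kernels for both data types, a fixed mixing weight `rho`, and an outcome covariance `V` that is already known (in practice the variance components would be estimated under the null model). The null distribution of the score statistic is approximated by Monte Carlo draws from a weighted sum of chi-square variables, a simple stand-in for the exact or moment-matching methods usually used. All function names, parameters, and simulated data are hypothetical.

```python
import numpy as np

def linear_kernel(Z):
    """Centered linear kernel, rescaled so that trace(K) equals n."""
    Zc = Z - Z.mean(axis=0)
    K = Zc @ Zc.T
    return K / np.trace(K) * Z.shape[0]

def composite_kernel_test(y, X, K_a, K_b, V, rho=0.5, n_draws=20000, seed=0):
    """Score-type global test for the composite kernel rho*K_a + (1-rho)*K_b.

    V is the assumed-known null covariance of y (e.g. kinship plus noise).
    Returns the statistic Q and a Monte Carlo p-value.
    """
    rng = np.random.default_rng(seed)
    K = rho * K_a + (1.0 - rho) * K_b
    # Projection that removes the covariates X under covariance V.
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    P = Vinv - VinvX @ np.linalg.solve(X.T @ VinvX, VinvX.T)
    Py = P @ y
    Q = float(Py @ K @ Py)
    # Under the null, Q is distributed as sum_i lam_i * chi2_1, where lam_i
    # are the eigenvalues of K^{1/2} P K^{1/2}.
    d, U = np.linalg.eigh(K)
    K_half = (U * np.sqrt(np.clip(d, 0.0, None))) @ U.T
    lam = np.linalg.eigvalsh(K_half @ P @ K_half)
    lam = lam[lam > 1e-10 * lam.max()]
    null_Q = rng.chisquare(1, size=(n_draws, lam.size)) @ lam
    p = (1 + np.sum(null_Q >= Q)) / (n_draws + 1)
    return Q, float(p)

# Simulated example: 60 sibling pairs, rare variants G, expression E.
rng = np.random.default_rng(1)
n = 120
Phi = np.kron(np.eye(n // 2), np.array([[1.0, 0.5], [0.5, 1.0]]))  # relatedness
V = 0.4 * Phi + 0.6 * np.eye(n)                 # assumed variance components
X = np.column_stack([np.ones(n), rng.normal(size=n)])
G = rng.binomial(2, 0.05, size=(n, 10)).astype(float)
E = rng.normal(size=(n, 5))
K_g, K_e = linear_kernel(G), linear_kernel(E)

# Null outcome: covariates plus correlated noise only.
y_null = X @ np.array([1.0, 0.5]) + np.linalg.cholesky(V) @ rng.normal(size=n)
# Alternative outcome: add a strong effect of the first expression feature.
y_alt = y_null + 2.0 * E[:, 0]

Q_null, p_null = composite_kernel_test(y_null, X, K_g, K_e, V)
Q_alt, p_alt = composite_kernel_test(y_alt, X, K_g, K_e, V)
print(f"null: Q={Q_null:.1f}, p={p_null:.4f}")
print(f"alt:  Q={Q_alt:.1f}, p={p_alt:.6f}")
```

Fixing `rho` keeps the sketch short; a data-adaptive choice of the weight would require a different null distribution than the one simulated here.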
About the journal:
Genetic Epidemiology is a peer-reviewed journal for discussion of research on the genetic causes of the distribution of human traits in families and populations. Emphasis is placed on the relative contribution of genetic and environmental factors to human disease as revealed by genetic, epidemiological, and biologic investigations.
Genetic Epidemiology primarily publishes papers in statistical genetics, a research field concerned mainly with the development of statistical, bioinformatic, and computational models for analyzing genetic data. Incorporation of underlying biology and population genetics into conceptual models is favored. The Journal seeks original articles comprising either applied research or innovative statistical, mathematical, computational, or genomic methodologies that advance studies in genetic epidemiology. Other types of reports are encouraged, such as letters to the editor, topic reviews, and perspectives from other fields of research that will likely enrich the field of genetic epidemiology.