G. R. Svishcheva, A. V. Kirichenko, N. M. Belonogova, E. E. Elgaeva, Ya. A. Tsepilov, I. V. Zorkoltseva, T. I. Axenovich
Russian Journal of Genetics, published 2024-07-27. DOI: 10.1134/s1022795424700418 (https://doi.org/10.1134/s1022795424700418)
Reconstruction of a Matrix of Genotypic Correlations between Variants within a Gene for Joint Analysis of Imputed and Sequenced Data
Abstract
Combining imputed and sequenced data in a single gene-based association analysis raises the problem of reconstructing genetic correlation matrices. For a given gene, the correlations between genotypes of all imputed variants and the correlations between genotypes of all sequenced variants are known, but the correlations between pairs of variants in which one is imputed and the other is sequenced are not. To recover these correlations, we propose an efficient method based on maximising the determinant of the matrix. The method has a number of useful properties and admits an analytical solution for this task. We validated it by comparing reconstructed correlation matrices with real ones computed from individual genotypes in the UK Biobank. Gene-based association analyses performed with the SKAT, BT, and PCA methods on reconstructed and real matrices, using both simulated summary statistics and summary statistics calculated on real phenotypes, showed high reconstruction quality and robustness of the method to different gene structures.
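The abstract does not state the analytical solution, so the following is only an illustrative sketch of a determinant-maximising completion, not the authors' implementation. It assumes (this is not stated in the abstract) that some variants are present in both the imputed and the sequenced sets, so the full matrix has known blocks A, B, C, D, E and one unknown block X in the layout [[A, B, X], [Bᵀ, C, D], [Xᵀ, Dᵀ, E]]. For positive definite blocks, the standard closed form for this layout is X = B C⁻¹ D; with no shared variants the determinant is maximised by X = 0. The function name `max_det_completion` and the argument `n_shared` are hypothetical.

```python
import numpy as np


def max_det_completion(r_imp, r_seq, n_shared):
    """Fill the unknown imputed-vs-sequenced block by maximising the determinant.

    r_imp    : correlations among imputed variants, ordered [imputed-only, shared]
    r_seq    : correlations among sequenced variants, ordered [shared, sequenced-only]
    n_shared : number of variants present in both sets (an assumption of this sketch)
    """
    r_imp = np.asarray(r_imp, dtype=float)
    r_seq = np.asarray(r_seq, dtype=float)
    p = r_imp.shape[0] - n_shared  # imputed-only variants
    q = r_seq.shape[0] - n_shared  # sequenced-only variants

    a = r_imp[:p, :p]              # imputed-only vs imputed-only
    e = r_seq[n_shared:, n_shared:]  # sequenced-only vs sequenced-only

    if n_shared == 0:
        # Without any overlap, det is maximised by a zero cross block.
        x = np.zeros((p, q))
        return np.block([[a, x], [x.T, e]]), x

    b = r_imp[:p, p:]              # imputed-only vs shared
    c = r_imp[p:, p:]              # shared vs shared (assumed equal in both inputs)
    d = r_seq[:n_shared, n_shared:]  # shared vs sequenced-only

    # Closed-form maximum-determinant fill: X = B C^{-1} D.
    x = b @ np.linalg.solve(c, d)
    full = np.block([[a, b, x], [b.T, c, d], [x.T, d.T, e]])
    return full, x


# Toy example: three variants with correlation rho between neighbours;
# the middle variant is the shared one.
rho = 0.5
r_imp = np.array([[1.0, rho], [rho, 1.0]])
r_seq = np.array([[1.0, rho], [rho, 1.0]])
full, cross = max_det_completion(r_imp, r_seq, n_shared=1)
print(cross)
print(np.linalg.det(full))
```

A known property of such completions is that the inverse of the completed matrix has zeros in the positions of the filled block, which gives a quick sanity check on the result.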
About the journal:
Russian Journal of Genetics aims to make a significant contribution to the development of genetics. It publishes reviews and experimental papers in theoretical and applied genetics, presenting fundamental research on genetic processes at the molecular, cell, organism, and population levels, including the conservation and rational management of genetic resources, functional genomics, evolutionary genomics, and medical genetics.