Pacific Symposium on Biocomputing: Latest Articles

Statistical analysis of single-cell protein data.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0051
Brooke L. Fridley, Simon Vandekar, Inna Chervoneva, Julia Wrobel, Siyuan Ma
Immune modulation is considered a hallmark of cancer initiation and progression, with immune cell density being consistently associated with clinical outcomes of individuals with cancer. Multiplex immunofluorescence (mIF) microscopy combined with automated image analysis is a novel and increasingly used technique that allows for the assessment and visualization of the tumor microenvironment (TME). Recently, application of this new technology to tissue microarrays (TMAs) or whole tissue sections from large cancer studies has been used to characterize different cell populations in the TME with enhanced reproducibility and accuracy. Generally, mIF data has been used to examine the presence and abundance of immune cells in the tumor and stroma compartments; however, this aggregate measure assumes uniform patterns of immune cells throughout the TME and overlooks spatial heterogeneity. Recently, the spatial contexture of the TME has been explored with a variety of statistical methods. In this PSB workshop, speakers will present some of the state-of-the-art statistical methods for assessing the tumor immune microenvironment (TIME) from mIF data.
Lymphocyte Count Derived Polygenic Score and Interindividual Variability in CD4 T-cell Recovery in Response to Antiretroviral Therapy
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0045
Kathleen M. Cardone, S. Dudek, K. Keat, Yuki Bradford, Zinhle Cindi, Eric S. Daar, Roy Gulick, Sharon A. Riddler, Jeffrey L. Lennox, P. Sinxadi, David W. Haas, Marylyn D. Ritchie
Access to safe and effective antiretroviral therapy (ART) is a cornerstone in the global response to the HIV pandemic. Among people living with HIV, there is considerable interindividual variability in absolute CD4 T-cell recovery following initiation of virally suppressive ART. The contribution of host genetics to this variability is not well understood. We explored the contribution of a polygenic score which was derived from large, publicly available summary statistics for absolute lymphocyte count from individuals in the general population (PGSlymph) due to a lack of publicly available summary statistics for CD4 T-cell count. We explored associations with baseline CD4 T-cell count prior to ART initiation (n=4959) and change from baseline to week 48 on ART (n=3274) among treatment-naïve participants in prospective, randomized ART studies of the AIDS Clinical Trials Group. We separately examined an African-ancestry-derived and a European-ancestry-derived PGSlymph, and evaluated their performance across all participants, and also in the African and European ancestral groups separately. Multivariate models that included PGSlymph, baseline plasma HIV-1 RNA, age, sex, and 15 principal components (PCs) of genetic similarity explained ~26-27% of variability in baseline CD4 T-cell count, but PGSlymph accounted for <1% of this variability. Models that also included baseline CD4 T-cell count explained ~7-9% of variability in CD4 T-cell count increase on ART, but PGSlymph accounted for <1% of this variability. In univariate analyses, PGSlymph was not significantly associated with baseline or change in CD4 T-cell count. Among individuals of African ancestry, the African PGSlymph term in the multivariate model was significantly associated with change in CD4 T-cell count while not significant in the univariate model. 
When applied to lymphocyte count in a general medical biobank population (Penn Medicine BioBank), PGSlymph explained ~6-10% of variability in multivariate models (including age, sex, and PCs) but only ~1% in univariate models. In summary, a lymphocyte count PGS derived from the general population was not consistently associated with CD4 T-cell recovery on ART. Nonetheless, adjusting for clinical covariates is quite important when estimating such polygenic effects.
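The contrast the abstract draws between the variance explained by a full covariate-adjusted model and the share attributable to the PGS can be illustrated with an incremental R² calculation. The sketch below is only an assumption about how such a comparison might be set up, not the authors' pipeline: it fits ordinary least squares with and without a PGS column and reports the difference. The function names (`solve_linear`, `ols_r2`, `incremental_r2`) and the simulated data are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_r2(rows, y):
    """R^2 of an ordinary least-squares fit of y on rows (intercept added)."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[j] * d[k] for d in design) for k in range(p)] for j in range(p)]
    xty = [sum(d[j] * v for d, v in zip(design, y)) for j in range(p)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(bj * dj for bj, dj in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


def incremental_r2(covariates, pgs, y):
    """R^2 gained by adding the PGS column to a covariate-only model."""
    full = [list(c) + [g] for c, g in zip(covariates, pgs)]
    return ols_r2(full, y) - ols_r2(covariates, y)


# Simulated data (hypothetical): the outcome depends strongly on one
# covariate and only weakly on the PGS.
rng = random.Random(0)
covariates = [[rng.gauss(0, 1)] for _ in range(500)]
pgs = [rng.gauss(0, 1) for _ in range(500)]
y = [2.0 * c[0] + 0.1 * g + rng.gauss(0, 1) for c, g in zip(covariates, pgs)]

r2_covariates = ols_r2(covariates, y)
r2_gain = incremental_r2(covariates, pgs, y)
print(f"covariate-only R^2 = {r2_covariates:.3f}, PGS increment = {r2_gain:.4f}")
```

Because the two models are nested, the increment cannot be negative beyond floating-point error; a small increment alongside a large full-model R² is the pattern the abstract describes.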
Evidence of recent and ongoing admixture in the U.S. and influences on health and disparities.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0029
Hannah M. Seagle, J. Hellwege, Brian S. Mautz, Chun Li, Yaomin Xu, Siwei Zhang, Dan M. Roden, Tracy L. McGregor, D. V. Velez Edwards, Todd L. Edwards
Many researchers in genetics and social science incorporate information about race in their work. However, migrations (historical and forced) and social mobility have brought formerly separated populations of humans together, creating younger generations of individuals who have more complex and diverse ancestry and race profiles than older age groups. Here, we sought to better understand how temporal changes in genetic admixture influence levels of heterozygosity and impact health outcomes. We evaluated variation in genetic ancestry over 100 birth years in a cohort of 35,842 individuals with electronic health record (EHR) information in the Southeastern United States. Using the software STRUCTURE, we analyzed 2,678 ancestrally informative markers relative to three ancestral clusters (African, East Asian, and European) and observed rising levels of admixture for all clinically-defined race groups since 1990. Most race groups also exhibited increases in heterozygosity and long-range linkage disequilibrium over time, further supporting the finding of increasing admixture in young individuals in our cohort. These data are consistent with United States Census information from broader geographic areas and highlight the changing demography of the population. This increased diversity challenges classic approaches to studies of genotype-phenotype relationships which motivated us to explore the relationship between heterozygosity and disease diagnosis. Using a phenome-wide association study approach, we explored the relationship between admixture and disease risk and found that increased admixture resulted in protective associations with female reproductive disorders and increased risk for diseases with links to autoimmune dysfunction. These data suggest that tendencies in the United States population are increasing ancestral complexity over time. 
Further, these observations imply that, because both prevalence and severity of many diseases vary by race groups, complexity of ancestral origins influences health and disparities.
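As a rough illustration of tracking heterozygosity across birth cohorts, the sketch below computes per-individual observed heterozygosity from allele dosages and averages it by birth decade. It is a simplified stand-in under stated assumptions, not the study's analysis (which used STRUCTURE on ancestry informative markers): the 0/1/2 dosage coding, the function names, and the four-person toy cohort are all hypothetical.

```python
from collections import defaultdict


def observed_heterozygosity(genotypes):
    """Fraction of called markers that are heterozygous.

    Genotypes are allele dosages: 0 or 2 = homozygous, 1 = heterozygous,
    None = missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


def mean_heterozygosity_by_decade(individuals):
    """Average per-individual heterozygosity within each birth decade."""
    by_decade = defaultdict(list)
    for birth_year, genotypes in individuals:
        decade = (birth_year // 10) * 10
        by_decade[decade].append(observed_heterozygosity(genotypes))
    return {d: sum(v) / len(v) for d, v in sorted(by_decade.items())}


# Toy cohort (hypothetical): (birth year, dosages at eight markers)
cohort = [
    (1948, [0, 0, 2, 2, 0, 1, 2, 0]),
    (1952, [0, 1, 2, 2, 0, 0, 2, None]),
    (1991, [1, 1, 2, 0, 1, 0, 2, 1]),
    (1997, [1, 0, 1, 1, 2, 1, None, 1]),
]
trend = mean_heterozygosity_by_decade(cohort)
for decade, value in trend.items():
    print(decade, round(value, 3))
```

Missing calls are excluded from the denominator so that individuals with different call rates remain comparable.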
Enhancing Spatial Transcriptomics Analysis by Integrating Image-Aware Deep Learning Methods.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0035
Jiarong Song, Josh Lamstein, Vivek Gopal Ramaswamy, Michelle Webb, Gabriel Zada, Steven Finkbeiner, David W. Craig
Spatial transcriptomics (ST) represents a pivotal advancement in biomedical research, enabling the transcriptional profiling of cells within their morphological context and providing a key tool for understanding spatial heterogeneity in cancer tissues. However, current analytical approaches, akin to single-cell analysis, largely depend on gene expression, underutilizing the rich morphological information inherent in the tissue. We present a novel method integrating spatial transcriptomics and histopathological image data to better capture biologically meaningful patterns in patient data, focusing on aggressive cancer types such as glioblastoma and triple-negative breast cancer. We used a ResNet-based deep learning model to extract key morphological features from high-resolution whole-slide histology images. Spot-level PCA-reduced vectors of both the ResNet-50 analysis of the histological image and the spatial gene expression data were used in Louvain clustering to enable image-aware feature discovery. Assessment of features from image-aware clustering successfully pinpointed key biological features identified by manual histopathology, such as regions of fibrosis and necrosis, as well as improved edge definition in EGFR-rich areas. Importantly, our combinatorial approach revealed crucial characteristics seen in histopathology that gene-expression-only analysis had missed. Supplemental Material: https://github.com/davcraig75/song_psb2014/blob/main/SupplementaryData.pdf.
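The step of combining spot-level image and expression vectors before graph-based clustering can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: it takes already PCA-reduced vectors for each modality, z-scores each block, concatenates them per spot, and builds a k-nearest-neighbour edge list of the kind a community detection method such as Louvain takes as input. The clustering call itself is omitted, and the function names and toy values are hypothetical.

```python
import math


def zscore_columns(matrix):
    """Scale each column to zero mean and unit variance (constant columns -> 0)."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def joint_features(image_pcs, expression_pcs):
    """Concatenate per-spot image and expression vectors after scaling each block."""
    if len(image_pcs) != len(expression_pcs):
        raise ValueError("both modalities must cover the same spots")
    img = zscore_columns(image_pcs)
    expr = zscore_columns(expression_pcs)
    return [i + e for i, e in zip(img, expr)]


def knn_edges(features, k):
    """Undirected k-nearest-neighbour edges (i, j) with i < j, Euclidean distance."""
    edges = set()
    for i, fi in enumerate(features):
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Toy input (hypothetical): six spots, two image PCs and one expression PC each.
image_pcs = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
expression_pcs = [[1.0], [1.1], [0.9], [-1.0], [-1.1], [-0.9]]

features = joint_features(image_pcs, expression_pcs)
edges = knn_edges(features, k=2)
print(edges)
```

Scaling each block separately is one possible design choice; it keeps a modality with larger numeric range from dominating the distances, but other weightings are equally plausible.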
Generating new drug repurposing hypotheses using disease-specific hypergraphs.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0021
Ayush Jain, Marie Charpignon, Irene Y. Chen, Anthony Philippakis, Ahmed Alaa
The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on network graph representations, comprising a mixture of disease nodes and their interactions, have recently yielded new drug repurposing hypotheses, including suitable candidates for COVID-19. However, these interactomes remain aggregate by design and often lack disease specificity. This dilution of information may affect the relevance of drug node embeddings to a particular disease, the resulting drug-disease and drug-drug similarity scores, and therefore our ability to identify new targets or drug synergies. To address this problem, we propose constructing and learning disease-specific hypergraphs in which hyperedges encode biological pathways of various lengths. We use a modified node2vec algorithm to generate pathway embeddings. We evaluate our hypergraph's ability to find repurposing targets for an incurable but prevalent disease, Alzheimer's disease (AD), and compare our rank-ordered recommendations to those derived from a state-of-the-art knowledge graph, the multiscale interactome. Using our method, we successfully identified 7 promising repurposing candidates for AD that were ranked as unlikely repurposing targets by the multiscale interactome but for which the existing literature provides supporting evidence. Additionally, our drug repositioning suggestions are accompanied by explanations, eliciting plausible biological pathways. In the future, we plan on scaling our proposed method to 800+ diseases, combining single-disease hypergraphs into multi-disease hypergraphs to account for subpopulations with risk factors or encode a given patient's comorbidities to formulate personalized repurposing recommendations. Supplementary materials and code: https://github.com/ayujain04/psb_supplement.
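To make the idea of hyperedges as pathways more concrete, the sketch below stores each pathway as a named set of nodes and samples a uniform random walk that alternates between picking a hyperedge containing the current node and picking a node inside that hyperedge. This is a simplified assumption for illustration only; it is not the modified node2vec algorithm the abstract refers to, and the pathway and node names are hypothetical placeholders. Walks like these could serve as input sequences for a skip-gram style embedding model.

```python
import random


def build_incidence(hyperedges):
    """Map each node to the set of hyperedge names that contain it."""
    incidence = {}
    for name, nodes in hyperedges.items():
        for node in nodes:
            incidence.setdefault(node, set()).add(name)
    return incidence


def hypergraph_walk(hyperedges, incidence, start, length, rng):
    """Uniform random walk on a hypergraph.

    At each step, pick a hyperedge containing the current node, then pick a
    node of that hyperedge (possibly the current node again) as the next step.
    """
    walk = [start]
    current = start
    for _ in range(length - 1):
        edge = rng.choice(sorted(incidence[current]))
        current = rng.choice(sorted(hyperedges[edge]))
        walk.append(current)
    return walk


# Hypothetical pathways: each hyperedge links a drug, intermediate proteins,
# and a disease node.
hyperedges = {
    "pathway_A": ("drug_X", "protein_1", "protein_2", "disease"),
    "pathway_B": ("drug_Y", "protein_2", "protein_3", "disease"),
    "pathway_C": ("drug_X", "protein_4", "disease"),
}
incidence = build_incidence(hyperedges)
walk = hypergraph_walk(hyperedges, incidence, "drug_X", 10, random.Random(7))
print(walk)
```

Sorting the candidates before sampling keeps the walk reproducible for a given seed, since set iteration order for strings can vary between runs.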
Potential to Enhance Large Scale Molecular Assessments of Skin Photoaging through Virtual Inference of Spatial Transcriptomics from Routine Staining.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0037
Gokul Srinivasan, Matthew Davis, M. LeBoeuf, Michael Y Fatemi, Zarif L. Azher, Yunrui Lu, Alos Diallo, Marietta Saldías Montivero, Fred W. Kolling, Laurent Perrard, L. Salas, B. Christensen, Thomas J Palys, M. Karagas, Scott M. Palisoul, G. Tsongalis, L. Vaickus, Sarah Preum, Joshua J. Levy
The advent of spatial transcriptomics technologies has heralded a renaissance in research to advance our understanding of the spatial cellular and transcriptional heterogeneity within tissues. Spatial transcriptomics allows investigation of the interplay between cells, molecular pathways, and the surrounding tissue architecture and can help elucidate developmental trajectories, disease pathogenesis, and various niches in the tumor microenvironment. Photoaging is the histological and molecular skin damage resulting from chronic/acute sun exposure and is a major risk factor for skin cancer. Spatial transcriptomics technologies hold promise for improving the reliability of evaluating photoaging and developing new therapeutics. Challenges to current methods include limited focus on dermal elastosis variations and reliance on self-reported measures, which can introduce subjectivity and inconsistency. Spatial transcriptomics offers an opportunity to assess photoaging objectively and reproducibly in studies of carcinogenesis and discern the effectiveness of therapies that intervene in photoaging and prevent cancer. Evaluation of distinct histological architectures using highly-multiplexed spatial technologies can identify specific cell lineages that have been understudied due to their location beyond the depth of UV penetration. However, the cost and interpatient variability of state-of-the-art assays such as the 10x Genomics Spatial Transcriptomics assays limit the scope and scale of large-scale molecular epidemiologic studies. Here, we investigate the inference of spatial transcriptomics information from routine hematoxylin and eosin-stained (H&E) tissue slides. We employed the Visium CytAssist spatial transcriptomics assay to analyze over 18,000 genes at a 50-micron resolution for four patients from a cohort of 261 skin specimens collected adjacent to surgical resection sites for basal cell and squamous cell keratinocyte tumors. 
The spatial transcriptomics data was co-registered with 40x resolution whole slide imaging (WSI) information. We developed machine learning models that achieved a macro-averaged median AUC and F1 score of 0.80 and 0.61 and Spearman coefficient of 0.60 in inferring transcriptomic profiles across the slides, and accurately captured biological pathways across various tissue architectures.
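Two of the reported metrics, the Spearman coefficient and a macro-averaged F1 score, can be computed as in the sketch below. The abstract does not say how expression was binarized or how genes were aggregated, so the per-gene binary labels (for example, high versus low expression at each spot), the function names, and the toy values are assumptions made for illustration; AUC is not covered here.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def f1_score(truth, predicted):
    """F1 for binary labels; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(truth_by_gene, predicted_by_gene):
    """Unweighted mean of per-gene F1 scores."""
    scores = [f1_score(truth_by_gene[g], predicted_by_gene[g]) for g in truth_by_gene]
    return sum(scores) / len(scores)


# Toy example (hypothetical): measured vs. inferred expression at four spots.
measured = [1.0, 2.0, 3.0, 4.0]
inferred = [10.0, 20.0, 30.0, 45.0]
rho = spearman(measured, inferred)

truth = {"gene_a": [1, 1, 0, 0], "gene_b": [1, 0, 1, 0]}
pred = {"gene_a": [1, 0, 1, 0], "gene_b": [1, 0, 1, 0]}
score = macro_f1(truth, pred)
print(rho, score)
```

Macro averaging gives every gene equal weight regardless of how often it is expressed, which is one reason a macro F1 can sit well below the AUC for the same model.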
Citations: 0
Session Introduction: Precision Medicine: Innovative methods for advanced understanding of molecular underpinnings of disease.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0034
Yana Bromberg, Hannah Carter, Steven E. Brenner
Precision medicine, also often referred to as personalized medicine, targets the development of treatments and preventative measures specific to the individual's genomic signatures, lifestyle, and environmental conditions. The series of Precision Medicine sessions in PSB has continuously highlighted the advances in this field. Our 2024 collection of manuscripts showcases algorithmic advances that integrate data from distinct modalities and introduce innovative approaches to extract new, medically relevant information from existing data. These evolving technologies and analytical methods promise to bring closer the goals of precision medicine to improve health and increase lifespan.
Citations: 0
Session Introduction: Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0001
S. Fouladvand, Emma Pierson, Ivana Jankovic, David Ouyang, Jonathan H. Chen, Roxana Daneshjou
Artificial Intelligence (AI) models are substantially enhancing the capability to analyze complex and multi-dimensional datasets. Generative AI and deep learning models have demonstrated significant advancements in extracting knowledge from unstructured text, imaging as well as structured and tabular data. This recent breakthrough in AI has inspired research in medicine, leading to the development of numerous tools for creating clinical decision support systems, monitoring tools, image interpretation, and triaging capabilities. Nevertheless, comprehensive research is imperative to evaluate the potential impact and implications of AI systems in healthcare. At the 2024 Pacific Symposium on Biocomputing (PSB) session entitled "Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface", we spotlight research that develops and applies AI algorithms to solve real-world problems in healthcare.
Citations: 0
KombOver: Efficient k-core and K-truss based characterization of perturbations within the human gut microbiome
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0039
Nicolae Sapoval, Marko Tanevski, T. Treangen
The microbes present in the human gastrointestinal tract are regularly linked to human health and disease outcomes. Thanks to technological and methodological advances in recent years, metagenomic sequencing data, and computational methods designed to analyze metagenomic data, have contributed to improved understanding of the link between the human gut microbiome and disease. However, while numerous methods have been recently developed to extract quantitative and qualitative results from host-associated microbiome data, improved computational tools are still needed to track microbiome dynamics with short-read sequencing data. Previously we have proposed KOMB as a de novo tool for identifying copy number variations in metagenomes for characterizing microbial genome dynamics in response to perturbations. In this work, we present KombOver (KO), which includes four key contributions with respect to our previous work: (i) it scales to large microbiome study cohorts, (ii) it includes both k-core and K-truss based analysis, (iii) we provide the foundation of a theoretical understanding of the relation between various graph-based metagenome representations, and (iv) we provide an improved user experience with easier-to-run code and more descriptive outputs/results. To highlight the aforementioned benefits, we applied KO to nearly 1000 human microbiome samples, requiring less than 10 minutes and 10 GB RAM per sample to process these data. Furthermore, we highlight how graph-based approaches such as k-core and K-truss can be informative for pinpointing microbial community dynamics within a myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) cohort. KO is open source and available for download/use at: https://github.com/treangenlab/komb
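The abstract does not describe how KombOver computes its k-core and K-truss summaries, so the following is only a minimal, generic sketch of the two graph concepts on a toy graph. It is not KO's implementation; the function names (`build_graph`, `core_numbers`, `k_truss`) and the example edges are hypothetical. A node's core number is the largest k for which it belongs to a subgraph where every node has degree at least k; a k-truss keeps only edges that sit in at least k - 2 triangles.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def core_numbers(adj):
    """Core number per node, found by repeatedly peeling the lowest-degree node."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=degree.get)
        k = max(k, degree[u])  # core level never decreases during peeling
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


def k_truss(adj, k):
    """Edges of the k-truss: each edge lies in at least k - 2 triangles."""
    adj = {u: set(nbrs) for u, nbrs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # common neighbours of u and v = triangles through edge (u, v)
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {(u, v) for u in adj for v in adj[u] if u < v}


# Toy graph (hypothetical): a 4-clique a-b-c-d with a tail d-e-f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("e", "f")]
adj = build_graph(edges)
cores = core_numbers(adj)
truss4 = k_truss(adj, 4)
```

On this toy graph the clique nodes form the 3-core and the six clique edges form the 4-truss, while the tail is peeled away; this kind of dense-region isolation is what makes such decompositions useful for summarizing graph structure.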
Citations: 0
Transcript-aware analysis of rare predicted loss-of-function variants in the UK Biobank elucidate new isoform-trait associations.
Q2 Computer Science Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0020
Rachel A. Hoffing, A. Deaton, Aaron M. Holleman, Lynne Krohn, Philip J. LoGerfo, Mollie E. Plekan, Sebastian Akle Serrano, P. Nioi, Lucas D. Ward
A single gene can produce multiple transcripts with distinct molecular functions. Rare-variant association tests often aggregate all coding variants across individual genes, without accounting for the variants' presence or consequence in resulting transcript isoforms. To evaluate the utility of transcript-aware variant sets, rare predicted loss-of-function (pLOF) variants were aggregated for 17,035 protein-coding genes using 55,558 distinct transcript-specific variant sets. These sets were tested for their association with 728 circulating proteins and 188 quantitative phenotypes across 406,921 individuals in the UK Biobank. The transcript-specific approach resulted in larger estimated effects of pLOF variants decreasing serum cis-protein levels compared to the gene-based approach (p_binom ≤ 2×10⁻¹⁶). Additionally, 251 quantitative trait associations were identified as being significant using the transcript-specific approach but not the gene-based approach, including PCSK5 transcript ENST00000376752 and standing height (transcript-specific statistic, P = 1.3×10⁻¹⁶, effect = 0.7 SD decrease; gene-based statistic, P = 0.02, effect = 0.05 SD decrease) and LDLR transcript ENST00000252444 and apolipoprotein B (transcript-specific statistic, P = 5.7×10⁻²⁰, effect = 1.0 SD increase; gene-based statistic, P = 3.0×10⁻⁴, effect = 0.2 SD increase). This approach demonstrates the importance of considering the effect of pLOFs on specific transcript isoforms when performing rare-variant association studies.
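To illustrate the difference between gene-based and transcript-specific variant sets, here is a minimal sketch on made-up data. It is not the authors' pipeline: the variant IDs, gene and transcript names, and the helpers `variant_sets` and `carrier_status` are all hypothetical, and the association test itself is omitted.

```python
from collections import defaultdict

# Hypothetical toy annotation: variant -> (gene, transcripts in which it is pLOF)
variants = {
    "v1": ("GENE1", {"TX_A", "TX_B"}),
    "v2": ("GENE1", {"TX_A"}),
    "v3": ("GENE1", {"TX_B"}),
}

# Hypothetical genotypes: person -> set of rare variants carried
carriers = {"p1": {"v1"}, "p2": {"v2"}, "p3": {"v3"}, "p4": set()}


def variant_sets(variants):
    """Group variants into one set per gene and one set per transcript."""
    gene_sets, tx_sets = defaultdict(set), defaultdict(set)
    for vid, (gene, transcripts) in variants.items():
        gene_sets[gene].add(vid)
        for tx in transcripts:
            tx_sets[tx].add(vid)
    return dict(gene_sets), dict(tx_sets)


def carrier_status(carriers, variant_set):
    """Flag each person who carries at least one variant from the set."""
    return {person: bool(carried & variant_set)
            for person, carried in carriers.items()}


gene_sets, tx_sets = variant_sets(variants)
gene_carriers = carrier_status(carriers, gene_sets["GENE1"])
tx_a_carriers = carrier_status(carriers, tx_sets["TX_A"])
```

In this toy example the gene-based set flags p1, p2, and p3 as carriers, whereas the set for `TX_A` excludes p3, whose variant does not disrupt that transcript. Pooling such non-disrupting carriers is one way a gene-level burden could dilute an isoform-specific effect.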
Citations: 0