Fundamentals for predicting transcriptional regulations from DNA sequence patterns

Journal of Human Genetics (IF 2.6; Zone 3, Biology; JCR Q2, Genetics & Heredity). Pub Date: 2024-05-10. DOI: 10.1038/s10038-024-01256-3
Masaru Koido, Kohei Tomizuka, Chikashi Terao

Abstract

Cell-type-specific regulatory elements, cataloged through extensive experiments and bioinformatics in large-scale consortia, have enabled enrichment analyses of genetic associations that rely primarily on the positional information of those elements. These analyses have identified cell types and pathways genetically associated with human complex traits. However, our understanding of the detailed allelic effects on these elements’ activities and on-off states remains incomplete, which hampers the interpretation of human genetic study results. This review introduces machine learning methods that learn sequence-dependent transcriptional regulation mechanisms from DNA sequences in order to predict such allelic effects (not associations). We provide a concise history of machine-learning-based approaches, their requirements, and the key computational processes, with a focus on machine learning primers. Convolution and self-attention, pivotal in modern deep-learning models, are explained through geometrical interpretations using dot products, which clarifies both concepts and why they have been applied to machine learning on DNA sequences. These fundamentals will inspire further research in genetics and genomics.
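
The dot-product view of convolution and self-attention mentioned in the abstract can be illustrated with a small sketch. This is not the review's own code: the one-hot encoding, the hypothetical "TATA" kernel, and the use of identity query/key/value projections (a real model would learn these) are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def conv_scan(seq, kernel):
    """Slide a (width x 4) kernel along the sequence.

    Each output value is the dot product of the flattened kernel with the
    flattened one-hot window, so it is largest where the window points in
    the same direction as the kernel.
    """
    x = one_hot(seq)
    width = len(kernel)
    flat_kernel = [w for row in kernel for w in row]
    scores = []
    for i in range(len(x) - width + 1):
        window = [v for row in x[i:i + width] for v in row]
        scores.append(dot(flat_kernel, window))
    return scores

def softmax(values):
    """Numerically stable softmax."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head self-attention with identity projections (assumption).

    Each position's output is a weighted average of all positions, with
    weights given by the softmax of scaled query-key dot products.
    """
    d = len(x[0])
    out = []
    for q in x:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in x])
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

# Hypothetical kernel that exactly matches the motif "TATA".
tata_kernel = one_hot("TATA")
scores = conv_scan("GCTATAAG", tata_kernel)
print(scores)  # the highest score marks the window that matches the motif

attended = self_attention(one_hot("ACGA"))
print(attended[0])  # position 0 (A) draws most of its weight from the two A positions
```

In this toy setting the convolution score simply counts matching bases between the kernel and each window, while self-attention lets positions with similar encodings reinforce each other regardless of their distance along the sequence.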

Source journal
Journal of Human Genetics (Biology: Genetics)
CiteScore: 7.20
Self-citation rate: 0.00%
Articles published: 101
Review time: 4-8 weeks
About the journal: The Journal of Human Genetics is an international journal publishing articles on human genetics, including medical genetics and human genome analysis. It covers all aspects of human genetics, including molecular genetics, clinical genetics, behavioral genetics, immunogenetics, pharmacogenomics, population genetics, functional genomics, epigenetics, genetic counseling and gene therapy. Articles on the following areas are especially welcome: genetic factors of monogenic and complex disorders, genome-wide association studies, genetic epidemiology, cancer genetics, personal genomics, genotype-phenotype relationships and genome diversity.