Genomic language models: opportunities and challenges.

Trends in Genetics (IF 13.6; CAS Zone 2, Biology; JCR Q1, Genetics & Heredity) Pub Date: 2025-01-02 DOI: 10.1016/j.tig.2024.11.013
Gonzalo Benegas, Chengzhong Ye, Carlos Albors, Jianan Canal Li, Yun S Song
Citation count: 0

Abstract

Large language models (LLMs) are having transformative impacts across a wide range of scientific fields, particularly in the biomedical sciences. Just as the goal of natural language processing is to understand sequences of words, a major objective in biology is to understand biological sequences. Genomic language models (gLMs), which are LLMs trained on DNA sequences, have the potential to significantly advance our understanding of genomes and how DNA elements at various scales interact to give rise to complex functions. To showcase this potential, we highlight key applications of gLMs, including functional constraint prediction, sequence design, and transfer learning. Despite notable recent progress, however, developing effective and efficient gLMs presents numerous challenges, especially for species with large, complex genomes. Here, we discuss major considerations for developing and evaluating gLMs.

Source journal: Trends in Genetics (Biology: Genetics)
CiteScore: 20.90
Self-citation rate: 0.90%
Publication volume: 160
Review time: 6-12 weeks
Journal description: Launched in 1985, Trends in Genetics swiftly established itself as a "must-read" for geneticists, offering concise, accessible articles covering a spectrum of topics from developmental biology to evolution. This reputation endures, making TiG a cherished resource in the genetic research community. While evolving with the field, the journal now embraces new areas like genomics, epigenetics, and computational genetics, alongside its continued coverage of traditional subjects such as transcriptional regulation, population genetics, and chromosome biology. Despite expanding its scope, the core objective of TiG remains steadfast: to furnish researchers and students with high-quality, innovative reviews, commentaries, and discussions, fostering an appreciation for advances in genetic research. Each issue of TiG presents lively and up-to-date Reviews and Opinions, alongside shorter articles like Science & Society and Spotlight pieces. Invited from leading researchers, Reviews objectively chronicle recent developments, Opinions provide a forum for debate and hypothesis, and shorter articles explore the intersection of genetics with science and policy, as well as emerging ideas in the field. All articles undergo rigorous peer review.