W2V-repeated index: Prediction of enhancers and their strength based on repeated fragments

Genomics | IF 3.4 | Zone 2 (Biology) | JCR Q2 (Biotechnology & Applied Microbiology) | Pub Date: 2024-07-29 | DOI: 10.1016/j.ygeno.2024.110906
Weiming Xie, Zhaomin Yao, Yizhe Yuan, Jingwei Too, Fei Li, Hongyu Wang, Ying Zhan, Xiaodan Wu, Zhiguo Wang, Guoxu Zhang
Genomics, Volume 116, Issue 5, Article 110906. Source: https://www.sciencedirect.com/science/article/pii/S0888754324001277
Citations: 0

Abstract

Enhancers are crucial in gene expression regulation, dictating the specificity and timing of transcriptional activity, so identifying them is important for unravelling the intricacies of genetic regulation. It is therefore critical to identify enhancers and their strength. Repeated sequences in the genome are repeats of the same or symmetrical fragments, and there is a great deal of evidence that repetitive sequences contain enormous amounts of genetic information. We therefore introduce the W2V-Repeated Index, which identifies enhancer sequence fragments and evaluates their strength by analyzing repeated K-mer sequences in enhancer regions. Using the word2vec algorithm for numerical conversion and Manta Ray Foraging Optimization for feature selection, the method captures the frequency and distribution of K-mer sequences. By concentrating on repeated K-mer sequences, it reduces computational complexity and makes larger K values feasible to analyze. Experiments indicate that our method performs better than the other advanced methods on almost all indicators.
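To make the idea of "repeated K-mer sequences" concrete, here is a minimal Python sketch. It is not the authors' implementation: the function name `repeated_kmers` is hypothetical, and it assumes that a K-mer counts as "repeated" when the identical fragment occurs at least twice in the same sequence. The abstract also mentions symmetrical fragments, which this sketch does not cover, and the word2vec embedding and Manta Ray Foraging Optimization steps are left out.

```python
from collections import Counter


def repeated_kmers(seq: str, k: int) -> dict[str, list[int]]:
    """Map each k-mer that occurs at least twice to its start positions.

    Assumption (not from the paper): "repeated" means an exact duplicate
    of the same fragment within one sequence.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1  # zero or negative when the sequence is shorter than k
    counts = Counter(seq[i:i + k] for i in range(n_windows))

    positions: dict[str, list[int]] = {}
    for i in range(n_windows):
        kmer = seq[i:i + k]
        if counts[kmer] >= 2:  # keep only the repeated fragments
            positions.setdefault(kmer, []).append(i)
    return positions


if __name__ == "__main__":
    # Toy sequence, not real enhancer data.
    print(repeated_kmers("ACGTACGTTT", 3))
```

Keeping only fragments that recur shrinks the vocabulary that a downstream embedding step has to handle, which is one plausible reading of why the abstract says larger K values become easier to analyze.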

Source journal
Genomics (Biology: Biotechnology & Applied Microbiology)
CiteScore: 9.60
Self-citation rate: 2.30%
Articles published: 260
Review time: 60 days
About the journal: Genomics is a forum for describing the development of genome-scale technologies and their application to all areas of biological investigation. As a journal that has evolved with the field that carries its name, Genomics focuses on the development and application of cutting-edge methods, addressing fundamental questions with potential interest to a wide audience. Our aim is to publish the highest quality research and to provide authors with rapid, fair and accurate review and publication of manuscripts falling within our scope.
Latest articles from this journal
Identification of CCR7 as a potential biomarker in polycystic ovary syndrome through transcriptome sequencing and integrated bioinformatics.
Rapid sequencing and identification for 18-STRs long amplicon panel using portable devices and nanopore sequencer.
Retraction notice to "LncRNA HOTAIR regulates the expression of E-cadherin to affect nasopharyngeal carcinoma progression by recruiting histone methylase EZH2 to mediate H3K27 trimethylation" [Genomics Volume 113, Issue 4, July 2021, Pages 2276-2289].
"Genome-based in silico assessment of biosynthetic gene clusters in Planctomycetota: Evidences of its wide divergent nature".
Unveiling the intricate structural variability induced by repeat-mediated recombination in the complete mitochondrial genome of Cuscuta gronovii Willd.