GigaScience: Latest Articles

De novo assembly and characterization of a highly degenerated ZW sex chromosome in the fish Megaleporinus macrocephalus.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae085
Carolina Heloisa Souza-Borges, Ricardo Utsunomia, Alessandro M Varani, Marcela Uliano-Silva, Lieschen Valeria G Lira, Arno J Butzge, John F Gomez Agudelo, Shisley Manso, Milena V Freitas, Raquel B Ariede, Vito A Mastrochirico-Filho, Carolina Penaloza, Agustín Barria, Fábio Porto-Foresti, Fausto Foresti, Ricardo Hattori, Yann Guiguen, Ross D Houston, Diogo Teruo Hashimoto

Background: Megaleporinus macrocephalus (piauçu) is a Neotropical fish within Characoidei that presents a well-established heteromorphic ZZ/ZW sex determination system and thus constitutes a good model for studying W and Z chromosomes in fishes. We used PacBio reads and Hi-C to assemble a chromosome-level reference genome for M. macrocephalus. We generated family segregation information to construct a genetic map, pool sequencing of males and females to characterize its sex system, and RNA sequencing to highlight candidate genes of M. macrocephalus sex determination.

Results: The reference genome of M. macrocephalus is 1,282,030,339 bp in length and has a contig and scaffold N50 of 5.0 Mb and 45.03 Mb, respectively. In the sex chromosome, based on patterns of recombination suppression, coverage, FST, and sex-specific SNPs, we distinguished three regions: a putative W-specific region that is highly differentiated, a region where Z and W still share some similarities and that is undergoing degeneration, and the pseudoautosomal region (PAR). The sex chromosome gene repertoire includes genes from the TGF-β family (amhr2, bmp7) and the Wnt/β-catenin pathway (wnt4, wnt7a), some of which are differentially expressed.

Conclusions: The chromosome-level genome of piauçu exhibits high quality, establishing a valuable resource for advancing research within the group. Our discoveries offer insights into the evolutionary dynamics of Z and W sex chromosomes in fish, emphasizing ongoing degenerative processes and indicating complex interactions between Z and W sequences in specific genomic regions. Notably, amhr2 and bmp7 are potential candidate genes for sex determination in M. macrocephalus.
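
The contig and scaffold N50 values quoted in the Results follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and walk down the list until
    the running total reaches half of the summed length; the length at
    which that happens is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because scaffolds join contigs across gaps, the scaffold N50 of an assembly is never smaller than its contig N50, which is consistent with the 5.0 Mb and 45.03 Mb figures above.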

Citations: 0
"UDE DIATOMS in the Wild 2024": a new image dataset of freshwater diatoms for training deep learning models.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae087
Aishwarya Venkataramanan, Michael Kloster, Andrea Burfeid-Castellanos, Mimoza Dani, Ntambwe A S Mayombo, Danijela Vidakovic, Daniel Langenkämper, Mingkun Tan, Cedric Pradalier, Tim Nattkemper, Martin Laviale, Bánk Beszteri

Background: Diatoms are microalgae with finely ornamented microscopic silica shells. Their taxonomic identification by light microscopy is routinely used as part of community ecological research as well as ecological status assessment of aquatic ecosystems, and a need for digitalization of these methods has long been recognized. Alongside their high taxonomic and morphological diversity, several other factors make diatoms highly challenging for deep learning-based identification using light microscopy images. These include (i) an unusually high intraclass variability combined with small between-class differences, (ii) a rather different visual appearance of specimens depending on their orientation on the microscope slide, and (iii) the limited availability of diatom experts for accurate taxonomic annotation.

Findings: We present the largest diatom image dataset thus far, aimed at facilitating the application and benchmarking of innovative deep learning methods to the diatom identification problem on realistic research data, "UDE DIATOMS in the Wild 2024." The dataset contains 83,570 images of 611 diatom taxa, 101 of which are represented by at least 100 examples and 144 by at least 50 examples each. We showcase this dataset in 2 innovative analyses that address individual aspects of the above challenges using subclustering to deal with visually heterogeneous classes, out-of-distribution sample detection, and semi-supervised learning.

Conclusions: The problem of image-based identification of diatoms is both important for environmental research and challenging from the machine learning perspective. By making available the largest image dataset so far, accompanied by innovative analyses, this contribution will help the scientific community address these points.
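
The Findings describe the class distribution through thresholds (101 taxa with at least 100 examples, 144 with at least 50). A minimal sketch of how such a summary could be derived from a flat list of per-image labels follows. The function name and the toy labels are hypothetical; the dataset's actual file layout and label format are not described in the abstract.

```python
from collections import Counter


def taxa_meeting_threshold(labels, min_examples):
    """Return the sorted taxa that have at least `min_examples` images."""
    counts = Counter(labels)
    return sorted(taxon for taxon, n in counts.items() if n >= min_examples)


# Hypothetical per-image labels, for illustration only.
toy_labels = ["Navicula"] * 5 + ["Nitzschia"] * 3 + ["Gomphonema"] * 1

print(taxa_meeting_threshold(toy_labels, 3))
print(len(taxa_meeting_threshold(toy_labels, 5)))
```

Filtering by a minimum class size in this way is a common first step before training a classifier on a long-tailed dataset, since rare taxa may have too few images for a reliable train/test split.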

Citations: 0
Functional annotation of regulatory elements in rainbow trout uncovers roles of the epigenome in genetic selection and genome evolution.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae092
Mohamed Salem, Rafet Al-Tobasei, Ali Ali, Liqi An, Ying Wang, Xuechen Bai, Ye Bi, Huaijun Zhou

Rainbow trout (RBT) has gained widespread attention as a biological model across various fields and has been rapidly adopted for aquaculture and recreational purposes on 6 continents. Despite significant efforts to develop genome sequences for RBT, the functional genomic basis of RBT's environmental, phenotypic, and evolutionary variations still requires epigenome reference annotations. This study has produced a comprehensive catalog and epigenome annotation tracks of RBT, detecting gene regulatory elements, including chromatin histone modifications, chromatin accessibility, and DNA methylation. By integrating chromatin immunoprecipitation sequencing, ATAC sequencing, Methyl Mini-seq, and RNA sequencing data, this new regulatory element catalog has helped to characterize the epigenome dynamics and its correlation with gene expression. The study has also identified potential causal variants and transcription factors regulating complex domestication phenotypic traits. This research also provides valuable insights into the epigenome's role in gene evolution and the mechanism of duplicate gene retention 100 million years after RBT whole-genome duplication and during re-diploidization. The newly developed epigenome annotation maps are among the first in fish and are expected to enhance the accuracy and efficiency of genomic studies and applications, including genome-wide association studies, causative variation identification, and genomic selection in RBT and fish comparative genomics.

Citations: 0
ntsm: an alignment-free, ultra-low-coverage, sequencing technology agnostic, intraspecies sample comparison tool for sample swap detection.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae024
Justin Chu, Jiazhen Rong, Xiaowen Feng, Heng Li

Background: Due to human error, sample swapping remains a common issue in large cohort studies with heterogeneous data types (e.g., a mix of Oxford Nanopore Technologies, Pacific Biosciences, and Illumina data). At present, all sample swapping detection methods require alignment, positional sorting, and indexing of the data in order to compare samples, steps that are costly and, if the data are only used for genome assembly, unnecessary. As studies include more samples and new sequencing data types, robust quality control tools will become increasingly important.

Findings: The similarity between samples can be determined using indexed k-mer sequence variants. To increase statistical power, we use coverage information on variant sites, calculating similarity using a likelihood ratio-based test. The per-sample error rate and coverage bias (i.e., missing sites) can also be estimated from this information. These estimates indicate whether a spatially indexed principal component analysis (PCA)-based prescreening method can be used, which can greatly speed up analysis by avoiding exhaustive all-to-all comparisons.

Conclusions: Because this tool processes raw data, is faster than alignment, and can be used on very low-coverage data, it can save substantial computational resources in standard quality control (QC) pipelines. It is robust enough to be used on different sequencing data types, which is important in studies that leverage the strengths of different sequencing technologies. In addition to its primary use case of sample swap detection, this method also provides information useful in QC, such as error rate and coverage bias, as well as population-level PCA ancestry analysis visualization.
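
The core idea in the Findings, comparing samples through k-mers that tag known variant sites rather than through alignments, can be illustrated with a toy sketch. This is not ntsm's implementation: the abstract describes a likelihood ratio-based test on coverage, whereas the code below substitutes a naive genotype concordance score, ignores reverse complements and sequencing errors, and uses hypothetical function names, site names, and sequences throughout.

```python
def count_site_kmers(reads, sites, k):
    """Count reference- and alternate-allele k-mers per site in raw reads.

    `sites` maps a site name to a (ref_kmer, alt_kmer) pair. No alignment
    is performed: every k-mer of every read is looked up in a dictionary.
    Simplification: reverse complements are not considered.
    """
    wanted = {}
    for name, (ref_kmer, alt_kmer) in sites.items():
        wanted[ref_kmer] = (name, 0)
        wanted[alt_kmer] = (name, 1)
    counts = {name: [0, 0] for name in sites}
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = wanted.get(read[i:i + k])
            if hit is not None:
                counts[hit[0]][hit[1]] += 1
    return counts


def call_genotype(ref_count, alt_count):
    """Naive presence/absence genotype call; None when the site is unobserved."""
    if ref_count == 0 and alt_count == 0:
        return None
    if alt_count == 0:
        return "ref/ref"
    if ref_count == 0:
        return "alt/alt"
    return "ref/alt"


def concordance(counts_a, counts_b):
    """Fraction of sites observed in both samples with matching naive calls."""
    shared = agree = 0
    for name in counts_a:
        call_a = call_genotype(*counts_a[name])
        call_b = call_genotype(*counts_b[name])
        if call_a is None or call_b is None:
            continue
        shared += 1
        agree += call_a == call_b
    return agree / shared if shared else None


# Hypothetical variant k-mers and reads, for illustration only.
sites = {"s1": ("ACGTA", "ACCTA"), "s2": ("GGATC", "GGTTC")}
sample_a = count_site_kmers(["TTACGTAGG", "CCGGATCAA"], sites, k=5)
sample_b = count_site_kmers(["TTACGTAGG", "CCGGTTCAA"], sites, k=5)
print(concordance(sample_a, sample_b))
```

Hard genotype calls like these become unreliable when a site is covered by only one or two reads, which is one reason a coverage-aware likelihood ratio, as described in the abstract, is better suited to ultra-low-coverage data.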

Citations: 0
Near telomere-to-telomere genome assembly of Mongolian cattle: implications for population genetic variation and beef quality.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae099
Rina Su, Hao Zhou, Wenhao Yang, Sorgog Moqir, Xiji Ritu, Lei Liu, Ying Shi, Ai Dong, Menghe Bayier, Yibu Letu, Xin Manxi, Hasi Chulu, Narenhua Nasenochir, He Meng, Muren Herrid

Background: Mongolian cattle, a unique breed indigenous to China, represent valuable genetic resources and serve as important sources of meat and milk. However, there is a lack of high-quality genomes in cattle, which limits biological research and breeding improvement.

Findings: In this study, we conducted whole-genome sequencing on a Mongolian bull. This effort yielded a 3.1 Gb Mongolian cattle genome sequence, with a BUSCO completeness score of 95.9%. The assembly achieved both contig N50 and scaffold N50 values of 110.9 Mb, with only 3 gaps identified across the entire genome. Additionally, we successfully assembled the Y chromosome among the 31 chromosomes. Notably, 3 chromosomes were identified as having telomeres at both ends. The annotation data include 54.31% repetitive sequences and 29,794 coding genes. Furthermore, a population genetic variation analysis was conducted on 332 individuals from 56 breeds, through which we identified variant loci and potentially discovered genes associated with the formation of marbling patterns in beef, predominantly located on chromosome 12.

Conclusions: This study produced a genome with high continuity, completeness, and accuracy, marking the first assembly and annotation of a near telomere-to-telomere genome in cattle. Based on this, we generated a variant database comprising 332 individuals. The assembly of the genome and the analysis of population variants provide significant insights into cattle evolution and enhance our understanding of breeding selection.
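
The "3 gaps" figure in the Findings is the kind of statistic that can be checked directly on an assembly. The sketch below assumes the common FASTA convention that a gap is a run of one or more `N` (or `n`) characters; the abstract does not state how the authors counted gaps, and the function name and toy sequences are hypothetical.

```python
import re

# A gap is assumed to be any maximal run of N/n characters.
GAP_RUN = re.compile(r"[Nn]+")


def count_gaps(sequences):
    """Return the total number of N-runs across a dict of name -> sequence."""
    return sum(len(GAP_RUN.findall(seq)) for seq in sequences.values())


# Hypothetical toy assembly, for illustration only.
toy_assembly = {
    "chr1": "ACGTNNNNACGT",  # one gap
    "chr2": "ACGTACGT",      # gap-free
    "chr3": "ANNCnnG",       # two gaps
}
print(count_gaps(toy_assembly))
```

Each gap splits a scaffold into separate contigs, so an assembly with very few gaps has contigs that are nearly as long as its scaffolds.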

Citations: 0
Habitat suitability maps for Australian flora and fauna under CMIP6 climate scenarios.
IF 11.8 | Zone 2 (Biology) | Q1 Multidisciplinary Sciences | Pub Date: 2024-01-02 | DOI: 10.1093/gigascience/giae002
Carla L Archibald, David M Summers, Erin M Graham, Brett A Bryan

Background: Spatial information about the location and suitability of areas for native plant and animal species under different climate futures is an important input to land use and conservation planning and management. Australia, renowned for its abundant species diversity and endemism, often relies on modeled data to assess species distributions due to the country's vast size and the challenges associated with conducting on-ground surveys on such a large scale. The objective of this article is to develop habitat suitability maps for Australian flora and fauna under different climate futures.

Results: Using MaxEnt, we produced Australia-wide habitat suitability maps under RCP2.6-SSP1, RCP4.5-SSP2, RCP7.0-SSP3, and RCP8.5-SSP5 climate futures for 1,382 terrestrial vertebrates and 9,251 vascular plants at 5 km² for open access. This represents 60% of all Australian mammal species, 77% of amphibian species, 50% of reptile species, 71% of bird species, and 44% of vascular plant species. We also include tabular data, which include summaries of total quality-weighted habitat area of species under different climate scenarios and time periods.

Conclusions: The spatial data supplied can help identify important and sensitive locations for species under various climate futures. Additionally, the supplied tabular data can provide insights into the impacts of climate change on biodiversity in Australia. These habitat suitability maps can be used as input data for landscape and conservation planning or species management, particularly under different climate change scenarios in Australia.
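
The tabular summaries mentioned in the Results report a "quality-weighted habitat area" per species. The abstract does not define the term, so the sketch below rests on an assumption: each grid cell carries a suitability score between 0 and 1, and the weighted area is the sum of score times cell area over all unmasked cells. The function name, the `None` convention for masked cells, and the toy grid are all hypothetical.

```python
def quality_weighted_area(suitability_grid, cell_area_km2):
    """Sum suitability * cell area over a 2D grid of scores in [0, 1].

    Cells set to None (e.g., ocean or no-data) are skipped.
    """
    total = 0.0
    for row in suitability_grid:
        for value in row:
            if value is None:
                continue
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"suitability out of range: {value}")
            total += value * cell_area_km2
    return total


# Hypothetical 2 x 2 suitability grid with one masked cell.
toy_grid = [
    [0.0, 0.5],
    [1.0, None],
]
print(quality_weighted_area(toy_grid, cell_area_km2=5.0))
```

Computing this quantity for the same species under each climate scenario and time period gives a single comparable number per map, which is how such summaries can indicate projected habitat gain or loss.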

Citations: 0
Chromosome-level genome of the poultry shaft louse Menopon gallinae provides insight into the host-switching and adaptive evolution of parasitic lice.
IF 3.5 2区 生物学 Q1 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae004
Ye Xu, Ling Ma, Shanlin Liu, Yanxin Liang, Qiaoqiao Liu, Zhixin He, Li Tian, Yuange Duan, Wanzhi Cai, Hu Li, Fan Song

Background: Lice (Psocodea: Phthiraptera) are an important group of parasites that infect birds and mammals. It is believed that the ancestor of parasitic lice originated on an ancient avian host, and that ancient mammals acquired these parasites via host-switching from birds. Here we present the first chromosome-level genome of Menopon gallinae, a member of Amblycera (the earliest diverging lineage of parasitic lice). We explore the transition of louse host-switching from birds to mammals at the genomic level by identifying numerous idiosyncratic genomic variations.

Results: The assembled genome is 155 Mb in length, with a contig N50 of 27.42 Mb. Hi-C scaffolding assigned 97% of the bases to 5 chromosomes. The genome of M. gallinae retains a basal insect repertoire of 11,950 protein-coding genes. By comparing the genomes of lice to those of multiple representative insects in other orders, we discovered that gene families related to digestion, detoxification, and immunity are generally conserved between bird lice and mammal lice, while mammal lice have undergone a significant reduction in genes related to chemosensory systems and temperature. This suggests that mammal lice lost some of these genes while adapting to new environments and temperatures after host-switching. Furthermore, 7 genes related to hematophagy were positively selected in mammal lice, suggesting their involvement in hematophagous behavior.

Conclusions: Our high-quality genome of M. gallinae provides a valuable resource for comparative genomic research in Phthiraptera and facilitates further studies on adaptive evolution of host-switching within parasitic lice.

Citations: 0
Leveraging citizen science for monitoring urban forageable plants.
IF 3.5, Zone 2 Biology, Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae007
Filipi Miranda Soares, Luís Ferreira Pires, Maria Carolina Garcia, Yamine Bouzembrak, Lidio Coradin, Natalia Pirani Ghilardi-Lopes, Rubens Rangel Silva, Aline Martins de Carvalho, Benildes Coura Moreira Dos Santos Maculan, Sheina Koffler, Uiara Bandineli Montedo, Debora Pignatari Drucker, Raquel Santiago, Anand Gavai, Maria Clara Peres de Carvalho, Ana Carolina da Silva Lima, Hillary Dandara Elias Gabriel, Stephanie Gabriele Mendonça de França, Karoline Reis de Almeida, Bárbara Junqueira Dos Santos, Antonio Mauro Saraiva

Urbanization brings social challenges to emerging countries such as Brazil, including food scarcity, health deterioration, air pollution, and biodiversity loss. Despite this, urban areas like the city of São Paulo still have ample green spaces, which offer opportunities for nature appreciation and conservation and enhance city resilience and livability. Citizen science, a collaboration between professional and nonprofessional scientists in scientific research, may help to understand the dynamics of urban ecosystems. We believe citizen science has the potential to promote the connection between people and nature in urban areas and to provide useful data on urban biodiversity.

Citations: 0
RicePilaf: a post-GWAS/QTL dashboard to integrate pangenomic, coexpression, regulatory, epigenomic, ontology, pathway, and text-mining information to provide functional insights into rice QTLs and GWAS loci.
IF 11.8, Zone 2 Biology, Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae013
Anish M S Shrestha, Mark Edward M Gonzales, Phoebe Clare L Ong, Pierre Larmande, Hyun-Sook Lee, Ji-Ung Jeung, Ajay Kohli, Dmytro Chebotarov, Ramil P Mauleon, Jae-Sung Lee, Kenneth L McNally

Background: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources.

Results: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs.

Conclusions: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.

Citations: 0
CheRRI-Accurate classification of the biological relevance of putative RNA-RNA interaction sites.
IF 3.5, Zone 2 Biology, Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae022
Teresa Müller, Stefan Mautner, Pavankumar Videm, Florian Eggenhofer, Martin Raden, Rolf Backofen

Background: RNA-RNA interactions are key to a wide range of cellular functions. The detection of potential interactions helps to understand the underlying processes. However, potential interactions identified via in silico or experimental high-throughput methods can lack precision because of a high false-positive rate.

Results: We present CheRRI, the first tool to evaluate the biological relevance of putative RNA-RNA interaction sites. CheRRI filters candidates via a machine learning-based model trained on experimental RNA-RNA interactome data. Its unique setup combines interactome data and an established thermodynamic prediction tool to integrate experimental data with state-of-the-art computational models. Applying these data to an automated machine learning approach provides the opportunity to not only filter data for potential false positives but also tailor the underlying interaction site model to specific needs.

Conclusions: CheRRI is a stand-alone postprocessing tool to filter either predicted or experimentally identified potential RNA-RNA interactions on a genomic level to enhance the quality of interaction candidates. It is easy to install (via conda, pip packages), use (via Galaxy), and integrate into existing RNA-RNA interaction pipelines.

Citations: 0