
NAR Genomics and Bioinformatics: Latest Articles

A systematic analysis of contemporary whole exome sequencing capture kits to optimise high-coverage capture of CCDS regions.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-09-01 DOI: 10.1093/nargab/lqaf115
Fernando Vázquez López, James J Ashton, Guo Cheng, Sarah Ennis

Whole exome sequencing (WES) is a well-established tool for clinical diagnostics, is more cost-effective and faster to analyse than whole genome sequencing and has been implemented to uplift diagnostic rates in human disease. However, challenges remain to achieve comprehensive and uniform coverage of targets, and high sensitivity and specificity. Differences in genomic target regions and exome capture mechanism between kits may lead to differences in overall coverage uniformity and capture efficiency. Here, we analyse the efficiency of a range of off-the-shelf exome sequencing (ES) kits in capturing their reported targets and the consensus coding sequence (CCDS) regions. Our results show Twist Custom Exome, Twist Human Comprehensive Exome, and Roche KAPA HyperExome V1 perform particularly well at capturing their target regions at 10X and 20X coverage and achieve the highest capture efficiency of CCDS regions upon read downsampling. This was the case despite both Twist kits targeting less than 37Mb in the genome. Our analysis highlights the impact of kit target design on capture efficiency in WES, with kit target size and uniformity of coverage impacting the capture efficiency of CCDS regions. This benchmark will help researchers to make an informed decision based on their needs.
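As a rough illustration of what "capturing target regions at 10X and 20X coverage" measures, the sketch below computes the fraction of target bases whose read depth meets a threshold. The function name `fraction_covered` and the depth values are hypothetical and are not taken from the paper; real per-base depths would come from an alignment tool.

```python
def fraction_covered(depths, threshold):
    """Return the fraction of target bases with read depth >= threshold."""
    if not depths:
        raise ValueError("depths must contain at least one base")
    covered = sum(1 for d in depths if d >= threshold)
    return covered / len(depths)


# Hypothetical per-base depths over a tiny target region (illustrative values only).
example_depths = [0, 5, 12, 18, 20, 25, 31, 9, 10, 40]

cov_10x = fraction_covered(example_depths, 10)  # share of bases at >= 10X
cov_20x = fraction_covered(example_depths, 20)  # share of bases at >= 20X
print(f"10X: {cov_10x:.2f}, 20X: {cov_20x:.2f}")
```

Comparing these fractions across kits at the same downsampled read count is one simple way to think about capture efficiency, although the study's own metrics may be defined differently.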

Citations: 0
Assembly-free typing of Nanopore and Illumina data through proximity scoring with KMA.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-09-01 DOI: 10.1093/nargab/lqaf116
Philip T L C Clausen, Malte B Hallgren, Søren Overballe-Petersen, Vanessa R Marcelino, Henrik Hasman, Frank M Aarestrup

Advances in Oxford Nanopore Technologies (ONT) with the introduction of the r10.4.1 flow cell have reduced the sequencing error rates to <1%. When a reference sequence is known, this allows for accurate variant calling comparable with what is known from the second-generation short-read sequencing technologies, such as Illumina. Additionally, the longer sequence reads provided by ONT enable more efficient mappings, which means the amount of multimapping reads is reduced. However, when the correct reference is not known in advance, and the target reference is highly similar to other references, the multimapping problem is still a concern. Although the ConClave algorithm has provided an accurate solution to the multimapping problem of the second-generation short-read sequencing technologies, it is less effective when resolving the multimapping problems arising from third-generation long-read sequencing technologies. To overcome this problem, we are introducing proximity scoring of alleles, which aids the ConClave algorithm to accurately assign specific alleles from databases containing loci with a high degree of redundancy. Using multilocus sequence typing as a test case, we show that this approach matches the results obtained from sequencing data of Illumina while using limited computational resources that essentially correspond to that of today's smartphones.

Citations: 0
A pan-cancer, pan-treatment model for predicting drug responses from patient-derived xenografts.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-28 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf111
Shruti Gupta, Vikash K Mohani, Ghita Ghislat, Pedro J Ballester, Shandar Ahmad

The translatability of patient-derived xenograft (PDX)-generated clinical data into patient-specific outcomes for therapeutic guidance is limited by the challenges in generalizability of models across patients, treatments, and cancer types. Previously, machine learning (ML) models have been developed for the two most abundant cancer types, i.e. breast cancer and colorectal cancer, but these are unusable in other cancer types because each treatment/cancer type requires a different model to be trained. Here, we provide an ML framework to train a single pan-cancer, pan-treatment model for predicting treatment outcomes. We show that such models give promising results for all cancer types considered and reproduce the accuracy levels of individually trained cancer types. In the proposed model, all PDX genomic profiles from all cancer types are used as the training data, and instead of partitioning them into cancer types for each model, the cancer type and treatment name are appended as the input features of the training model. Using genomic-only and treatment-only embeddings and combining them with principal component analysis-based dimensionality reduction, our models show promising results and provide a framework for further improvements and real-time use for best treatment selections for cancer patients.
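The idea of appending cancer type and treatment name to the genomic profile, rather than training one model per cancer type, can be sketched as follows. This is a minimal illustration that assumes one-hot encoding; the abstract mentions embeddings and PCA-based dimensionality reduction, which are not reproduced here, and all names and values below are hypothetical.

```python
def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot list over a fixed vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == v else 0.0 for v in vocabulary]


def build_features(genomic_profile, cancer_type, treatment, cancer_types, treatments):
    """Concatenate genomic features with encoded cancer type and treatment."""
    return (
        list(genomic_profile)
        + one_hot(cancer_type, cancer_types)
        + one_hot(treatment, treatments)
    )


# Hypothetical vocabularies and a toy two-feature genomic profile.
cancer_types = ["breast", "colorectal", "lung"]
treatments = ["drug_A", "drug_B"]

features = build_features([0.2, 1.5], "colorectal", "drug_B", cancer_types, treatments)
print(features)
```

With this layout, every PDX sample from every cancer type shares one feature space, so a single model can be fitted to all of them.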

Citations: 0
Consistent asymmetry in DNA damage artefacts across target regions in exome sequencing data.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-27 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf120
Tyler D Medina, Declan Bennett, Cathal Seoighe

Oxidative damage can introduce G>T mutations upon DNA replication. When this damage occurs ex vivo, sequenced DNA exhibits strand asymmetry, whereby sequence alignment yields G>T mismatches without corresponding C>A mismatches on the complementary strand at a given locus. Strand asymmetry is used to identify potential sequencing artefacts in somatic variant calls in cancer sequencing projects. Consistent with previous studies, we found that the strandedness of this asymmetry is frequently shared across targeted capture regions. However, while some exome sequencing datasets displayed consistent asymmetry relative to the forward reference strand, some surprisingly showed asymmetry relative to the transcription strand. Though oxidation is the principal cause of artefactual G>T mutations, we propose that the asymmetry stems from the use of single-stranded exome capture probes, as we did not find consistent asymmetry in matched whole genome sequencing. We further propose that high levels of asymmetry can be indicative of oxidation artefacts in the reported somatic variant calls of some samples. While most analysed cohorts showed low to moderate asymmetry, in one cohort (testicular germ cell tumour), approximately half of the reported G>T somatic mutations were likely to be oxidative damage artefacts, as indicated by the extent of asymmetry in mismatches and variants.
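To make the notion of strand asymmetry concrete, the sketch below computes a simple signed index from counts of G>T and C>A mismatches in a region. The index is an assumption chosen for illustration and is not necessarily the statistic used in the paper; the counts are invented.

```python
def strand_asymmetry(gt_count, ca_count):
    """Signed asymmetry in [-1, 1].

    +1 means only G>T mismatches were seen, -1 means only C>A,
    and 0 means the two complementary mismatch types are balanced.
    """
    if gt_count < 0 or ca_count < 0:
        raise ValueError("counts must be non-negative")
    total = gt_count + ca_count
    if total == 0:
        return 0.0  # no evidence either way
    return (gt_count - ca_count) / total


# Hypothetical mismatch counts for one capture region.
index = strand_asymmetry(90, 10)
print(index)
```

A genuine mutation should appear as G>T on one strand and C>A on the other in roughly equal measure, so an index far from zero hints at damage introduced after the DNA left the cell.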

Citations: 0
EASYstrata: an all-in-one workflow for genome annotation and genomic divergence analysis.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-27 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf110
Quentin Rougemont, Elise Lucotte, Loreleï Boyer, Alexandra Jalaber de Dinechin, Alodie Snirc, Tatiana Giraud, Ricardo C Rodríguez de la Vega

New reference genomes and transcriptomes are increasingly available across the tree of life, opening new avenues to tackle exciting questions. However, there are still challenges associated with annotating genomes and inferring evolutionary processes and with a lack of methodological standardisation. Here, we propose a new workflow designed for evolutionary analyses to overcome these challenges, facilitating the detection of recombination suppression and its consequences in terms of rearrangements and transposable element accumulation. To do so, we assemble multiple bioinformatic steps in a single easy-to-use workflow. We combine state-of-the-art tools to detect transposable elements, annotate genomes, infer gene orthology relationships, compute divergence between sequences, infer evolutionary strata (i.e. footprints of stepwise extension of recombination suppression) and their structural rearrangements, and visualise the results. This workflow, called EASYstrata, was applied to reannotate 42 published genomes from Microbotryum fungi. We show in further case examples from a plant and an animal that we recover the same strata as previously described. While this tool was developed with the goal to infer divergence between sex or mating-type chromosomes, it can be applied to any pair of haplotypes whose pattern of divergence is of interest. This workflow will facilitate the study of non-model species for which newly sequenced phased diploid genomes are becoming available.

Citations: 0
Integrative multi-omic analysis reveals a PAX8-driven gene network linking tumor stemness to therapy response in ovarian cancer.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-27 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf113
José M Santos-Pereira, Amancio Carnero, Sandra Muñoz-Galván

The transcription factor PAX8 is expressed in most ovarian tumors, being associated with increased tumorigenesis. Although recent studies have addressed the gene regulatory functions of PAX8 in ovarian cancer, an integrative analysis of multi-omic and patient data is required to identify the core regulatory network of PAX8 and its prognostic and therapeutic value. Here, we integrate PAX8 chromatin binding and accessibility data in ovarian cancer cells with transcriptomic and patient data to gain insight into the core gene regulatory network orchestrated by PAX8 in ovarian tumors. Integration of differential chromatin accessibility, transcription factor binding, and gene expression upon PAX8 knockout provides a core regulatory network that explains most of the genes regulated by PAX8. We combine these target genes with patient expression data and find a PAX8 gene signature associated with tumor stemness, a property related to therapy resistance. Indeed, we show that the PAX8 gene signature predicts disease outcome and response to therapy in ovarian cancer patients. Finally, we experimentally validated the results of our bioinformatic analyses, supporting their robustness. Our findings uncover a PAX8 core network that represents a promising strategy for targeted antitumor therapies and open new pathways to fight against ovarian cancer resistance.

Citations: 0
Creating a consensus genome assembly of Myxococcus xanthus DZ2 by resolving discrepancies between two complete genomes of the same strain and uncovering the Mx-alpha prophage region diversity across phylum Myxococcota.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-27 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf112
Utkarsha Mahanta, Gaurav Sharma

Myxococcus xanthus DZ2, a model myxobacterium, has three reported genome assemblies, including two recent complete assemblies (MxDZ2_Tam and MxDZ2_Nan) from the same culture stock. These assemblies misreported their circular nature and differed by 6.4 kb, raising questions about their accuracy. After removing duplicate ends, aligning genomes to the origin of replication, and circularization, this computational analysis revealed a minimal 32 bp difference, with MxDZ2_Tam being slightly larger. Forty sequence variations, including 38 indels and two substitutions, affected 18 coding genes via frameshift mutations. Although PacBio-HiFi technology boasts a low error rate, it remains higher than the 454-platform used for the earlier MxDZ2_Kirby draft assembly. Therefore, using MxDZ2_Kirby as a reference, we constructed a "truly circular" genome for M. xanthus DZ2. Additionally, analysis of Mx-alpha regions, involved in antagonism via the toxin gene sitA, across 61 myxobacterial genomes identified their presence in five taxonomically polyphyletic species, potentially influencing their physiology, development, and ecological interactions beyond predation. Only M. xanthus DZ2 and DZF1 contained all three Mx-alpha regions, whereas M. xanthus DK1622 has only one. Overall, this study underscores the need for meticulous validation of sequencing-based genome assemblies and their variations and provides novel insights into Mx-alpha regions as potential adaptive elements in myxobacteria.
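The two preprocessing steps named in the abstract, removing duplicate ends and aligning a circular genome to its origin of replication, can be pictured with a toy string example. This is only a sketch under simplifying assumptions (exact-match overlap, a known origin index); the function names are hypothetical and real assemblies need alignment-based tools and far longer minimum overlaps.

```python
def trim_duplicate_end(seq, min_overlap):
    """Drop a terminal copy of the sequence start, if the end repeats the beginning.

    Searches for the longest exact overlap of at least min_overlap bases
    between the start and the end of a linearised circular sequence.
    """
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:-k]
    return seq


def rotate_to_origin(seq, origin_index):
    """Rotate a circular sequence so that it starts at origin_index."""
    if not 0 <= origin_index < len(seq):
        raise ValueError("origin_index out of range")
    return seq[origin_index:] + seq[:origin_index]


# Toy contig whose last two bases repeat its first two (illustrative only).
contig = "ACGTTGCAAC"
trimmed = trim_duplicate_end(contig, min_overlap=2)
rotated = rotate_to_origin(trimmed, origin_index=3)
print(trimmed, rotated)
```

Once two assemblies of the same strain are trimmed and rotated to a common start, a length or base-level comparison between them becomes meaningful.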

Citations: 0
ST-deconv: an accurate deconvolution approach for spatial transcriptome data utilizing self-encoding and contrastive learning.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-27 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf109
Shurui Dai, Jiawei Li, Zhiliang Xia, Jingfeng Ou, Yan Guo, Limin Jiang, Jijun Tang

Single-cell RNA sequencing (scRNA-seq) has significantly deepened our understanding of cellular heterogeneity and cell type interactions, providing insights into how cell populations adapt to environmental variability. However, its lack of spatial context limits intercellular analysis. Similarly, existing spatial transcriptomics (ST) data often lack single-cell resolution, restricting cellular mapping. To address these limitations, we introduce ST-deconv, a deep learning-based deconvolution model that integrates spatial information. ST-deconv leverages contrastive learning to enhance the spatial representation of adjacent spots, improving spatial relationship inference. It also employs domain-adversarial networks to improve generalization and deconvolution across diverse datasets. Moreover, ST-deconv can generate large-scale, high-resolution spatial transcriptomic data with cell type labels from single-cell input, facilitating the learning of spatial cell type composition. In benchmarking experiments, ST-deconv outperforms traditional methods, reducing the root mean square error (RMSE) by 13% to 60%, with an RMSE as low as 0.03 for high spatial correlation datasets and 0.07 for low spatial correlation datasets across different transcriptomic contexts. When reconstructing real tissue structure, ST-deconv achieved a purity of 0.68 on mouse olfactory bulb (MOB) and a cell type correlation of 0.76 on human pancreatic ductal adenocarcinoma (PDAC). These advancements make ST-deconv a powerful tool for enhancing spatial transcriptomics and downstream analyses of intercellular interactions.
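The benchmark above is reported in terms of RMSE between predicted and true cell type proportions. The abstract does not say how the error is aggregated, so the following is only a minimal sketch that assumes the error is pooled over every spot and every cell type; the function names `rmse` and `relative_reduction` are hypothetical and are not part of ST-deconv.

```python
import math


def rmse(predicted, actual):
    """Root mean square error pooled over all spots and cell types.

    Both arguments are lists of rows (one row per spot), each row holding
    one proportion per cell type. Pooling over all entries is an assumption;
    the source does not state the aggregation used.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same number of spots")
    squared_error = 0.0
    n_entries = 0
    for pred_row, true_row in zip(predicted, actual):
        if len(pred_row) != len(true_row):
            raise ValueError("each spot must have the same number of cell types")
        for p, t in zip(pred_row, true_row):
            squared_error += (p - t) ** 2
            n_entries += 1
    if n_entries == 0:
        raise ValueError("no entries to compare")
    return math.sqrt(squared_error / n_entries)


def relative_reduction(baseline_rmse, new_rmse):
    """Fractional RMSE reduction of a new method relative to a baseline."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Toy data: two spots, three cell types (illustrative values only).
truth = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]
prediction = [[0.45, 0.35, 0.2], [0.15, 0.55, 0.3]]
toy_rmse = rmse(prediction, truth)
```

Under this reading, a reported reduction of 13% to 60% would correspond to `relative_reduction` values between 0.13 and 0.60 against the baseline method's RMSE.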

Citations: 0
PRISM: a Python package for interactive and integrated analysis of multiplexed tissue microarrays.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-21 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf114
Rafael Tubelleza, Aaron Kilgallon, Chin Wee Tan, James Monkman, John F Fraser, Arutha Kulasinghe

Tissue microarrays (TMAs) enable researchers to analyse hundreds of tissue samples simultaneously by embedding multiple samples into single arrays, which conserves valuable tissue samples and experimental reagents. Moreover, profiling TMAs allows efficient screening of tissue samples for translational and clinical applications. Multiplexed imaging technologies allow for spatial profiling of proteins at single-cell resolution, providing insights into tumour microenvironments and disease mechanisms. High-plex spatial single-cell protein profiling is a powerful tool for biomarker discovery and translational cancer research; however, there remain limited options for end-to-end computational analysis of this type of data. Here, we introduce PRISM, a Python package for interactive, end-to-end analyses of TMAs with a focus on translational and clinical research using multiplexed proteomic data. PRISM leverages the SpatialData framework to standardize data storage and ensure interoperability with single-cell and spatial analysis tools. It consists of two main components: TMA Image Analysis for marker-based tissue masking, TMA dearraying, cell segmentation, and single-cell feature extraction; and AnnData Analysis for quality control, clustering, iterative cell-type annotation, and spatial analysis. Integrated as a plugin within napari, PRISM provides an intuitive and purely interactive graphical interface for real-time and human-in-the-loop analyses. PRISM supports efficient multi-resolution image processing and accelerates bioinformatics workflows using efficient scalable data structures, parallelization and GPU acceleration. By combining modular flexibility, computational efficiency, and a completely interactive interface, PRISM simplifies the translation of raw multiplexed images to actionable clinical insights, empowering researchers to explore and interact effectively with spatial omics data.

Citations: 0
Explicit Scale Simulation for analysis of RNA-sequencing count data with ALDEx2.
IF 2.8 Q1 GENETICS & HEREDITY Pub Date : 2025-08-19 eCollection Date: 2025-09-01 DOI: 10.1093/nargab/lqaf108
Gregory B Gloor, Michelle Pistner Nixon, Justin D Silverman

In high-throughput sequencing (HTS) studies, sample-to-sample variation in sequencing depth is driven by technical factors, and not by variation in the scale (size) of the biological system. Typically, a statistical normalization removes unwanted technical variation in the data or the parameters of the model to enable differential abundance analyses. We recently showed that all normalizations make implicit assumptions about the unmeasured system scale and that errors in these assumptions can dramatically increase false positive and false negative rates. We demonstrated that these errors can be mitigated by accounting for uncertainty using a scale model, which we integrated into the ALDEx2 R package. This article provides new insights focusing on the application to transcriptomic analysis. We provide transcriptomic case studies demonstrating how scale models, rather than traditional normalizations, can reduce false positive and false negative rates in practice while enhancing the transparency and reproducibility of analyses. These scale models replace the need for dual cutoff approaches often used to address the disconnect between practical and statistical significance. We demonstrate the utility of scale models built on known housekeeping genes in complex metatranscriptomic datasets. Thus, this work provides guidance on how to incorporate scale into transcriptomic datasets.
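The idea of replacing a fixed normalization with a scale model can be illustrated with a small simulation. The sketch below is not the ALDEx2 implementation and does not use its API; it assumes, for illustration only, that proportions are drawn from a Dirichlet distribution (counts plus a 0.5 prior), that a centred log-ratio (CLR) normalization corresponds to fixing the log2 scale at the negative mean log2 proportion, and that scale uncertainty is added as Gaussian noise with a hypothetical standard deviation `gamma`. All function names here are hypothetical.

```python
import math
import random


def dirichlet_sample(counts, prior=0.5, rng=random):
    """Draw one set of proportions from Dirichlet(counts + prior)."""
    draws = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]


def scaled_log_abundance(counts, gamma, rng=random):
    """One Monte Carlo draw of log2 abundances under an assumed scale model.

    With gamma = 0 the result equals the CLR transform of the sampled
    proportions; with gamma > 0 the assumed scale carries uncertainty.
    """
    proportions = dirichlet_sample(counts, rng=rng)
    log_props = [math.log2(p) for p in proportions]
    # Scale implied by the CLR normalization (assumption stated above).
    clr_log_scale = -sum(log_props) / len(log_props)
    # Scale model: add Gaussian uncertainty around the CLR-implied scale.
    log_scale = clr_log_scale + rng.gauss(0.0, gamma)
    return [lp + log_scale for lp in log_props]


# Toy counts for one sample with four genes (illustrative values only).
toy_counts = [10, 200, 35, 0]
fixed_scale = scaled_log_abundance(toy_counts, gamma=0.0, rng=random.Random(1))
uncertain_scale = scaled_log_abundance(toy_counts, gamma=0.5, rng=random.Random(1))
```

In this sketch, differences between genes within a sample are unaffected by the scale draw; only the overall level of the sample shifts, which is what makes between-sample comparisons sensitive to the scale assumption.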

Citations: 0