Pub Date: 2024-09-16 DOI: 10.1186/s13059-024-03386-5
Sarah M. Goggin, Eli R. Zunder
Clustering is widely used for single-cell analysis, but current methods are limited in accuracy, robustness, ease of use, and interpretability. To address these limitations, we developed an ensemble clustering method that outperforms other methods at hard clustering without the need for hyperparameter tuning. It also performs soft clustering to characterize continuum-like regions and quantify clustering uncertainty, demonstrated here by mapping the connectivity and intermediate transitions between MNIST handwritten digits and between hypothalamic tanycyte subpopulations. This hyperparameter-randomized ensemble approach improves the accuracy, robustness, ease of use, and interpretability of single-cell clustering, and may prove useful in other fields as well.
{"title":"ESCHR: a hyperparameter-randomized ensemble approach for robust clustering across diverse datasets","authors":"Sarah M. Goggin, Eli R. Zunder","doi":"10.1186/s13059-024-03386-5","DOIUrl":"https://doi.org/10.1186/s13059-024-03386-5","url":null,"abstract":"Clustering is widely used for single-cell analysis, but current methods are limited in accuracy, robustness, ease of use, and interpretability. To address these limitations, we developed an ensemble clustering method that outperforms other methods at hard clustering without the need for hyperparameter tuning. It also performs soft clustering to characterize continuum-like regions and quantify clustering uncertainty, demonstrated here by mapping the connectivity and intermediate transitions between MNIST handwritten digits and between hypothalamic tanycyte subpopulations. This hyperparameter-randomized ensemble approach improves the accuracy, robustness, ease of use, and interpretability of single-cell clustering, and may prove useful in other fields as well.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142234448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-16 DOI: 10.1186/s13059-024-03379-4
Kuan-Hao Chao, Alan Mao, Steven L. Salzberg, Mihaela Pertea
The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. We describe Splam, a novel method for predicting splice junctions in DNA using deep residual convolutional neural networks. Unlike previous models, Splam looks at a 400-base-pair window flanking each splice site, reflecting the biological splicing process that relies primarily on signals within this window. Splam also trains on donor and acceptor pairs together, mirroring how the splicing machinery recognizes both ends of each intron. Compared to SpliceAI, Splam is consistently more accurate, achieving 96% accuracy in predicting human splice junctions.
{"title":"Splam: a deep-learning-based splice site predictor that improves spliced alignments","authors":"Kuan-Hao Chao, Alan Mao, Steven L. Salzberg, Mihaela Pertea","doi":"10.1186/s13059-024-03379-4","DOIUrl":"https://doi.org/10.1186/s13059-024-03379-4","url":null,"abstract":"The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. We describe Splam, a novel method for predicting splice junctions in DNA using deep residual convolutional neural networks. Unlike previous models, Splam looks at a 400-base-pair window flanking each splice site, reflecting the biological splicing process that relies primarily on signals within this window. Splam also trains on donor and acceptor pairs together, mirroring how the splicing machinery recognizes both ends of each intron. Compared to SpliceAI, Splam is consistently more accurate, achieving 96% accuracy in predicting human splice junctions.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142234461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-16 DOI: 10.1186/s13059-024-03388-3
Yueqi Tao, Wenfei Xian, Zhigui Bao, Fernando A. Rabanal, Andrea Movilli, Christa Lanz, Gautam Shirsekar, Detlef Weigel
Telomeric repeat arrays at the ends of chromosomes are highly dynamic in composition, but their repetitive nature and technological limitations have made it difficult to assess their true variation in genome diversity surveys. We have comprehensively characterized the sequence variation immediately adjacent to the canonical telomeric repeat arrays at the very ends of chromosomes in 74 genetically diverse Arabidopsis thaliana accessions. We first describe several types of distinct telomeric repeat units and then identify evolutionary processes such as local homogenization and higher-order repeat formation that shape diversity of chromosome ends. By comparing largely isogenic samples, we also determine repeat number variation of the degenerate and variant telomeric repeat array at both the germline and somatic levels. Finally, our analysis of haplotype structure uncovers chromosome end-specific patterns in the distribution of variant telomeric repeats, and their linkage to the more proximal non-coding region. Our findings illustrate the spectrum of telomeric repeat variation at multiple levels in A. thaliana—in germline and soma, across all chromosome ends, and across genetic groups—thereby expanding our knowledge of the evolution of chromosome ends.
{"title":"Atlas of telomeric repeat diversity in Arabidopsis thaliana","authors":"Yueqi Tao, Wenfei Xian, Zhigui Bao, Fernando A. Rabanal, Andrea Movilli, Christa Lanz, Gautam Shirsekar, Detlef Weigel","doi":"10.1186/s13059-024-03388-3","DOIUrl":"https://doi.org/10.1186/s13059-024-03388-3","url":null,"abstract":"Telomeric repeat arrays at the ends of chromosomes are highly dynamic in composition, but their repetitive nature and technological limitations have made it difficult to assess their true variation in genome diversity surveys. We have comprehensively characterized the sequence variation immediately adjacent to the canonical telomeric repeat arrays at the very ends of chromosomes in 74 genetically diverse Arabidopsis thaliana accessions. We first describe several types of distinct telomeric repeat units and then identify evolutionary processes such as local homogenization and higher-order repeat formation that shape diversity of chromosome ends. By comparing largely isogenic samples, we also determine repeat number variation of the degenerate and variant telomeric repeat array at both the germline and somatic levels. Finally, our analysis of haplotype structure uncovers chromosome end-specific patterns in the distribution of variant telomeric repeats, and their linkage to the more proximal non-coding region. Our findings illustrate the spectrum of telomeric repeat variation at multiple levels in A. thaliana—in germline and soma, across all chromosome ends, and across genetic groups—thereby expanding our knowledge of the evolution of chromosome ends.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142234447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-09 DOI: 10.1186/s13059-024-03385-6
Qian Ding, Wenyi Yang, Guangfu Xue, Hongxin Liu, Yideng Cai, Jinhao Que, Xiyun Jin, Meng Luo, Fenglan Pang, Yuexin Yang, Yi Lin, Yusong Liu, Haoxiu Sun, Renjie Tan, Pingping Wang, Zhaochun Xu, Qinghua Jiang
Advances in single-cell transcriptomics provide an unprecedented opportunity to explore complex biological processes. However, computational methods for analyzing single-cell transcriptomics still have room for improvement, especially in dimension reduction, cell clustering, and cell–cell communication inference. Herein, we propose a versatile method, named DcjComm, for comprehensive analysis of single-cell transcriptomics. DcjComm detects functional modules to explore expression patterns and performs dimension reduction and clustering to discover cellular identities by the non-negative matrix factorization-based joint learning model. DcjComm then infers cell–cell communication by integrating ligand-receptor pairs, transcription factors, and target genes. DcjComm demonstrates superior performance compared to state-of-the-art methods.
{"title":"Dimension reduction, cell clustering, and cell–cell communication inference for single-cell transcriptomics with DcjComm","authors":"Qian Ding, Wenyi Yang, Guangfu Xue, Hongxin Liu, Yideng Cai, Jinhao Que, Xiyun Jin, Meng Luo, Fenglan Pang, Yuexin Yang, Yi Lin, Yusong Liu, Haoxiu Sun, Renjie Tan, Pingping Wang, Zhaochun Xu, Qinghua Jiang","doi":"10.1186/s13059-024-03385-6","DOIUrl":"https://doi.org/10.1186/s13059-024-03385-6","url":null,"abstract":"Advances in single-cell transcriptomics provide an unprecedented opportunity to explore complex biological processes. However, computational methods for analyzing single-cell transcriptomics still have room for improvement especially in dimension reduction, cell clustering, and cell–cell communication inference. Herein, we propose a versatile method, named DcjComm, for comprehensive analysis of single-cell transcriptomics. DcjComm detects functional modules to explore expression patterns and performs dimension reduction and clustering to discover cellular identities by the non-negative matrix factorization-based joint learning model. DcjComm then infers cell–cell communication by integrating ligand-receptor pairs, transcription factors, and target genes. DcjComm demonstrates superior performance compared to state-of-the-art methods.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142158747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1186/s13059-024-03381-w
Kirsten Seale, Andrew Teschendorff, Alexander P. Reiner, Sarah Voisin, Nir Eynon
During aging, the human methylome undergoes both differential and variable shifts, accompanied by increased entropy. The distinction between variably methylated positions (VMPs) and differentially methylated positions (DMPs), their contribution to epigenetic age, and the role of cell type heterogeneity remain unclear. We conduct a comprehensive analysis of > 32,000 human blood methylomes from 56 datasets (age range = 6–101 years). We find a significant proportion of the blood methylome that is differentially methylated with age (48% DMPs; FDR < 0.005) and variably methylated with age (37% VMPs; FDR < 0.005), with considerable overlap between the two groups (59% of DMPs are VMPs). Bivalent and Polycomb regions become increasingly methylated and divergent between individuals, while quiescent regions lose methylation more uniformly. Both chronological and biological clocks, but not pace-of-aging clocks, show a strong enrichment for CpGs undergoing both mean and variance changes during aging. The accumulation of DMPs shifting towards a methylation fraction of 50% drives the increase in entropy, smoothening the epigenetic landscape. However, approximately a quarter of DMPs exhibit anti-entropic effects, opposing this direction of change. While changes in cell type composition minimally affect DMPs, VMPs and entropy measurements are moderately sensitive to such alterations. This study represents the largest investigation to date of genome-wide DNA methylation changes and aging in a single tissue, providing valuable insights into primary molecular changes relevant to chronological and biological aging.
{"title":"A comprehensive map of the aging blood methylome in humans","authors":"Kirsten Seale, Andrew Teschendorff, Alexander P. Reiner, Sarah Voisin, Nir Eynon","doi":"10.1186/s13059-024-03381-w","DOIUrl":"https://doi.org/10.1186/s13059-024-03381-w","url":null,"abstract":"During aging, the human methylome undergoes both differential and variable shifts, accompanied by increased entropy. The distinction between variably methylated positions (VMPs) and differentially methylated positions (DMPs), their contribution to epigenetic age, and the role of cell type heterogeneity remain unclear. We conduct a comprehensive analysis of > 32,000 human blood methylomes from 56 datasets (age range = 6–101 years). We find a significant proportion of the blood methylome that is differentially methylated with age (48% DMPs; FDR < 0.005) and variably methylated with age (37% VMPs; FDR < 0.005), with considerable overlap between the two groups (59% of DMPs are VMPs). Bivalent and Polycomb regions become increasingly methylated and divergent between individuals, while quiescent regions lose methylation more uniformly. Both chronological and biological clocks, but not pace-of-aging clocks, show a strong enrichment for CpGs undergoing both mean and variance changes during aging. The accumulation of DMPs shifting towards a methylation fraction of 50% drives the increase in entropy, smoothening the epigenetic landscape. However, approximately a quarter of DMPs exhibit anti-entropic effects, opposing this direction of change. While changes in cell type composition minimally affect DMPs, VMPs and entropy measurements are moderately sensitive to such alterations. This study represents the largest investigation to date of genome-wide DNA methylation changes and aging in a single tissue, providing valuable insights into primary molecular changes relevant to chronological and biological aging.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
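The link between DMPs drifting towards a methylation fraction of 50% and rising entropy can be illustrated with a small sketch. It assumes the standard per-CpG binary Shannon entropy, which peaks at a fraction of 0.5; the abstract does not state the exact entropy definition used in the study, and the function names and methylation values below are hypothetical, chosen only for illustration.

```python
import math


def binary_entropy(beta: float) -> float:
    """Shannon entropy (in bits) of one CpG with methylation fraction beta.

    Assumption: the standard binary entropy, not necessarily the
    exact measure used in the study.
    """
    if beta <= 0.0 or beta >= 1.0:
        return 0.0  # fully unmethylated or fully methylated: no uncertainty
    return -(beta * math.log2(beta) + (1.0 - beta) * math.log2(1.0 - beta))


def mean_entropy(betas):
    """Average per-CpG entropy across a set of methylation fractions."""
    return sum(binary_entropy(b) for b in betas) / len(betas)


# Hypothetical methylation fractions for four CpGs (illustrative only).
young = [0.05, 0.10, 0.90, 0.95]
old_toward_half = [0.15, 0.20, 0.80, 0.85]  # shifted towards 0.5: entropic
old_away_from_half = [0.02, 0.05, 0.95, 0.98]  # shifted away: anti-entropic

print(mean_entropy(young))
print(mean_entropy(old_toward_half))
print(mean_entropy(old_away_from_half))
```

Under this definition, any site moving towards 0.5 raises the mean entropy and any site moving towards 0 or 1 lowers it, which is one way to read the distinction between entropic and anti-entropic DMPs.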
Pub Date: 2024-09-04 DOI: 10.1186/s13059-024-03387-4
Alessandro Vinceti, Rafaele M. Iannuzzi, Isabella Boyle, Lucia Trastulla, Catarina D. Campbell, Francisca Vazquez, Joshua M. Dempster, Francesco Iorio
Correction: Genome Biol 25, 192 (2024). https://doi.org/10.1186/s13059-024-03336-1
Following publication of the original article [1], the authors identified an omission in the competing interests section. The omitted text comprises the two sentences on FV in the statement below.
Competing interests: FI receives funding from Open Targets, a public-private initiative involving academia and industry, and performs consultancy for the joint CRUK-AstraZeneca Functional Genomics Centre and for Mosaic TX. JD is a consultant for and holds equity in Jumble Therapeutics. CDC performs consultancy for Droplet Biosciences and is a shareholder of Novartis. FV receives research support from the Dependency Map Consortium, Riva Therapeutics, Bristol Myers Squibb, Merck, Illumina, and Deerfield Management. FV is on the scientific advisory board of GSK, is a consultant and holds equity in Riva Therapeutics and is a co-founder and holds equity in Jumble Therapeutics. All other authors declare that they have no competing interests.
The original article [1] is corrected.
[1] Vinceti A, Iannuzzi RM, Boyle I, et al. A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data. Genome Biol. 2024;25:192. https://doi.org/10.1186/s13059-024-03336-1.
Authors and Affiliations: Computational Biology Research Centre, Human Technopole, Milan, Italy (Alessandro Vinceti, Rafaele M. Iannuzzi, Lucia Trastulla & Francesco Iorio); Broad Institute of Harvard and MIT, Cambridge, MA, USA (Isabella Boyle, Catarina D. Campbell, Francisca Vazquez & Joshua M. Dempster).
Corresponding author: Correspondence to Francesco Iorio.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article: Vinceti, A., Iannuzzi, R.M., Boyle, I. et al. Author Correction: A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data. Genome Biol 25, 239 (2024). https://doi.org/10.1186/s13059-024-03387-4
Published: 04 September 2024
{"title":"Author Correction: A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data","authors":"Alessandro Vinceti, Rafaele M. Iannuzzi, Isabella Boyle, Lucia Trastulla, Catarina D. Campbell, Francisca Vazquez, Joshua M. Dempster, Francesco Iorio","doi":"10.1186/s13059-024-03387-4","DOIUrl":"https://doi.org/10.1186/s13059-024-03387-4","url":null,"abstract":"Correction: Genome Biol 25, 192 (2024). https://doi.org/10.1186/s13059-024-03336-1. Following publication of the original article [1], the authors identified an omission in the competing interests section. The omitted text comprises the two sentences on FV in the statement below. Competing interests: FI receives funding from Open Targets, a public-private initiative involving academia and industry, and performs consultancy for the joint CRUK-AstraZeneca Functional Genomics Centre and for Mosaic TX. JD is a consultant for and holds equity in Jumble Therapeutics. CDC performs consultancy for Droplet Biosciences and is a shareholder of Novartis. FV receives research support from the Dependency Map Consortium, Riva Therapeutics, Bristol Myers Squibb, Merck, Illumina, and Deerfield Management. FV is on the scientific advisory board of GSK, is a consultant and holds equity in Riva Therapeutics and is a co-founder and holds equity in Jumble Therapeutics. All other authors declare that they have no competing interests. The original article [1] is corrected. [1] Vinceti A, Iannuzzi RM, Boyle I, et al. A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data. Genome Biol. 2024;25:192. https://doi.org/10.1186/s13059-024-03336-1. Authors and Affiliations: Computational Biology Research Centre, Human Technopole, Milan, Italy (Alessandro Vinceti, Rafaele M. Iannuzzi, Lucia Trastulla & Francesco Iorio); Broad Institute of Harvard and MIT, Cambridge, MA, USA (Isabella Boyle, Catarina D. Campbell, Francisca Vazquez & Joshua M. Dempster).","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142130839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-03 DOI: 10.1186/s13059-024-03377-6
Sean M. Flynn, Somdutta Dhir, Krzysztof Herka, Colm Doyle, Larry Melidis, Angela Simeone, Winnie W. I. Hui, Rafael de Cesaris Araujo Tavares, Stefan Schoenfelder, David Tannahill, Shankar Balasubramanian
Methods to measure chromatin contacts at genomic regions bound by histone modifications or proteins are important tools to investigate chromatin organization. However, such methods do not capture the possible involvement of other epigenomic features such as G-quadruplex DNA secondary structures (G4s). To bridge this gap, we introduce ViCAR (viewpoint HiCAR), for the direct antibody-based capture of chromatin interactions at folded G4s. Through ViCAR, we showcase the first G4-3D interaction landscape. Using histone marks, we also demonstrate how ViCAR improves on earlier approaches yielding increased signal-to-noise. ViCAR is a practical and powerful tool to explore epigenetic marks and 3D genome interactomes.
{"title":"Improved simultaneous mapping of epigenetic features and 3D chromatin structure via ViCAR","authors":"Sean M. Flynn, Somdutta Dhir, Krzysztof Herka, Colm Doyle, Larry Melidis, Angela Simeone, Winnie W. I. Hui, Rafael de Cesaris Araujo Tavares, Stefan Schoenfelder, David Tannahill, Shankar Balasubramanian","doi":"10.1186/s13059-024-03377-6","DOIUrl":"https://doi.org/10.1186/s13059-024-03377-6","url":null,"abstract":"Methods to measure chromatin contacts at genomic regions bound by histone modifications or proteins are important tools to investigate chromatin organization. However, such methods do not capture the possible involvement of other epigenomic features such as G-quadruplex DNA secondary structures (G4s). To bridge this gap, we introduce ViCAR (viewpoint HiCAR), for the direct antibody-based capture of chromatin interactions at folded G4s. Through ViCAR, we showcase the first G4-3D interaction landscape. Using histone marks, we also demonstrate how ViCAR improves on earlier approaches yielding increased signal-to-noise. ViCAR is a practical and powerful tool to explore epigenetic marks and 3D genome interactomes.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142123681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-03 DOI: 10.1186/s13059-024-03376-7
Brennan H. Baker, Sheela Sathyanarayana, Adam A. Szpiro, James W. MacDonald, Alison G. Paquette
Missing covariate data is a common problem that has not been addressed in observational studies of gene expression. Here, we present a multiple imputation method that accommodates high dimensional gene expression data by incorporating principal component analysis of the transcriptome into the multiple imputation prediction models to avoid bias. Simulation studies using three datasets show that this method outperforms complete case and single imputation analyses at uncovering true positive differentially expressed genes, limiting false discovery rates, and minimizing bias. This method is easily implemented via an R Bioconductor package, RNAseqCovarImpute, which integrates with the limma-voom pipeline for differential expression analysis.
{"title":"RNAseqCovarImpute: a multiple imputation procedure that outperforms complete case and single imputation differential expression analysis","authors":"Brennan H. Baker, Sheela Sathyanarayana, Adam A. Szpiro, James W. MacDonald, Alison G. Paquette","doi":"10.1186/s13059-024-03376-7","DOIUrl":"https://doi.org/10.1186/s13059-024-03376-7","url":null,"abstract":"Missing covariate data is a common problem that has not been addressed in observational studies of gene expression. Here, we present a multiple imputation method that accommodates high dimensional gene expression data by incorporating principal component analysis of the transcriptome into the multiple imputation prediction models to avoid bias. Simulation studies using three datasets show that this method outperforms complete case and single imputation analyses at uncovering true positive differentially expressed genes, limiting false discovery rates, and minimizing bias. This method is easily implemented via an R Bioconductor package, RNAseqCovarImpute, which integrates with the limma-voom pipeline for differential expression analysis.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142123682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
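Multiple imputation analyses fit the same model on each imputed dataset and then combine the per-dataset results into one estimate. The sketch below shows the textbook combination step, Rubin's rules, for a single coefficient such as one gene's log fold change. It is an assumption that a pooling step of this kind applies here: the abstract does not describe how RNAseqCovarImpute combines results, the sketch is in Python rather than the package's R, and `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-imputation point estimates (e.g. log fold changes)
    variances: per-imputation squared standard errors
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)       # average of the point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical results for one gene from three imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what separates this from single imputation: uncertainty about the missing covariate values widens the pooled standard error instead of being ignored.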
Pub Date: 2024-09-02 DOI: 10.1186/s13059-024-03374-9
Olivier B. Poirion, Wulin Zuo, Catrina Spruce, Candice N. Baker, Sandra L. Daigle, Ashley Olson, Daniel A. Skelly, Elissa J. Chesler, Christopher L. Baker, Brian S. White
Enhlink is a computational tool for scATAC-seq data analysis, facilitating precise interrogation of enhancer function at the single-cell level. It employs an ensemble approach incorporating technical and biological covariates to infer condition-specific regulatory DNA linkages. Enhlink can integrate multi-omic data for enhanced specificity, when available. Evaluation with simulated and real data, including multi-omic datasets from the mouse striatum and novel promoter capture Hi-C data, demonstrates that Enhlink outperforms alternative methods. Coupled with eQTL analysis, it identified a putative super-enhancer in striatal neurons. Overall, Enhlink offers accuracy, power, and potential for revealing novel biological insights in gene regulation.
{"title":"Enhlink infers distal and context-specific enhancer–promoter linkages","authors":"Olivier B. Poirion, Wulin Zuo, Catrina Spruce, Candice N. Baker, Sandra L. Daigle, Ashley Olson, Daniel A. Skelly, Elissa J. Chesler, Christopher L. Baker, Brian S. White","doi":"10.1186/s13059-024-03374-9","DOIUrl":"https://doi.org/10.1186/s13059-024-03374-9","url":null,"abstract":"Enhlink is a computational tool for scATAC-seq data analysis, facilitating precise interrogation of enhancer function at the single-cell level. It employs an ensemble approach incorporating technical and biological covariates to infer condition-specific regulatory DNA linkages. Enhlink can integrate multi-omic data for enhanced specificity, when available. Evaluation with simulated and real data, including multi-omic datasets from the mouse striatum and novel promoter capture Hi-C data, demonstrates that Enhlink outperforms alternative methods. Coupled with eQTL analysis, it identified a putative super-enhancer in striatal neurons. Overall, Enhlink offers accuracy, power, and potential for revealing novel biological insights in gene regulation.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":null,"pages":null},"PeriodicalIF":12.3,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142118216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}