Briefings in Functional Genomics: Latest Articles

Environmental community transcriptomics: strategies and struggles.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae033
Jeanet Mante, Kyra E Groover, Randi M Pullen

Transcriptomics is the study of RNA transcripts, the portion of the genome that is transcribed, in a specific cell, tissue, or organism. It provides insight into gene expression patterns, regulation, and the underlying mechanisms of cellular processes. Community transcriptomics takes this a step further by studying the RNA transcripts from environmental assemblages of organisms, with the aim of better understanding the interactions between members of the community. It requires successful extraction of RNA from a diverse set of organisms, followed by analysis that either maps the sequencing reads to a reference genome or assembles them de novo. Both the extraction protocols and the analysis steps can pose hurdles for community transcriptomics. This review covers advances in transcriptomic techniques and assesses the viability of applying them to community transcriptomics.

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735753/pdf/
Citations: 0
pyRforest: a comprehensive R package for genomic data analysis featuring scikit-learn Random Forests in R.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae038
Tyler Kolisnik, Faeze Keshavarz-Rahaghi, Rachel V Purcell, Adam N H Smith, Olin K Silander

Random Forest models are widely used in genomic data analysis and can offer insights into complex biological mechanisms, particularly when features influence the target in interactive, nonlinear, or nonadditive ways. Currently, some of the most efficient Random Forest methods in terms of computational speed are implemented in Python. However, many biologists use R for genomic data analysis, as R offers a unified platform for performing additional statistical analysis and visualization. Here, we present an R package, pyRforest, which integrates Python scikit-learn "RandomForestClassifier" algorithms into the R environment. pyRforest inherits the efficient memory management and parallelization of Python, and is optimized for classification tasks on large genomic datasets, such as those from RNA-seq. pyRforest offers several additional capabilities, including a novel rank-based permutation method for biomarker identification. This method can be used to estimate and visualize P-values for individual features, allowing the researcher to identify a subset of features for which there is robust statistical evidence of an effect. In addition, pyRforest includes methods for the calculation and visualization of SHapley Additive exPlanations values. Finally, pyRforest includes support for comprehensive downstream analysis for gene ontology and pathway enrichment. pyRforest thus improves the implementation and interpretability of Random Forest models for genomic data analysis by merging the strengths of Python with R. pyRforest can be downloaded at: https://www.github.com/tkolisnik/pyRforest with an associated vignette at https://github.com/tkolisnik/pyRforest/blob/main/vignettes/pyRforest-vignette.pdf.
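
To make the idea of permutation-based P-values for individual features concrete, the following is a minimal Python sketch using scikit-learn's `RandomForestClassifier` directly. It is a generic label-permutation test on feature importances, not pyRforest's rank-based method, whose details are not given in the abstract. The function name `permutation_pvalues`, the simulated data, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def permutation_pvalues(X, y, n_permutations=30, n_estimators=50, seed=0):
    """Empirical P-values for Random Forest feature importances.

    The null distribution is built by refitting the forest on shuffled labels.
    This is a generic sketch, not the pyRforest implementation.
    """
    rng = np.random.default_rng(seed)
    forest = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    observed = forest.fit(X, y).feature_importances_

    exceed = np.zeros(X.shape[1])
    for i in range(n_permutations):
        y_shuffled = rng.permutation(y)
        null_forest = RandomForestClassifier(
            n_estimators=n_estimators, random_state=seed + 1 + i
        )
        null_importance = null_forest.fit(X, y_shuffled).feature_importances_
        # Count how often a null importance reaches the observed one.
        exceed += null_importance >= observed

    # The add-one correction keeps every P-value strictly positive.
    pvalues = (exceed + 1) / (n_permutations + 1)
    return observed, pvalues


# Simulated stand-in for an expression matrix: with shuffle=False the three
# informative features are the first three columns.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
observed, pvalues = permutation_pvalues(X, y)
print(np.round(pvalues[:5], 3))
```

With only 30 permutations the smallest attainable P-value is 1/31, so a real analysis would use many more permutations and correct for multiple testing.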

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735746/pdf/
Citations: 0
STLBRF: an improved random forest algorithm based on standardized-threshold for feature screening of gene expression data.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae048
Huini Feng, Ying Ju, Xiaofeng Yin, Wenshi Qiu, Xu Zhang

When the traditional random forest (RF) algorithm is used to select features in biostatistical data, a large amount of noisy data and parameters can distort the importance assigned to the selected features, making feature selection difficult to control. Preserving the accuracy of the results in the presence of noisy data is therefore a challenge for the traditional RF algorithm, while directly removing the noisy data can introduce significant bias. In this study, we develop a new algorithm, the standardized-threshold and loops-based random forest (STLBRF), and apply it to feature gene selection in gene expression data. Building on the traditional RF algorithm, it combines backward elimination with K-fold cross-validation to construct a cyclic system and sets a standardized threshold, the error increment. The algorithm overcomes the shortcomings of existing gene selection methods. We compare ridge regression, lasso regression, elastic net regression, the traditional RF algorithm, and our improved RF algorithm in a quantitative analysis on three real gene expression datasets. To ensure the reliability of the results, we validate the genes selected by each method using a Random Forest classifier. The results indicate that, compared with the other methods, STLBRF both selects feature genes more effectively and gives better control over the number of selected genes. Our method offers reliable technical support for feature expression analysis and research on biomarker selection.
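
The combination of backward elimination, K-fold cross-validation, and an error-increment threshold can be sketched roughly as follows. This is not the STLBRF implementation: the abstract does not specify how the threshold is standardized, so the names `backward_eliminate`, `drop_fraction`, and `max_error_increase`, along with their values and the simulated data, are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def cv_error(X, y, features, k=5, seed=0):
    """Mean K-fold misclassification rate of a forest restricted to `features`."""
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    return 1.0 - cross_val_score(forest, X[:, features], y, cv=k).mean()


def backward_eliminate(X, y, drop_fraction=0.2, max_error_increase=0.02, k=5, seed=0):
    """Drop the least important features in loops until the CV error rises too much."""
    features = list(range(X.shape[1]))
    best_error = cv_error(X, y, features, k, seed)
    while len(features) > 1:
        forest = RandomForestClassifier(n_estimators=50, random_state=seed)
        forest.fit(X[:, features], y)
        order = np.argsort(forest.feature_importances_)  # ascending importance
        n_drop = max(1, int(drop_fraction * len(features)))
        # Keep everything except the n_drop least important features.
        candidate = [features[i] for i in sorted(order[n_drop:])]
        error = cv_error(X, y, candidate, k, seed)
        if error - best_error > max_error_increase:
            break  # the error increment exceeds the (assumed) threshold
        features = candidate
        best_error = min(best_error, error)
    return features, best_error


# Simulated data: with shuffle=False the informative features are columns 0 to 2.
X, y = make_classification(
    n_samples=200, n_features=15, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
selected, error = backward_eliminate(X, y)
print(selected, round(error, 3))
```

Measuring the increment against the best error seen so far, rather than the previous loop's error, stops a sequence of small degradations from accumulating unnoticed; that choice is part of this sketch, not a claim about STLBRF.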

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735748/pdf/
Citations: 0
Application of computational algorithms for single-cell RNA-seq and ATAC-seq in neurodegenerative diseases.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae044
Hwisoo Choi, Hyeonkyu Kim, Hoebin Chung, Dong-Sung Lee, Junil Kim

Recent advancements in single-cell technologies, including single-cell RNA sequencing (scRNA-seq) and the single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq), have greatly improved our insight into the epigenomic landscapes across various biological contexts and diseases. This paper reviews key computational tools and machine learning approaches that integrate scRNA-seq and scATAC-seq data to facilitate the alignment of transcriptomic data with chromatin accessibility profiles. Applying these integrated single-cell technologies to neurodegenerative diseases, such as Alzheimer's disease and Parkinson's disease, reveals how changes in chromatin accessibility and gene expression can illuminate pathogenic mechanisms and identify potential therapeutic targets. Despite challenges such as data sparsity and computational demands, ongoing enhancements in scATAC-seq and scRNA-seq technologies, along with better analytical methods, continue to expand their applications. These advancements promise to revolutionize our approach to medical research and clinical diagnostics, offering a comprehensive view of cellular function and disease pathology.

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735751/pdf/
Citations: 0
m6A RNA modification pathway: orchestrating fibrotic mechanisms across multiple organs.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae051
Xiangfei Huang, Zilu Yu, Juan Tian, Tao Chen, Aiping Wei, Chao Mei, Shibiao Chen, Yong Li

Organ fibrosis, a common consequence of chronic tissue injury, presents a significant health challenge. Recent research has revealed the regulatory role of N6-methyladenosine (m6A) RNA modification in fibrosis of various organs, including the lung, liver, kidney, and heart. In this comprehensive review, we summarize the latest findings on the mechanisms and functions of m6A modification in organ fibrosis. By highlighting the potential of m6A modification as a therapeutic target, our goal is to encourage further research in this emerging field and support advancements in the clinical treatment of organ fibrosis.

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735750/pdf/
Citations: 0
Less is more: relative rank is more informative than absolute abundance for compositional NGS data.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae045
Xubin Zheng, Nana Jin, Qiong Wu, Ning Zhang, Haonan Wu, Yuanhao Wang, Rui Luo, Tao Liu, Wanfu Ding, Qingshan Geng, Lixin Cheng

High-throughput gene expression data have been extensively generated and used in investigations of biological mechanisms, biomarker detection, and disease diagnosis and prognosis. These applications encompass not only bulk transcriptome data but also single-cell RNA-seq data. However, extracting reliable biological information from transcriptome data remains challenging because of the constraints of Compositional Data Analysis. Current data preprocessing methods, including dataset normalization and batch effect correction, are insufficient to address these issues and improve data quality for downstream analysis. Alternatively, qualitative methods that focus on the relative order of gene expression (ROGER) are more informative than quantitative methods that rely on gene expression abundance. The Pairwise Analysis of Gene expression method is an enhancement of ROGER, designed for data integration in either sample space or feature space. In this review, we summarize the methods applied to transcriptome data analysis and discuss their potential for predicting clinical outcomes.
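
One reason relative order suits compositional data is that within-sample gene rankings survive any rescaling of the sample, whereas absolute values do not. The short Python sketch below illustrates that property only; it is not the ROGER or Pairwise Analysis of Gene expression implementation, and the gene names, counts, and the helper `pairwise_order` are hypothetical.

```python
from itertools import combinations


def pairwise_order(expression):
    """Map each gene pair (a, b) to True when gene a is expressed above gene b."""
    genes = sorted(expression)
    return {(a, b): expression[a] > expression[b] for a, b in combinations(genes, 2)}


# Hypothetical read counts for one sample.
counts = {"geneA": 120, "geneB": 30, "geneC": 450, "geneD": 5}

# The same sample sequenced four times deeper, then converted to proportions.
deeper = {gene: 4 * value for gene, value in counts.items()}
total = sum(deeper.values())
proportions = {gene: value / total for gene, value in deeper.items()}

# Absolute values differ by orders of magnitude, yet every pairwise
# comparison between genes gives the same answer.
same_order = pairwise_order(counts) == pairwise_order(proportions)
print(same_order)  # prints True
```

Because each pair reduces to a binary comparison, samples processed on different platforms or at different depths can be compared feature by feature without first agreeing on a common scale.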

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735744/pdf/
Citations: 0
Using artificial intelligence and statistics for managing peritoneal metastases from gastrointestinal cancers.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae049
Adam Wojtulewski, Aleksandra Sikora, Sean Dineen, Mustafa Raoof, Aleksandra Karolak

Objective: The primary objective of this study is to investigate various applications of artificial intelligence (AI) and statistical methodologies for analyzing and managing peritoneal metastases (PM) caused by gastrointestinal cancers.

Methods: PubMed and Google Scholar were searched comprehensively using relevant keywords and search criteria to identify articles and reviews related to the topic. The AI approaches considered were conventional machine learning (ML) and deep learning (DL) models, and the relevant statistical approaches included biostatistics and logistic models.

Results: The systematic literature review yielded nearly 30 articles meeting the predefined criteria. Analyses of these studies showed that AI methodologies consistently outperformed traditional statistical approaches. In the AI approaches, DL consistently produced the most precise results, while classical ML demonstrated varied performance but maintained high predictive accuracy. The sample size was the recurring factor that increased the accuracy of the predictions for models of the same type.

Conclusions: AI and statistical approaches can detect PM developing among patients with gastrointestinal cancers. Therefore, if clinicians integrated these approaches into diagnostics and prognostics, they could better analyze and manage PM, enhancing clinical decision-making and patients' outcomes. Collaboration across multiple institutions would also help in standardizing methods for data collection and allowing consistent results.

Open-access PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735730/pdf/
Citations: 0
Beyond the hype: using AI, big data, wearable devices, and the internet of things for high-throughput livestock phenotyping.
IF 2.5 | Zone 3 (Biology) | Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY | Pub Date: 2025-01-15 | DOI: 10.1093/bfgp/elae032
Tomas Klingström, Emelie Zonabend König, Avhashoni Agnes Zwane

Phenotyping of animals is a routine task in agriculture that can provide large datasets for the functional annotation of genomes. Using the livestock farming sector to study complex traits enables genetics researchers to benefit fully from the digital transformation of society, as economies of scale substantially reduce the cost of phenotyping animals on farms. In the agricultural sector, genomics has transitioned towards a model of 'Genomics without the genes', as a large proportion of the genetic variation in animals can be modelled using the infinitesimal model for genomic breeding valuations. Combined with third-generation sequencing, which is creating pan-genomes for livestock, the digital infrastructure for trait collection and precision farming provides a unique opportunity for high-throughput phenotyping and the study of complex traits in a controlled environment. The emphasis on cost-efficient data collection means that mobile phones and computers have become ubiquitous for large-scale data collection, but the majority of the recorded traits can still be recorded manually with limited training or tools. This is especially valuable in low- and middle-income countries and in settings where indigenous breeds are kept on farms that preserve more traditional farming methods. Digitalization is therefore an important enabler of high-throughput phenotyping for smaller livestock herds with limited technology investments as well as for large-scale commercial operations. It is challenging for individual researchers, with or without a specialization in livestock, to keep up with the opportunities created by the rapid advances in the digitalization of livestock farming and with how they can be used. This review provides an overview of the current status of key enabling technologies for precision livestock farming that are applicable to the functional annotation of genomes.

Citations: 0
Systematic benchmark of single-cell hashtag demultiplexing approaches reveals robust performance of a clustering-based method.
IF 2.5 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2025-01-15 DOI: 10.1093/bfgp/elae039
Mohammed Sayed, Yue Julia Wang, Hee-Woong Lim

Single-cell technology opened up a new avenue to delineate cellular status at a single-cell resolution and has become an essential tool for studying human diseases. Multiplexing allows cost-effective experiments by combining multiple samples and effectively mitigates batch effects. It starts by giving each sample a unique tag and then pooling them together for library preparation and sequencing. After sequencing, sample demultiplexing is performed based on tag detection, where cells belonging to one sample are expected to have a higher amount of the corresponding tag than cells from other samples. However, in reality, demultiplexing is not straightforward due to the noise and contamination from various sources. Successful demultiplexing depends on the efficient removal of such contamination. Here, we perform a systematic benchmark combining different normalization methods and demultiplexing approaches using real-world data and simulated datasets. We show that accounting for sequencing depth variability increases the separability between tagged and untagged cells, and the clustering-based approach outperforms existing tools. The clustering-based workflow is available as an R package from https://github.com/hwlim/hashDemux.
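
The tag-based assignment described in the abstract above can be illustrated with a minimal Python sketch. It is not the hashDemux implementation, and none of the function names below come from that package; they are hypothetical and chosen only for illustration. The sketch assumes a small matrix of raw hashtag counts (one row per cell, one column per tag), divides each cell's counts by its own total to account for sequencing depth, and then splits each tag's values into a "low" and a "high" group with a one-dimensional two-cluster k-means. A cell that is high for exactly one tag is assigned to that sample; a cell that is high for several tags is flagged as a doublet.

```python
import math


def normalize_by_depth(counts):
    """Divide each cell's tag counts by the cell total, then log-transform.

    The scale factor of 1000 is an arbitrary choice for this sketch.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        normalized.append(
            [math.log1p(1000.0 * c / total) if total else 0.0 for c in cell]
        )
    return normalized


def two_means_threshold(values, iterations=50):
    """1-D k-means with k=2; returns the midpoint between the two centres."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        mid = (low + high) / 2.0
        below = [v for v in values if v <= mid]
        above = [v for v in values if v > mid]
        if not below or not above:
            break  # all values fell into one group; keep current centres
        new_low = sum(below) / len(below)
        new_high = sum(above) / len(above)
        if new_low == low and new_high == high:
            break  # centres stopped moving
        low, high = new_low, new_high
    return (low + high) / 2.0


def demultiplex(counts, tag_names):
    """Assign each cell to a tag, or label it 'doublet' or 'negative'."""
    norm = normalize_by_depth(counts)
    thresholds = [
        two_means_threshold([cell[j] for cell in norm])
        for j in range(len(tag_names))
    ]
    calls = []
    for cell in norm:
        positive = [tag_names[j] for j, v in enumerate(cell) if v > thresholds[j]]
        if len(positive) == 1:
            calls.append(positive[0])
        elif positive:
            calls.append("doublet")
        else:
            calls.append("negative")
    return calls


# Toy data: two tags (A, B) and five cells; the second cell is a deeper
# sequenced copy of the first, and the last cell carries both tags equally.
toy_counts = [[950, 50], [1900, 100], [40, 960], [30, 470], [500, 500]]
print(demultiplex(toy_counts, ["A", "B"]))  # → ['A', 'A', 'B', 'B', 'doublet']
```

Because the counts are divided by each cell's total, the first two toy cells receive identical normalized values despite a twofold difference in depth. One limitation of this simplified normalization is that a cell with only a few background counts spread evenly across tags can look like a doublet, so a real workflow would need additional handling of untagged cells.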

Citations: 0
Systematic analysis of the transcriptional landscape of melanoma reveals drug-target expression plasticity.
IF 2.5 Zone 3 (Biology) Q3 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2025-01-15 DOI: 10.1093/bfgp/elad055
Brad Balderson, Mitchell Fane, Tracey J Harvey, Michael Piper, Aaron Smith, Mikael Bodén

Metastatic melanoma originates from melanocytes of the skin. Melanoma metastasis results in poor treatment prognosis for patients and is associated with epigenetic and transcriptional changes that reflect the developmental program of melanocyte differentiation from neural crest stem cells. Several studies have explored melanoma transcriptional heterogeneity using microarray, bulk and single-cell RNA-sequencing technologies to derive data-driven models of the transcriptional-state change which occurs during melanoma progression. No study has systematically examined how different models of melanoma progression derived from different data types, technologies and biological conditions compare. Here, we perform a cross-sectional study to identify averaging effects of bulk-based studies that mask and distort apparent melanoma transcriptional heterogeneity; we describe new transcriptionally distinct melanoma cell states, identify differential co-expression of genes between studies and examine the effects of predicted drug susceptibilities of different cell states between studies. Importantly, we observe considerable variability in drug-target gene expression between studies, indicating potential transcriptional plasticity of melanoma to down-regulate these drug targets and thereby circumvent treatment. Overall, observed differences in gene co-expression and predicted drug susceptibility between studies suggest bulk-based transcriptional measurements do not reliably gauge heterogeneity and that melanoma transcriptional plasticity is greater than described when studies are considered in isolation.

Citations: 0