Multidimensional scaling improves distance-based clustering for microbiome data.

Guanhua Chen, Xinyue Wang, Qiang Sun, Zheng-Zheng Tang
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-02-04
DOI: 10.1093/bioinformatics/btaf042 (https://doi.org/10.1093/bioinformatics/btaf042)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11814494/pdf/

Abstract

Motivation: Clustering patients into subgroups based on their microbial compositions can greatly enhance our understanding of the role of microbes in human health and disease etiology. Distance-based clustering methods, such as partitioning around medoids (PAM), are popular due to their computational efficiency and absence of distributional assumptions. However, the performance of these methods can be suboptimal when true cluster memberships are driven by differences in the abundance of only a few microbes, a situation known as the sparse signal scenario.

Results: We demonstrate that classical multidimensional scaling (MDS), a widely used dimensionality reduction technique, effectively denoises microbiome data and enhances the clustering performance of distance-based methods. We propose a two-step procedure that first applies MDS to project high-dimensional microbiome data into a low-dimensional space, followed by distance-based clustering using the low-dimensional data. Our extensive simulations demonstrate that our procedure offers superior performance compared to directly conducting distance-based clustering under the sparse signal scenario. The advantage of our procedure is further showcased in several real data applications.
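
The two-step procedure described above can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the API of the MDSMClust R package. The function names, the choice of Bray-Curtis dissimilarity, the default of two MDS dimensions, and the farthest-first initialization are all assumptions made for this example. The clustering step is a simple alternating k-medoids routine standing in for PAM; it does not perform PAM's swap search.

```python
import numpy as np


def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a nonnegative matrix.

    Assumption for this sketch: any ecological distance could be used instead.
    """
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return num / den


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed a distance matrix in n_components dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[top], 0.0, None)    # guard against negative eigenvalues
    return eigvecs[:, top] * np.sqrt(lam)


def k_medoids(D, k, n_iter=100):
    """Alternating k-medoids on a distance matrix (a stand-in for PAM).

    Initialization is farthest-first, a hypothetical choice for this sketch.
    """
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids


def mds_then_cluster(X, k, n_components=2):
    """Step 1: MDS on the sample distance matrix. Step 2: cluster the low-dimensional points."""
    D = bray_curtis(X)
    Z = classical_mds(D, n_components)
    D_low = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
    return k_medoids(D_low, k)
```

Calling `mds_then_cluster(X, k=2)` on a samples-by-taxa abundance matrix `X` returns a label per sample and the indices of the medoid samples. The only difference from direct distance-based clustering is that the medoid search runs on Euclidean distances between the low-dimensional MDS coordinates rather than on the original distance matrix.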

Availability and implementation: The R package MDSMClust is available at https://github.com/wxy929/MDS-project.

Contact: gchen25@wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.