Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor

Journal of the ACM · Pub Date: 2024-01-02 · DOI: 10.1145/3639453 · IF 2.3 · Zone 2 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture)
Vincent Cohen-Addad, Debarati Das, Evangelos Kipouridis, Nikos Parotsidis, Mikkel Thorup

Abstract

We consider the numerical taxonomy problem of fitting a positive distance function \(\mathcal{D}: \binom{S}{2} \rightarrow \mathbb{R}_{>0}\) by a tree metric. We want a tree T with positive edge weights, including S among its vertices, so that the distances in T between points of S match those in \(\mathcal{D}\). A nice application is in evolutionary biology, where the tree T aims to approximate the branching process leading to the observed distances in \(\mathcal{D}\) [Cavalli-Sforza and Edwards 1967]. We consider the total error, that is, the sum of distance errors over all pairs of points. We present a deterministic polynomial-time algorithm minimizing the total error within a constant factor. We can do this both for general trees and for the special case of ultrametrics, where a root has the same distance to all vertices in S.

The problems are APX-hard, so a constant factor is the best we can hope for in polynomial time. The best previous approximation factor was O((log n)(log log n)), by Ailon and Charikar [2005], who wrote: “Determining whether an O(1) approximation can be obtained is a fascinating question.”
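
To make the objective concrete, the following is a minimal Python sketch that evaluates the total error of one candidate tree against a given distance function. It reads "distance error" as the absolute difference, so the quantity computed is \(\sum_{\{u,v\} \in \binom{S}{2}} |d_T(u,v) - \mathcal{D}(u,v)|\); this reading is an assumption. The sketch only scores a tree that is already given. It is not the approximation algorithm of the paper. All names (`build_adjacency`, `distances_from`, `total_error`, `is_root_equidistant`) and the example data are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v, weight) triples."""
    adj = defaultdict(list)
    for u, v, w in edges:
        if w <= 0:
            raise ValueError("edge weights must be positive")
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def distances_from(adj, source):
    """Path lengths from source to every vertex of the tree (iterative DFS)."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:  # in a tree, each vertex is reached exactly once
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


def total_error(adj, D):
    """Sum over all pairs in D of |tree distance - target distance|."""
    cache = {}
    err = 0.0
    for (u, v), target in D.items():
        if u not in cache:
            cache[u] = distances_from(adj, u)
        err += abs(cache[u][v] - target)
    return err


def is_root_equidistant(adj, root, S):
    """Check the ultrametric special case: root is equally far from all of S."""
    dist = distances_from(adj, root)
    return len({dist[s] for s in S}) == 1


# Hypothetical example: S = {a, b, c}; x is an extra (non-S) vertex of the tree.
star = build_adjacency([("a", "x", 1.0), ("b", "x", 2.0), ("c", "x", 3.0)])
D = {("a", "b"): 3.0, ("a", "c"): 4.0, ("b", "c"): 6.0}
print(total_error(star, D))  # 1.0: only the pair (b, c) is off, by 1

# Hypothetical rooted tree in which the root r is at distance 2 from a, b and c.
rooted = build_adjacency(
    [("r", "y", 1.0), ("y", "a", 1.0), ("y", "b", 1.0), ("r", "c", 2.0)]
)
print(is_root_equidistant(rooted, "r", ["a", "b", "c"]))  # True
```

In the first example the tree contains a vertex x outside S, which the problem statement allows, since S only needs to be among the vertices of T.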
