Prophet: An Efficient Feature Indexing Mechanism for Similarity Data Sharing at Network Edge

IEEE INFOCOM 2023 - IEEE Conference on Computer Communications. Pub Date: 2023-05-17. DOI: 10.1109/INFOCOM53939.2023.10228941

Yuchen Sun, Deke Guo, Lailong Luo, Li Liu, Xinyi Li, Junjie Xie



Abstract

Edge storage systems are a promising infrastructure, and many efforts have sought to distribute and share data efficiently among edge servers. However, meeting the increasing demand for similarity retrieval across servers remains an open problem. The intrinsic reason is that existing solutions can only return an exact data match for a query, whereas more general edge applications require data similar to a query input from any server. To fill this gap, this paper pioneers a new paradigm that supports high-dimensional similarity search at the network edge. Specifically, we propose Prophet, the first known architecture for similarity data indexing. We first divide the feature space of the data into many subareas, then project both the subareas and the edge servers onto a virtual plane in which the distance between any two points reflects not only data similarity but also network latency. When an edge server submits a request to insert, delete, or query data, it computes the data feature and the corresponding virtual coordinates, then iteratively forwards the request through greedy routing based on the forwarding tables and the virtual coordinates. With Prophet, similar high-dimensional features are stored on a common server or on several nearby servers. Compared with distributed hash tables in P2P networks, Prophet requires access to a logarithmic number of servers per data request and reduces the network latency from logarithmic to constant in the number of servers. Experimental results indicate that Prophet achieves comparable retrieval accuracy and shortens query latency by 55% to 70% compared with centralized schemes.
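The greedy forwarding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the four-server topology, the coordinates, and the names `SERVERS`, `distance`, and `greedy_route` are all hypothetical, and the sketch assumes the request's target virtual coordinates have already been computed from the data feature. Each server simply hands the request to whichever neighbor lies closest to the target on the virtual plane, and forwarding stops once no neighbor is closer than the current server.

```python
import math

# Hypothetical topology: server name -> (virtual coordinates, neighbor names).
# The coordinates and links are invented purely for illustration.
SERVERS = {
    "A": ((0.0, 0.0), ["B", "C"]),
    "B": ((2.0, 0.0), ["A", "D"]),
    "C": ((0.0, 2.0), ["A", "D"]),
    "D": ((2.0, 2.0), ["B", "C"]),
}


def distance(p, q):
    """Euclidean distance between two points on the virtual plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_route(start, target):
    """Return the list of servers visited while greedily forwarding to target.

    At each hop the request moves to the neighbor closest to the target
    coordinates. It stops at the first server with no strictly closer
    neighbor, so the distance shrinks at every hop and the loop terminates.
    """
    path = [start]
    current = start
    while True:
        here = distance(SERVERS[current][0], target)
        best = min(
            SERVERS[current][1],
            key=lambda name: distance(SERVERS[name][0], target),
        )
        if distance(SERVERS[best][0], target) >= here:
            return path  # current server is a local minimum: deliver here
        current = best
        path.append(current)


# A request issued at server A for a point near server D.
print(greedy_route("A", (1.9, 2.1)))
```

In this toy layout the request travels from A to C to D, because C is closer to the target than B and D is the nearest server overall. How Prophet actually builds the forwarding tables and derives the virtual coordinates is not described in the abstract, so those parts are left out of the sketch.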


