A. Bremler-Barr, Yotam Harchol, David Hay, Y. Hel-Or
{"title":"使用三元内容可寻址存储器的超快速相似性搜索","authors":"A. Bremler-Barr, Yotam Harchol, David Hay, Y. Hel-Or","doi":"10.1145/2771937.2771938","DOIUrl":null,"url":null,"abstract":"Similarity search, and specifically the nearest-neighbor search (NN) problem is widely used in many fields of computer science such as machine learning, computer vision and databases. However, in many settings such searches are known to suffer from the notorious curse of dimensionality, where running time grows exponentially with d. This causes severe performance degradation when working in high-dimensional spaces. Approximate techniques such as locality-sensitive hashing [2] improve the performance of the search, but are still computationally intensive. In this paper we propose a new way to solve this problem using a special hardware device called ternary content addressable memory (TCAM). TCAM is an associative memory, which is a special type of computer memory that is widely used in switches and routers for very high speed search applications. We show that the TCAM computational model can be leveraged and adjusted to solve NN search problems in a single TCAM lookup cycle, and with linear space. This concept does not suffer from the curse of dimensionality and is shown to improve the best known approaches for NN by more than four orders of magnitude. Simulation results demonstrate dramatic improvement over the best known approaches for NN, and suggest that TCAM devices may play a critical role in future large-scale databases and cloud applications.","PeriodicalId":267524,"journal":{"name":"Proceedings of the 11th International Workshop on Data Management on New Hardware","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Ultra-Fast Similarity Search Using Ternary Content Addressable Memory\",\"authors\":\"A. Bremler-Barr, Yotam Harchol, David Hay, Y. Hel-Or\",\"doi\":\"10.1145/2771937.2771938\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Similarity search, and specifically the nearest-neighbor search (NN) problem is widely used in many fields of computer science such as machine learning, computer vision and databases. However, in many settings such searches are known to suffer from the notorious curse of dimensionality, where running time grows exponentially with d. This causes severe performance degradation when working in high-dimensional spaces. Approximate techniques such as locality-sensitive hashing [2] improve the performance of the search, but are still computationally intensive. In this paper we propose a new way to solve this problem using a special hardware device called ternary content addressable memory (TCAM). TCAM is an associative memory, which is a special type of computer memory that is widely used in switches and routers for very high speed search applications. We show that the TCAM computational model can be leveraged and adjusted to solve NN search problems in a single TCAM lookup cycle, and with linear space. This concept does not suffer from the curse of dimensionality and is shown to improve the best known approaches for NN by more than four orders of magnitude. Simulation results demonstrate dramatic improvement over the best known approaches for NN, and suggest that TCAM devices may play a critical role in future large-scale databases and cloud applications.\",\"PeriodicalId\":267524,\"journal\":{\"name\":\"Proceedings of the 11th International Workshop on Data Management on New Hardware\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 11th International Workshop on Data Management on New Hardware\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2771937.2771938\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 11th International Workshop on Data Management on New Hardware","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2771937.2771938","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Ultra-Fast Similarity Search Using Ternary Content Addressable Memory
Similarity search, and specifically the nearest-neighbor search (NN) problem is widely used in many fields of computer science such as machine learning, computer vision and databases. However, in many settings such searches are known to suffer from the notorious curse of dimensionality, where running time grows exponentially with d. This causes severe performance degradation when working in high-dimensional spaces. Approximate techniques such as locality-sensitive hashing [2] improve the performance of the search, but are still computationally intensive. In this paper we propose a new way to solve this problem using a special hardware device called ternary content addressable memory (TCAM). TCAM is an associative memory, which is a special type of computer memory that is widely used in switches and routers for very high speed search applications. We show that the TCAM computational model can be leveraged and adjusted to solve NN search problems in a single TCAM lookup cycle, and with linear space. This concept does not suffer from the curse of dimensionality and is shown to improve the best known approaches for NN by more than four orders of magnitude. Simulation results demonstrate dramatic improvement over the best known approaches for NN, and suggest that TCAM devices may play a critical role in future large-scale databases and cloud applications.