Proceedings of the 59th ACM/IEEE Design Automation Conference: Latest Papers

SCAIE-V: an open-source SCAlable interface for ISA extensions for RISC-V processors
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530432
M. Damian, J. Oppermann, Christoph Spang, A. Koch
Custom instructions extending a base ISA are often used to increase performance. However, only a few cores provide open interfaces for integrating such ISA Extensions (ISAX). In addition, the degree to which a core's capabilities are exposed for extension varies wildly between interfaces. Thus, even when using open-source cores, the lack of standardized ISAX interfaces typically causes high engineering effort when implementing or porting ISAXes. We present SCAIE-V, a highly portable and feature-rich ISAX interface that supports custom control flow, decoupled execution, multi-cycle instructions, and memory transactions. The cost of the interface itself scales with the complexity of the ISAXes actually used.
Citations: 2
ReSMA
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530559
Huize Li, Hai Jin, Long Zheng, Yu Huang, Xiaofei Liao, Zhuohui Duan, Dan Chen, Chuangyi Gui
Approximate string matching (ASM) is the basic operation kernel of a large number of string processing applications. Existing Von-Neumann-based ASM accelerators generate huge volumes of intermediate data as string datasets keep growing, leading to massive off-chip data transmissions. This paper presents a novel ASM processing-in-memory (PIM) accelerator, namely ReSMA, based on ReCAM- and ReRAM-arrays to eliminate the off-chip data transmissions in ASM. We develop a novel ReCAM-friendly filter-and-filtering algorithm to perform q-gram filtering in ReCAM memory. We also design a new data mapping strategy and a new verification algorithm, which enable computing the edit distances entirely in ReRAM crossbars to save energy. Experimental results show that ReSMA outperforms the CPU-, GPU-, FPGA-, ASIC-, and PIM-based solutions by 268.7×, 38.6×, 20.9×, 707.8×, and 14.7× in terms of performance, and 153.8×, 42.2×, 31.6×, 18.3×, and 5.3× in terms of energy-saving, respectively.
Citations: 1
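The ReSMA abstract above describes a filter-then-verify flow: q-gram filtering followed by edit-distance verification. As a rough software illustration of that general pattern only, the sketch below uses the classical q-gram count filter and a textbook Levenshtein DP. It is not the paper's ReCAM or ReRAM algorithm; the function names and the default `q = 2` are assumptions made for this example.

```python
from collections import Counter


def qgrams(s: str, q: int) -> Counter:
    """Multiset of all length-q substrings of s (empty if s is shorter than q)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def passes_qgram_filter(a: str, b: str, k: int, q: int = 2) -> bool:
    """Classical q-gram count filter (hypothetical helper, not from the paper).

    Two strings within edit distance k share at least
    max(|a|, |b|) - q + 1 - k*q q-grams, so pairs below that bound
    can be discarded without computing the edit distance.
    """
    needed = max(len(a), len(b)) - q + 1 - k * q
    shared = sum((qgrams(a, q) & qgrams(b, q)).values())
    return shared >= needed


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def approx_match(a: str, b: str, k: int, q: int = 2) -> bool:
    """Run the cheap filter first; verify only the surviving pairs."""
    return passes_qgram_filter(a, b, k, q) and edit_distance(a, b) <= k
```

For instance, `approx_match("kitten", "sitting", 3)` accepts the pair, while `"abcdef"` versus `"uvwxyz"` with `k = 1` is rejected by the filter alone, so the DP never runs for it.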
GaBAN
Pub Date : 2022-07-10 DOI: 10.1163/_eifo_sim_2411
Jiajie Chen, Le Yang, Youhui Zhang
Citations: 0
Energy efficient data search design and optimization based on a compact ferroelectric FET content addressable memory
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530527
Jiahao Cai, M. Imani, K. Ni, Grace Li Zhang, Bing Li, Ulf Schlichtmann, Cheng Zhuo, Xunzhao Yin
Content Addressable Memory (CAM) is widely used for associative search tasks in advanced machine learning models and data-intensive applications due to its highly parallel pattern matching capability. Most state-of-the-art CAM designs focus on reducing the CAM cell area by exploiting nonvolatile memories (NVMs). Little research exists on optimizing the design and energy efficiency of NVM-based CAMs for practical deployment in edge devices and AI hardware. In this paper, we propose a general, compact, and energy-efficient CAM design scheme that alleviates the design overhead by employing just one NVM device in the cell. We also propose an adaptive matchline (ML) precharge and discharge scheme that further optimizes the search energy by fully reducing the ML voltage swing. We consider ferroelectric field effect transistors (FeFETs) as the representative NVM, and present a 2T-1FeFET CAM array including a sense amplifier implementing the proposed ML scheme. Evaluation results suggest that our proposed 2T-1FeFET CAM design achieves 6.64×/4.74×/9.14×/3.02× better energy efficiency compared with CMOS/ReRAM/STT-MRAM/2FeFET CAM arrays. Benchmarking results show that our approach provides 3.3×/2.1× energy-delay product improvement over the 2T-2R/2FeFET CAM in accelerating query processing applications.
Citations: 3
Adaptive window-based sensor attack detection for cyber-physical systems
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530555
Lin Zhang, Zifan Wang, Mengyu Liu, Fanxin Kong
Sensor attacks alter sensor readings and spoof Cyber-Physical Systems (CPS) to perform dangerous actions. Existing detection work tends to minimize the detection delay and false alarms at the same time, although there is a clear trade-off between the two metrics. Instead, we argue that attack detection should dynamically balance the two metrics when a physical system is in different states. Following this argument, we propose an adaptive sensor attack detection system that consists of three components: an adaptive detector, a detection deadline estimator, and a data logger. It can adapt the detection delay, and thus false alarms, at run time to meet a varying detection deadline and improve usability (that is, reduce false alarms). Finally, we implement our detection system and validate it using multiple CPS simulators and a reduced-scale autonomous vehicle testbed.
Citations: 4
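The trade-off named in the adaptive window-based detection abstract above (a shorter window reacts faster but raises more false alarms) can be illustrated with a toy detector. This is a minimal sketch under stated assumptions, not the authors' detector: the mean-absolute-residual test, the rule "window = deadline clamped to `[1, max_window]`", and all names and parameters are hypothetical.

```python
def choose_window(deadline_steps: int, max_window: int) -> int:
    """Pick the detection window from the current deadline (assumed rule).

    A tight deadline forces a short window (fast but noisy); a loose
    deadline allows a longer window, capped at max_window.
    """
    return max(1, min(deadline_steps, max_window))


def attack_detected(residuals, deadline_steps, threshold, max_window=20):
    """Flag an attack when the mean absolute residual over the window is high.

    residuals: differences between predicted and observed sensor values,
               oldest first (hypothetical input format).
    """
    window = choose_window(deadline_steps, max_window)
    recent = residuals[-window:]
    if not recent:
        return False  # nothing logged yet, so nothing to flag
    mean_abs = sum(abs(r) for r in recent) / len(recent)
    return mean_abs > threshold
```

With ten small residuals followed by two large ones, a deadline of 2 steps raises an alarm immediately, whereas a deadline of 12 steps averages the spike away and stays silent; that is the delay versus false-alarm balance the abstract refers to.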
EcoFusion
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530489
A. Malawade, Trier Mortlock, M. A. Faruque
Autonomous vehicles use multiple sensors, large deep-learning models, and powerful hardware platforms to perceive the environment and navigate safely. In many contexts, some sensing modalities negatively impact perception while increasing energy consumption. We propose EcoFusion: an energy-aware sensor fusion approach that uses context to adapt the fusion method and reduce energy consumption without affecting perception performance. EcoFusion performs up to 9.5% better at object detection than existing fusion methods with approximately 60% less energy and 58% lower latency on the industry-standard Nvidia Drive PX2 hardware platform. We also propose several context-identification strategies, implement a joint optimization between energy and performance, and present scenario-specific results.
Citations: 11
Accelerating nonlinear DC circuit simulation with reinforcement learning
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530512
Zhou Jin, Haojie Pei, Yichao Dong, Xiang Jin, Xiao Wu, Weipeng Xing, Dan Niu
DC analysis is the foundation for nonlinear electronic circuit simulation. Pseudo transient analysis (PTA) methods have gained great success among various continuation algorithms. However, PTA tends to be computationally intensive without careful tuning of parameters and proper stepping strategies. In this paper, we harness the latest advances in machine learning to resolve these challenges simultaneously. In particular, active learning is leveraged to provide a fine initial solver environment, in which TD3-based Reinforcement Learning (RL) is implemented to accelerate the simulation on the fly. The RL agent is strengthened with dual agents, priority sampling, and cooperative learning to enhance its robustness and convergence. The proposed algorithms are implemented in an out-of-the-box SPICE-like simulator, which demonstrates a significant speedup: up to 3.1X for the initial stage and 234X for the RL stage.
Citations: 2
ODHD
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530395
Ruixuan Wang, Xun Jiao, X. S. Hu
Outlier detection is a classical and important technique that has been used in different application domains such as medical diagnosis and Internet-of-Things. Recently, machine learning-based outlier detection algorithms, such as one-class support vector machine (OCSVM), isolation forest and autoencoder, have demonstrated promising results in outlier detection. In this paper, we take a radical departure from these classical learning methods and propose ODHD, an outlier detection method based on hyperdimensional computing (HDC). In ODHD, the outlier detection process is based on a P-U learning structure, in which we train a one-class HV based on inlier samples. This HV represents the abstraction information of all inlier samples; hence, any (testing) sample whose corresponding HV is dissimilar from this HV will be considered as an outlier. We perform an extensive evaluation using six datasets across different application domains and compare ODHD with multiple baseline methods including OCSVM, isolation forest, and autoencoder using three metrics including accuracy, F1 score and ROC-AUC. Experimental results show that ODHD outperforms all the baseline methods on every dataset for every metric. Moreover, we perform a design space exploration for ODHD to illustrate the tradeoff between performance and efficiency. The promising results presented in this paper provide a viable option and alternative to traditional learning algorithms for outlier detection.
Citations: 6
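The ODHD abstract above outlines a one-class scheme: build a single hypervector (HV) from inlier samples, then flag any test sample whose HV is dissimilar to it. The sketch below shows that idea in generic hyperdimensional-computing terms. It should be read as an assumption-laden illustration rather than the paper's method: the random-projection encoder, the dimensionality `DIM`, the cosine similarity measure, and the `threshold` value are all choices made for this example.

```python
import math
import random

DIM = 2000  # hypervector dimensionality (assumed for this sketch)


def make_bases(n_features, dim=DIM, seed=0):
    """One random bipolar (+1/-1) base hypervector per input feature."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(n_features)]


def encode(x, bases):
    """Encode a feature vector as a bipolar HV via a sign random projection."""
    dim = len(bases[0])
    return [1 if sum(xi * b[d] for xi, b in zip(x, bases)) >= 0 else -1
            for d in range(dim)]


def train_one_class(inliers, bases):
    """Bundle (element-wise sum) the HVs of all inlier samples."""
    acc = [0] * len(bases[0])
    for x in inliers:
        for d, v in enumerate(encode(x, bases)):
            acc[d] += v
    return acc


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def is_outlier(x, class_hv, bases, threshold=0.5):
    """A sample is an outlier if its HV is dissimilar to the one-class HV."""
    return cosine(encode(x, bases), class_hv) < threshold
```

A typical use would train on samples clustered around one point, for example `class_hv = train_one_class(inliers, make_bases(5))`, and then call `is_outlier(sample, class_hv, bases)` on new samples; in practice the threshold would need tuning on held-out inliers.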
VStore
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530560
Shengwen Liang, Ying Wang, Ziming Yuan, Cheng Liu, Huawei Li, Xiaowei Li
Graph-based vector search, which finds the best matches to user queries based on their semantic similarities using a graph data structure, has become instrumental in data science and AI applications. However, deploying graph-based vector search in production systems requires high accuracy and cost-efficiency with low latency and memory footprint, which existing work fails to offer. We present VStore, a graph-based vector search solution that collaboratively optimizes accuracy, latency, memory, and data movement on large-scale vector data based on in-storage computing. The evaluation shows that VStore exhibits significant search efficiency improvement and energy reduction while attaining accuracy over CPU, GPU, and ZipNN platforms.
Citations: 3
Towards a formally verified hardware root-of-trust for data-oblivious computing
Pub Date : 2022-07-10 DOI: 10.1145/3489517.3530981
Lucas Deutschmann, Johannes Müller, M. R. Fadiheh, D. Stoffel, W. Kunz
The importance of preventing microarchitectural timing side channels in security-critical applications has surged immensely over the last several years. Constant-time programming has emerged as a best-practice technique to prevent leaking out secret information through timing. It builds on the assumption that certain basic machine instructions execute timing-independently w.r.t. their input data. However, whether an instruction fulfills this data-independent timing criterion varies strongly from architecture to architecture. In this paper, we propose a novel methodology to formally verify data-oblivious behavior in hardware using standard property checking techniques. Each successfully verified instruction represents a trusted hardware primitive for developing data-oblivious algorithms. A counterexample, on the other hand, represents a restriction that must be communicated to the software developer. We evaluate the proposed methodology in multiple case studies, ranging from small arithmetic units to medium-sized processors. One case study uncovered a data-dependent timing violation in the extensively verified and highly secure Ibex RISC-V core.
Citations: 4