
IEEE Transactions on Computers: Latest Articles

Chiplet-Gym: Optimizing Chiplet-based AI Accelerator Design with Reinforcement Learning
IF 3.7 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/tc.2024.3457740
Kaniz Mishty, Mehdi Sadi
Citations: 0
Leveraging GPU in Homomorphic Encryption: Framework Design and Analysis of BFV Variants
IF 3.6 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/TC.2024.3457733
Shiyu Shen;Hao Yang;Wangchen Dai;Lu Zhou;Zhe Liu;Yunlei Zhao
Homomorphic Encryption (HE) enhances data security by enabling computations on encrypted data, advancing privacy-focused computations. The BFV scheme, a promising HE scheme, raises considerable performance challenges. Graphics Processing Units (GPUs), with considerable parallel processing abilities, offer an effective solution. In this work, we present an in-depth study on accelerating and comparing BFV variants on GPUs, including Bajard-Eynard-Hasan-Zucca (BEHZ), Halevi-Polyakov-Shoup (HPS), and recent variants. We introduce a universal framework for all variants, propose an optimized BEHZ implementation, and are the first to support HPS variants with large parameter sets on GPUs. We also optimize low-level arithmetic and high-level operations, minimizing instructions for modular operations, enhancing hardware utilization for base conversion, and implementing efficient reuse strategies and fusion methods to reduce computational and memory consumption. Leveraging our framework, we offer comprehensive comparative analyses. Performance evaluation shows a 31.9$\times$ speedup over OpenFHE running on a multi-threaded CPU and 39.7% and 29.9% improvements for tensoring and relinearization, respectively, over the state-of-the-art GPU BEHZ implementation. The leveled HPS variant records up to a 4$\times$ speedup over other variants, positioning it as a highly promising alternative for specific applications.
IEEE Transactions on Computers, vol. 73, no. 12, pp. 2817-2829.
Citations: 0
Acceleration of Fast Sample Entropy for FPGAs
IF 3.7 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/tc.2024.3457735
Chen Chao, Chengyu Liu, Jianqing Li, Bruno da Silva
Citations: 0
Novel Lagrange Multipliers-Driven Adaptive Offloading for Vehicular Edge Computing
IF 3.6 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/TC.2024.3457729
Liang Zhao;Tianyu Li;Guiying Meng;Ammar Hawbani;Geyong Min;Ahmed Y. Al-Dubai;Albert Y. Zomaya
Vehicular Edge Computing (VEC) is a transportation-specific version of Mobile Edge Computing (MEC) designed for vehicular scenarios. Task offloading allows vehicles to send computational tasks to nearby Roadside Units (RSUs) in order to reduce the computation cost for the overall system. However, state-of-the-art solutions have not fully addressed the challenge of large-scale task result feedback with low delay, due to the extremely flexible network structure and complex traffic data. In this paper, we explore the joint task offloading and resource allocation problem with result feedback cost in VEC. In particular, this study develops a VEC computing offloading scheme, namely, Lagrange multipliers-based adaptive computing offloading with a prediction model, considering multiple RSUs and the vehicles within their coverage areas. First, the VEC network architecture employs a GAN to establish a prediction model, utilizing the powerful predictive capabilities of the GAN to forecast the maximum distance of future trajectories, thereby reducing the decision space for task offloading. Subsequently, we propose a real-time adaptive model and adjust the parameters in different scenarios to accommodate the dynamic characteristics of the VEC network. Finally, we apply a Lagrange Multiplier-based Non-Uniform Genetic Algorithm (LM-NUGA) to make task offloading decisions. This algorithm provides reliable and efficient computing services. The simulation results indicate that our proposed scheme efficiently reduces the computation cost for the whole VEC system. This paves the way for a new generation of disruptive and reliable offloading schemes.
IEEE Transactions on Computers, vol. 73, no. 12, pp. 2868-2881.
Citations: 0
Hardware Implementation of Unsigned Approximate Hybrid Square Rooters for Error-Resilient Applications
IF 3.6 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/TC.2024.3457731
Lalit Bandil;Bal Chand Nagar
In this paper, the authors propose an approximate hybrid square rooter (AHSQR). It combines array-based and logarithmic-based square rooters (SQRs) to balance accuracy and hardware performance. An array-based SQR is utilized as an exact SQR (ESQR) to obtain the MSBs of the output for high precision, while a logarithmic SQR is used to estimate the remaining output digits to enhance design metrics. A modified AHSQR (MAHSQR) is also proposed to retain accuracy at increasing degrees of approximation by computing the square root of the LSBs using the ESQR unit. This reduces the mean relative error distance by up to 31% and the normalized mean error distance by up to 26%. Various accuracy metrics and hardware characteristics are evaluated and analyzed for 16-bit unsigned exact, state-of-the-art, and proposed SQRs. The proposed SQRs are designed in Verilog and implemented on an Artix7 FPGA. The results show that the performance of the proposed SQRs is improved compared to the state-of-the-art methods, being approximately 70% smaller, 2.5 times faster, and consuming only 25% of the power of the ESQR. Applications of the proposed SQRs in a Sobel edge detector and K-means clustering for image processing, and in an envelope detector for communication systems, are also included.
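The two accuracy metrics named in this abstract can be illustrated with a small software model. The sketch below is not the authors' AHSQR design: it assumes a generic Mitchell-style logarithmic square root (the hypothetical `log_sqrt`) and assumes the commonly used definitions of mean relative error distance (MRED) and normalized mean error distance (NMED), computed by the hypothetical helper `error_metrics`.

```python
import math


def log_sqrt(n: int) -> float:
    """Mitchell-style logarithmic approximation of sqrt(n) for n >= 1.

    Illustrative only; this is not the hybrid design from the paper.
    """
    k = n.bit_length() - 1         # leading-one position: n = 2**k * (1 + x)
    x = n / (1 << k) - 1.0         # fractional part, 0 <= x < 1
    half_log = (k + x) / 2.0       # log2(n) ~= k + x, halved for the square root
    ki = math.floor(half_log)      # integer part of the halved logarithm
    xf = half_log - ki             # fractional part of the halved logarithm
    return (1 << ki) * (1.0 + xf)  # antilog: 2**(ki + xf) ~= 2**ki * (1 + xf)


def error_metrics(approx_fn, bits: int = 16):
    """Return (MRED, NMED) over all nonzero unsigned inputs of the given width.

    Assumed definitions:
      MRED = mean(|approx - exact| / exact)
      NMED = mean(|approx - exact|) / max(exact)
    """
    values = range(1, 1 << bits)
    max_exact = math.sqrt((1 << bits) - 1)
    abs_err = [abs(approx_fn(n) - math.sqrt(n)) for n in values]
    mred = sum(e / math.sqrt(n) for e, n in zip(abs_err, values)) / len(values)
    nmed = (sum(abs_err) / len(abs_err)) / max_exact
    return mred, nmed


if __name__ == "__main__":
    mred, nmed = error_metrics(log_sqrt, bits=16)
    print(f"MRED = {mred:.4f}, NMED = {nmed:.4f}")
```

Sweeping all 16-bit inputs mirrors the operand width mentioned in the abstract; a hybrid scheme would replace `log_sqrt` with a function that computes the most significant output bits exactly and approximates only the rest.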
IEEE Transactions on Computers, vol. 73, no. 12, pp. 2734-2746.
Citations: 0
CAPE: Criticality-Aware Performance and Energy Optimization Policy for NCFET-Based Caches
IF 3.6 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/TC.2024.3457734
Divya Praneetha Ravipati;Ramanuj Goel;Victor M. van Santen;Hussam Amrouch;Preeti Ranjan Panda
Caches are crucial yet power-hungry components in present-day computing systems. With the Negative Capacitance Fin Field-Effect Transistor (NCFET) gaining significant attention due to its internal voltage amplification, allowing for better operation at lower voltages (stronger ON-current and reduced leakage current), the introduction of NCFET technology in caches can reduce power consumption without loss in performance. Apart from the benefits offered by the technology, we leverage the unique characteristics offered by NCFETs and propose a dynamic voltage scaling based criticality-aware performance and energy optimization policy (CAPE) for on-chip caches. We present the first work towards optimizing energy in NCFET-based caches with minimal impact on performance. Compared to operating at a nominal voltage of 0.7 V, CAPE shows improvement in Last-Level Cache (LLC) energy savings by up to 19.2%, while the baseline policies devised for traditional CMOS- (/FinFET-) based caches are ineffective in improving NCFET-based LLC energy savings. Compared to the considered baseline policies, our CAPE policy also demonstrates better LLC energy-delay product (EDP) and throughput savings.
IEEE Transactions on Computers, vol. 73, no. 12, pp. 2830-2843.
Citations: 0
Compressed Test Pattern Generation for Deep Neural Networks
IF 3.7 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/tc.2024.3457738
Dina A. Moussa, Michael Hefenbrock, Mehdi Tahoori
Citations: 0
CUSPX: Efficient GPU Implementations of Post-Quantum Signature SPHINCS+
IF 3.7 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/tc.2024.3457736
Ziheng Wang, Xiaoshe Dong, Heng Chen, Yan Kang, Qiang Wang
Citations: 0
Component Dependencies Based Network-on-Chip Test
IF 3.6 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/TC.2024.3457732
Letian Huang;Tianjin Zhao;Ziren Wang;Junkai Zhan;Junshi Wang;Xiaohang Wang
On-line test of NoC is essential for its reliability. This paper proposes an integral test solution for on-line test of NoC to reduce the test cost and improve the reliability of the NoC. The test solution includes a new partitioning method, as well as a test method and a test schedule that are based on the proposed partitioning method. The new partitioning method partitions the NoC into a new type of basic unit under test (UUT), named the interdependent components based unit under test (iDC-UUT), which applies component test methods. The iDC-UUTs have a very low level of functional interdependency and simple physical connections, which result in small test overhead and high test coverage. The proposed test method consists of a DFT architecture, a test wrapper, and test vectors, which can speed up the test procedure and further improve the test coverage. The proposed test schedule reduces the blockage probability of data packets during testing by increasing the degree of test disorder, so as to further reduce the test cost. Experimental results show that the proposed test solution reduces power and area by 12.7% and 22.7%, respectively, over an existing test solution. The average latency is reduced by 22.6% to 38.4% over the existing test solution.
IEEE Transactions on Computers, vol. 73, no. 12, pp. 2805-2816.
Citations: 0
FLALM: A Flexible Low Area-Latency Montgomery Modular Multiplication on FPGA
IF 3.7 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-09-11 DOI: 10.1109/tc.2024.3457739
Yujun Xie, Yuan Liu, Xin Zheng, Bohan Lan, Dengyun Lei, Dehao Xiang, Shuting Cai, Xiaoming Xiong
Citations: 0