IACR Cryptol. ePrint Arch.: Latest Articles

MOSFHET: Optimized Software for FHE over the Torus

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-24 DOI: 10.1007/s13389-024-00359-z

Antonio Guimarães, E. Borin, Diego F. Aranha

Citations: 9

Time Sharing - A Novel Approach to Low-Latency Masking

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.249-272

Dilip Kumar, Siemen Dhooghe, J. Balasch, Benedikt Gierlichs, Ingrid Verbauwhede

We present a novel approach to small-area, low-latency first-order masking in hardware. The core idea is to separate the processing of shares in time in order to achieve non-completeness. The resulting circuits are proven first-order glitch-extended PINI secure. This means the method can be applied straightforwardly to mask arbitrary functions without imposing constraints that the designer must take care of. Furthermore, we show that an implementation can benefit from optimization through EDA tools without sacrificing security. We provide concrete results for several case studies. Our low-latency implementation of a complete PRINCE core shows a 32% area improvement (44% with optimization) over the state of the art. Our PRINCE S-Box passes formal verification with a tool, and the complete core on FPGA shows no first-order leakage in TVLA with 100 million traces. Our low-latency implementation of the AES S-Box costs roughly one third (one quarter with optimization) of the area of state-of-the-art implementations. It shows no first-order leakage in TVLA with 250 million traces.
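
To make the notions of shares and non-completeness concrete, here is a minimal software sketch of a generic two-share (first-order) masked AND gadget. It is not the time-sharing construction from the paper, which is a hardware technique. The function names and the use of a single fresh random bit `r` are assumptions chosen for illustration. Each product term touches at most one share of each input, which is the property that non-completeness asks for.

```python
import itertools

def share(bit, rand_bit):
    """Split a bit into two Boolean shares (s0, s1) with s0 ^ s1 == bit."""
    return rand_bit, bit ^ rand_bit

def unshare(shares):
    """Recombine two Boolean shares into the underlying bit."""
    return shares[0] ^ shares[1]

def masked_and(x, y, r):
    """Two-share masked AND using one fresh random bit r (illustrative gadget).

    Every product term x_i & y_j uses a single share of x and a single
    share of y, so no term ever sees both shares of the same input.
    """
    x0, x1 = x
    y0, y1 = y
    z0 = (x0 & y0) ^ ((x0 & y1) ^ r)
    z1 = (x1 & y1) ^ ((x1 & y0) ^ r)
    return z0, z1

def check_all():
    """Exhaustively check correctness over all inputs and all randomness."""
    for a, b, ra, rb, r in itertools.product((0, 1), repeat=5):
        z = masked_and(share(a, ra), share(b, rb), r)
        if unshare(z) != (a & b):
            return False
    return True
```

The exhaustive check only confirms functional correctness. It says nothing about glitch-extended PINI security, which requires the kind of proof and leakage assessment the abstract describes.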

Citations: 0

White-box filtering attacks breaking SEL masking: from exponential to polynomial time

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.1-24

Alex Charlès, A. Udovenko

This work proposes a new white-box attack technique called filtering, which can be combined with any other trace-based attack method. The idea is to filter the traces based on the value of an intermediate variable in the implementation, aiming to fix a share of a sensitive value and degrade the security of the masking scheme involved. Coupled with LDA (filtered LDA, FLDA), it leads to an attack defeating the state-of-the-art SEL masking scheme (CHES 2021) of arbitrary degree and number of linear shares with quartic complexity in the window size. In comparison, the current best attacks have exponential complexity in the degree (higher degree decoding analysis, HDDA), in the number of linear shares (higher-order differential computation analysis, HODCA), or in the window size (white-box learning parity with noise, WBLPN). The attack exploits the key idea of the SEL scheme: an efficient parallel combination of the nonlinear and linear masking schemes. We conclude that a proper composition of masking schemes is essential for security. In addition, we propose several optimizations for linear algebraic attacks: redundant node removal (RNR), optimized parity check matrix usage, and chosen-plaintext filtering (CPF), significantly improving the performance of security evaluation of white-box implementations.

Citations: 0

Optimized Homomorphic Evaluation of Boolean Functions

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.302-341

Nicolas Bon, David Pointcheval, Matthieu Rivain

We propose a new framework to homomorphically evaluate Boolean functions using the Torus Fully Homomorphic Encryption (TFHE) scheme. Compared to previous approaches focusing on Boolean gates, our technique can evaluate more complex Boolean functions with several inputs using a single bootstrapping. This allows us to greatly reduce the number of bootstrapping operations necessary to evaluate a Boolean circuit compared to previous works, thus achieving significant improvements in performance. We give a theoretical definition of our approach, which consists of adding an intermediate homomorphic layer between the plain Boolean space and the ciphertext space. This layer relies on so-called p-encodings, which embed bits into Z_p. We analyze the properties these encodings need in order to enable the evaluation of a given Boolean function, and provide a deterministic algorithm (as well as an efficient heuristic) to find valid sets of encodings for a given function. We also propose a method to decompose any Boolean circuit into Boolean functions that are efficiently evaluable using our approach. We apply our framework to homomorphically evaluate various cryptographic primitives, and in particular the AES cipher. Our implementation results show significant improvements compared to the state of the art.
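
The idea of embedding bits into Z_p so that one table lookup evaluates a whole Boolean function can be sketched in plaintext. The model below is a deliberate simplification and an assumption, not the paper's definition or algorithm: bit 0 of input i is mapped to 0, bit 1 is mapped to a single nonzero value d_i, the homomorphic sum is modelled as a plain sum mod p, and the lookup table stands in for the single bootstrapping. The name `find_encoding` is hypothetical, and the brute-force search ignores any further constraints a real TFHE instantiation may impose.

```python
from itertools import product

def find_encoding(f, n, p):
    """Brute-force search for weights (d_1..d_n) in Z_p \\ {0}.

    The weights are valid when the sum of x_i * d_i mod p determines f(x):
    no sum value may be reached by one input with f = 0 and another with f = 1.
    Returns (weights, lookup_table) or None if no weights exist for this p.
    """
    for ds in product(range(1, p), repeat=n):
        table = {}
        valid = True
        for xs in product((0, 1), repeat=n):
            s = sum(x * d for x, d in zip(xs, ds)) % p
            out = f(*xs)
            # setdefault records the first output seen for s; a mismatch is a collision.
            if table.setdefault(s, out) != out:
                valid = False
                break
        if valid:
            return ds, table
    return None

# Three-input XOR fits in Z_2: the sum mod 2 is the parity.
xor3 = find_encoding(lambda a, b, c: a ^ b ^ c, 3, 2)

# Two-input AND does not fit in Z_2 but does fit in Z_3.
and2_p2 = find_encoding(lambda a, b: a & b, 2, 2)
and2_p3 = find_encoding(lambda a, b: a & b, 2, 3)
```

In this toy model the search cost grows as (p-1)^n, which is one reason a heuristic search of the kind the abstract mentions becomes attractive for larger functions.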

Citations: 3

Hints from Hertz: Dynamic Frequency Scaling Side-Channel Analysis of Number Theoretic Transform in Lattice-Based KEMs

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.200-223

Tianrun Yu, Chi Cheng, Zilong Yang, Yingchen Wang, Yanbin Pan, Jian Weng

The Number Theoretic Transform (NTT) has been widely used to accelerate computations in lattice-based cryptography. However, attackers can potentially launch power analysis targeting the NTT because it is one of the most time-consuming parts of the implementation. This extended time frame provides a natural window of opportunity for attackers. In this paper, we investigate the first CPU frequency leakage (Hertzbleed-like) attacks against NTT in lattice-based KEMs. Our key observation is that different inputs to NTT incur different Hamming weights in its output and intermediate layers. By measuring the CPU frequency during the execution of NTT, we propose a simple yet effective attack idea to find the input to NTT that triggers NTT processing data with significantly low Hamming weight. We further apply our attack idea to real-world applications that are built upon NTT: CPA-secure Kyber without Compression and Decompression functions, and CCA-secure NTTRU. This allows us to extract information, or frequency hints, about the secret key. Integrating these hints into the LWE-estimator framework, we estimate a minimum of 35% security loss caused by the leakage. The frequency and timing measurements on the Reference and AVX2 implementations of NTT in both Kyber and NTTRU align well with our theoretical analysis, confirming the existence of frequency side-channel leakage in NTT. It is important to emphasize that our observation is not limited to a specific implementation but rather applies to the algorithm on which NTT is based. Therefore, our results call for more attention to the analysis of power leakage against NTT in lattice-based cryptography.
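
The key observation, that different NTT inputs produce outputs with different Hamming weights, can be reproduced on a toy transform. The sketch below is an illustration under stated assumptions, not the attack itself: it uses a naive length-8 NTT over Z_17 with the primitive 8th root of unity 2 (Kyber uses q = 3329 and a different, layered transform), it looks only at the output and not at intermediate layers, and it performs no frequency measurement.

```python
Q = 17       # toy modulus (assumption; chosen so that an 8th root of unity exists)
N = 8        # toy transform length
OMEGA = 2    # 2 has multiplicative order 8 modulo 17

def ntt(xs):
    """Naive O(N^2) number theoretic transform: X[k] = sum_j x[j] * OMEGA^(j*k) mod Q."""
    return [
        sum(x * pow(OMEGA, j * k, Q) for j, x in enumerate(xs)) % Q
        for k in range(N)
    ]

def hamming_weight(values):
    """Total number of set bits across all coefficients."""
    return sum(bin(v).count("1") for v in values)

zero_input = [0] * N
impulse_input = [1] + [0] * (N - 1)
constant_input = [1] * N

hw_zero = hamming_weight(ntt(zero_input))          # all-zero output
hw_impulse = hamming_weight(ntt(impulse_input))    # output is all ones
hw_constant = hamming_weight(ntt(constant_input))  # output is [8, 0, ..., 0]
```

Even in this tiny example the output weight ranges from 0 to 8 bits depending on the input, which is the kind of data dependence a frequency side channel could pick up.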

Citations: 1

HAETAE: Shorter Lattice-Based Fiat-Shamir Signatures

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.25-75

J. Cheon, Hyeong-Soon Choe, Julien Devevey, Tim Güneysu, Dongyeon Hong, Markus Krausz, Georg Land, Marc Möller, D. Stehlé, MinJune Yi

We present HAETAE (Hyperball bimodAl modulE rejecTion signAture schemE), a new lattice-based signature scheme. Like the NIST-selected Dilithium signature scheme, HAETAE is based on the Fiat-Shamir with Aborts paradigm, but our design choices target an improved complexity/compactness compromise that is highly relevant for many space-limited application scenarios. We primarily focus on reducing signature and verification key sizes so that signatures fit into one TCP or UDP datagram while preserving a high level of security against a variety of attacks. As a result, our scheme has signature and verification key sizes up to 39% and 25% smaller, respectively, than those of Dilithium. We provide a portable, constant-time reference implementation together with an optimized implementation using AVX2 instructions and an implementation with reduced stack size for the Cortex-M4. Moreover, we describe how to efficiently protect HAETAE against implementation attacks such as side-channel analysis, making it an attractive candidate for use in IoT and other embedded systems.

Citations: 9

PoMMES: Prevention of Micro-architectural Leakages in Masked Embedded Software

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.342-376

Jannik Zeitschner, Amir Moradi

Software solutions to address computational challenges are ubiquitous in our daily lives. One specific application area where software is often used is in embedded systems, which, like other digital electronic devices, are vulnerable to side-channel analysis attacks. Although masking is the most common countermeasure and provides a solid theoretical foundation for ensuring security, recent research has revealed a crucial gap between theoretical and real-world security. This shortcoming stems from the micro-architectural effects of the underlying micro-processor. Common security models used to formally verify masking schemes, such as the d-probing model, fully ignore the micro-architectural leakages that lead to a set of instructions unintentionally recombining the shares. Manual generation of masked assembly code that remains secure in the presence of such micro-architectural recombinations often involves trial and error, and is non-trivial even for experts. Motivated by this, we present PoMMES, which enables inexperienced software developers to automatically compile masked functions written in a high-level programming language into assembly code, while preserving the theoretically proven security in practice. Compared to the state of the art, based on a general model for micro-architectural effects, our scheme allows the generation of practically secure masked software at arbitrary security orders for in-order processors. The major contribution of PoMMES is its micro-architecture-aware register allocation algorithm, which is one of the crucial steps during the compilation process. In addition to simulation-based assessments that we conducted with open-source tools dedicated to evaluating masked software implementations, we confirm the effectiveness of the PoMMES-generated code through experimental analysis. We present the results of power-consumption-based leakage assessments of several case studies running on a Cortex-M0+ micro-controller, which is commonly deployed in industry.
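
A small model shows why register allocation matters for masked software. Under the commonly used Hamming-distance leakage model (an assumption here, not something stated in the abstract), overwriting a register that holds one share with the other share of the same value leaks the Hamming weight of the unmasked secret, even though each share on its own is uniformly random. The helper names are hypothetical, and this is an illustration of the recombination effect only, not of the PoMMES algorithm.

```python
def hw(value):
    """Hamming weight: number of set bits."""
    return bin(value).count("1")

def share(secret, mask):
    """Two-share Boolean masking of an 8-bit value: s0 ^ s1 == secret."""
    return mask, secret ^ mask

def overwrite_leakage(old, new):
    """Hamming-distance model of a register overwrite (old value replaced by new)."""
    return hw(old ^ new)

# If share s1 is written into the register that still holds s0,
# the modelled leakage is hw(s0 ^ s1) == hw(secret), independent of the mask.
leaks_secret_weight = all(
    overwrite_leakage(*share(secret, mask)) == hw(secret)
    for secret in range(256)
    for mask in range(256)
)
```

A register allocator that is aware of this effect would keep the two shares out of the same physical register in consecutive writes, for example by placing an unrelated value in between.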

Citations: 0

1/0 Shades of UC: Photonic Side-Channel Analysis of Universal Circuits

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.574-602

Dev M. Mehta, M. Hashemi, Domenic Forte, Shahin Tajik, F. Ganji

A universal circuit (UC) can be thought of as a programmable circuit that can simulate any circuit up to a certain size by specifying its secret configuration bits. UCs have been incorporated into various applications, such as private function evaluation (PFE). Recently, studies have attempted to formalize the concept of semiconductor intellectual property (IP) protection in the context of UCs. This is despite the observations made in theory and practice that, in reality, the adversary may obtain additional information about the secret when executing cryptographic protocols. This paper aims to answer the question of whether UCs leak information unintentionally, which can be leveraged by the adversary to disclose the configuration bits. In this regard, we propose the first photon emission analysis against UCs relying on computer vision-based approaches. We demonstrate that the adversary can use a cost-effective setup to take images and process them with off-the-shelf algorithms to extract the configuration bits. We examine the efficacy of our method in two scenarios: (1) the design is small enough to be captured in a single image during the attack phase, and (2) multiple images must be captured to launch the attack by deploying a divide-and-conquer strategy. To evaluate the effectiveness of our attack, we use metrics commonly applied in side-channel analysis, namely rank and success rate. By doing so, we show that our profiled photon emission analysis achieves a success rate of 1 by employing a few templates (concretely, only 18 images were used as templates).

Citations: 1

CrISA-X: Unleashing Performance Excellence in Lightweight Symmetric Cryptography for Extendable and Deeply Embedded Processors

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.377-417

Oren Ganon, Itamar Levi

The efficient execution of a Lightweight Cryptography (LWC) algorithm is essential for edge computing platforms. Dedicated Instruction Set Extensions (ISEs) are often included for this purpose. We propose CrISA-X, a set of Cryptography Instruction Set Architecture eXtensions designed to improve cryptographic latency on extendable processors. CrISA-X provides enhanced speed for various algorithms simultaneously while optimizing ISA adaptability, a feat yet to be accomplished. The extension, diverse for several computation levels, is first tailored explicitly for individual algorithms and sets of LWC algorithms, depending on performance, frequency, and area trade-offs. By diligently applying the Min-Max optimization technique, we have configured these extensions to achieve a delicate balance between performance, area utilization, code size, etc. Our study presents empirical evidence of the performance enhancement achieved on a synthesis modular RISC processor. We offer a framework for creating optimized processor hardware and ISA extensions. CrISA-X outperforms ISA extensions by delivering significant performance boosts of between 3x and 17x while incurring a relative area cost increase of +12% and +47% in LUTs. Notably, as one important example, for the ASCON algorithm it yields a 10x performance boost compared to the base ISA instruction implementation.

Citations: 0

Evict+Spec+Time: Exploiting Out-of-Order Execution to Improve Cache-Timing Attacks

IACR Cryptol. ePrint Arch.

Pub Date : 2024-07-18 DOI: 10.46586/tches.v2024.i3.224-248

Shing Hing William Cheng, C. Chuengsatiansup, Daniel Genkin, Dallas McNeil, Toby Murray, Y. Yarom, Zhiyuan Zhang

Speculative out-of-order execution is a strategy of masking execution latency by allowing younger instructions to execute before older instructions. While originally considered to be innocuous, speculative out-of-order execution was brought into the spotlight with the 2018 publication of the Spectre and Meltdown attacks. These attacks demonstrated that microarchitectural side channels can leak sensitive data accessed by speculatively executed instructions that are not part of the normal program execution. Since then, significant effort has been invested in investigating how microarchitectural side channels can leak data from speculatively executed instructions and how to control this leakage. However, much less is known about how speculative out-of-order execution affects microarchitectural side-channel attacks. In this paper, we investigate how speculative out-of-order execution affects the Evict+Time cache attack. Evict+Time is based on the observation that cache misses are slower than cache hits; hence, by measuring the execution time of code, an attacker can determine whether a cache miss occurred during the execution. We demonstrate that, due to limited resources for tracking out-of-order execution, under certain conditions an attacker can gain more fine-grained information and determine whether a cache miss occurred in part of the executed code. Based on this observation, we design the Evict+Spec+Time attack, a variant of Evict+Time that can learn not only whether a cache miss occurred, but also in which part of the victim code it occurred. We demonstrate that Evict+Spec+Time is an order of magnitude more efficient than Evict+Time when attacking a T-tables-based implementation of AES. We further show an Evict+Spec+Time attack on an S-box-based implementation of AES, recovering the key with as few as 14,815 decryptions. To the best of our knowledge, ours is the first successful Evict+Time attack on such a victim.
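
As a rough illustration of the basic Evict+Time principle described in the abstract (a cache miss makes the victim measurably slower), the following Python sketch simulates the attack against a toy cache. It is not the authors' code and does not model out-of-order execution or the Evict+Spec+Time variant. The direct-mapped cache model, the cycle costs, and the names `ToyCache`, `victim`, and `evict_time` are all assumptions made for illustration.

```python
HIT_CYCLES = 4      # assumed cost of a cache hit (illustrative value)
MISS_CYCLES = 200   # assumed cost of a cache miss (illustrative value)
NUM_SETS = 16       # toy direct-mapped cache with 16 sets


class ToyCache:
    """Direct-mapped cache model that returns a simulated access cost."""

    def __init__(self):
        self.lines = [None] * NUM_SETS  # tag stored in each set, None if empty

    def access(self, address):
        index, tag = address % NUM_SETS, address // NUM_SETS
        if self.lines[index] == tag:
            return HIT_CYCLES
        self.lines[index] = tag  # fill the line on a miss
        return MISS_CYCLES

    def evict(self, index):
        self.lines[index] = None


def victim(cache, secret):
    """Hypothetical victim: a single secret-dependent table lookup."""
    return cache.access(secret % NUM_SETS)


def evict_time(secret):
    """Return the cache sets whose eviction slows the victim down."""
    cache = ToyCache()
    for address in range(NUM_SETS):  # warm-up: the whole table is cached
        cache.access(address)
    baseline = victim(cache, secret)  # simulated time with everything cached
    slow_sets = []
    for index in range(NUM_SETS):
        cache.evict(index)               # Evict: clear one cache set
        elapsed = victim(cache, secret)  # Time: run the victim again
        if elapsed > baseline:           # slower run means a miss occurred
            slow_sets.append(index)
    return slow_sets


print(evict_time(11))  # prints [11]
```

In this simplified model the only slow run is the one where the evicted set matches the victim's secret-dependent lookup, so the attacker learns which set the victim touched. A real attack has to work with noisy timing measurements rather than exact simulated cycle counts.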



Citations: 0