Décio Luiz Gazzoni Filho, Tomás Recio, Julio López Hernandez
We present a solution to the open problem of designing a linear-time, unbiased and timing attack-resistant shuffling algorithm for fixed-weight sampling. Although it can be implemented without timing leakages of secret data in any architecture, we illustrate with ARMv7-M and ARMv8-A implementations; for the latter, we take advantage of architectural features such as NEON and conditional instructions, which are representative of features available on architectures targeting similar systems, such as Intel. Our proposed algorithm improves asymptotically upon the current approach based on constant-time sorting networks ($O(n)$ versus $O(n \log^2 n)$), and an implementation of the new algorithm applied to NTRU is also faster in practice, by a factor of up to 6.91 (591%) on ARMv8-A cores and 12.89 (1189%) on the Cortex-M4; it also requires fewer uniform random bits. This translates into performance improvements for NTRU encapsulation, compared to state-of-the-art implementations, of up to 50% on ARMv8-A cores and 72% on the Cortex-M4, and small improvements to key generation (up to 2.7% on ARMv8-A cores and 6.1% on the Cortex-M4), with negligible impact on code size and a slight improvement in RAM usage for the Cortex-M4.
Title: Efficient isochronous fixed-weight sampling with applications to NTRU | Authors: Décio Luiz Gazzoni Filho, Tomás Recio, Julio López Hernandez | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/a6n59qgxq
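For context on the fixed-weight sampling abstract above, the following is a minimal sketch of the sorting-based baseline it compares against, not the new linear-time algorithm, whose details the abstract does not give. The function name `fixed_weight_by_sorting`, the 30-bit key width, and the binary (rather than ternary) output are assumptions made for illustration only.

```python
import secrets


def fixed_weight_by_sorting(n: int, weight: int) -> list[int]:
    """Return a length-n list holding `weight` ones and n - weight zeros."""
    if not 0 <= weight <= n:
        raise ValueError("weight must lie in [0, n]")
    # Tag each slot: a random 30-bit key in the high bits, and the
    # coefficient (1 for the first `weight` slots, else 0) in the low bit.
    tagged = [(secrets.randbits(30) << 1) | (1 if i < weight else 0)
              for i in range(n)]
    # Hardened implementations use a data-independent sorting network here.
    # Python's list.sort() is NOT constant-time; it is used only for clarity.
    tagged.sort()
    # Drop the random keys and keep the permuted coefficients.
    return [t & 1 for t in tagged]


sample = fixed_weight_by_sorting(16, 5)
```

Sorting by random keys costs $O(n \log^2 n)$ compare-exchange steps when a sorting network such as a bitonic or odd-even merge network is used, which is the cost the abstract's $O(n)$ shuffle is measured against.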
Distributed key generation (DKG) is a key building block in developing many efficient threshold cryptosystems. This work initiates the study of communication complexity and round complexity of DKG protocols over a point-to-point (bounded) synchronous network. Our key result is the first synchronous DKG protocol for discrete log-based cryptosystems with $O(\kappa n^3)$ communication complexity ($\kappa$ denotes a security parameter) that tolerates any $t < n/2$ Byzantine faults among $n$ parties. We present two variants of the protocol: (i) a protocol with worst-case $O(\kappa n^3)$ communication and $O(t)$ rounds, and (ii) a protocol with expected $O(\kappa n^3)$ communication and expected constant rounds. In the process of achieving our results, we design (1) a novel weak gradecast protocol with $O(\kappa n^2)$ communication complexity for linear-sized inputs and constant rounds, (2) a protocol called "recoverable set of shares" for ensuring recovery of shared secrets, (3) an oblivious leader election protocol with $O(\kappa n^3)$ communication complexity and constant rounds, and (4) a multi-valued validated Byzantine agreement (MVBA) protocol with $O(\kappa n^3)$ communication complexity for linear-sized inputs and expected constant rounds. Each of these primitives is of independent interest.
Title: Synchronous Distributed Key Generation without Broadcasts | Authors: Nibesh Shrestha, Adithya Bhat, Aniket Kate, Kartik Nayak | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/ayfhsgvtw
In this work we first present an explicit forking lemma that distills the information-theoretic essence of the high-moment technique introduced by Rotem and Segev (CRYPTO '21), who analyzed the security of identification protocols and Fiat-Shamir signature schemes. Whereas the technique of Rotem and Segev was particularly geared towards two specific cryptographic primitives, we present a stand-alone probabilistic lower bound, which does not involve any underlying primitive or idealized model. The key difference between our lemma and previous ones is that instead of focusing on the tradeoff between the worst-case or expected running time of the resulting forking algorithm and its success probability, we focus on the tradeoff between higher moments of its running time and its success probability. Equipped with our lemma, we then establish concrete security bounds for the BN and BLS multi-signature schemes that are significantly tighter than the concrete security bounds established by Bellare and Neven (CCS '06) and Boneh, Drijvers and Neven (ASIACRYPT '18), respectively. Our analysis does not limit adversaries to any idealized algebraic model, such as the algebraic group model in which all algorithms are assumed to provide an algebraic justification for each group element they produce. Our bounds are derived in the random-oracle model based on the standard-model second-moment hardness of the discrete logarithm problem (for the BN scheme) and the computational co-Diffie-Hellman problem (for the BLS scheme). Such second-moment assumptions, asking that the success probability of any algorithm in solving the underlying computational problems is dominated by the second moment of the algorithm's running time, are particularly plausible in any group where no better-than-generic algorithms are currently known.
Title: An Explicit High-Moment Forking Lemma and its Applications to the Concrete Security of Multi-Signatures | Authors: Gil Segev, Liat Shapira | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/a6qj89n4e
Decentralized Multi-Client Functional Encryption (DMCFE) extends basic functional encryption to multiple clients that do not trust each other. They can independently encrypt the multiple plaintext-inputs to be given for evaluation to the function embedded in the functional decryption key, defined by multiple parameter-inputs. They also keep control over these functions, as they all have to contribute to the generation of the functional decryption keys. Tags can be used in the ciphertexts and the keys to specify which inputs can be combined together. Like any encryption scheme, DMCFE provides privacy of the plaintexts. But the functions associated with the functional decryption keys might be sensitive too (e.g. a model in machine learning). The function-hiding property has thus been introduced to additionally protect the function evaluated during the decryption process. In this paper, we provide new proof techniques to analyze a new concrete construction of function-hiding DMCFE for inner products, with strong security guarantees: the adversary can adaptively query multiple challenge ciphertexts and multiple challenge keys, with unbounded repetitions of the same tags in the ciphertext-queries and a fixed polynomially-large number of repetitions of the same tags in the key-queries. Previous constructions were proven secure in the selective setting only.
Title: Decentralized Multi-Client Functional Encryption with Strong Security | Authors: K. Nguyen, David Pointcheval, Robert Schädlich | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/andkp2fgx
Fischlin's transform (CRYPTO 2005) is an alternative to the Fiat-Shamir transform that enables straight-line extraction when proving knowledge. In this work we focus on the problem of using the Fischlin transform to construct UC-secure zero-knowledge from Sigma protocols, since UC security, which guarantees security under general concurrent composition, requires straight-line (non-rewinding) simulators. We provide a slightly simplified transform that is much easier to understand, and present algorithmic and implementation optimizations that significantly improve the running time. It appears that the main obstacles to the use of Fischlin in practice are its computational cost and implementation complexity (with multiple parameters that need to be chosen). We provide clear guidelines and a simple methodology for choosing parameters, and show that with our optimizations the running time is far lower than expected. For just one example, on a 2023 MacBook, the cost of proving the knowledge of discrete log with Fischlin is only 0.41 ms (on a single core). This is 15 times slower than plain Fiat-Shamir on the same machine, which is a significant multiple but objectively not significant in many applications. We also extend the transform so that it can be applied to batch proofs, and show how this can be much more efficient than individually proving each statement. We hope that this paper will both encourage and help practitioners implement the Fischlin transform where relevant.
Title: Optimizing and Implementing Fischlin's Transform for UC-Secure Zero-Knowledge | Authors: Yi-Hsiu Chen, Yehuda Lindell | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/a66chey6b
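To make the comparison point in the Fischlin abstract concrete, here is a minimal sketch of the plain Fiat-Shamir baseline it mentions: a Schnorr proof of knowledge of a discrete log made non-interactive by hashing. This is not Fischlin's transform and not the authors' implementation. The group parameters `P`, `Q`, `G` are toy values chosen so the arithmetic is easy to follow (they offer no security), and the function names are hypothetical.

```python
import hashlib
import secrets

# Toy group: G = 2 has prime order Q = 11 modulo P = 23. Insecure sizes,
# chosen only so the example is easy to check by hand.
P, Q, G = 23, 11, 2


def challenge(h: int, a: int) -> int:
    """Derive the challenge by hashing the statement and the commitment."""
    data = f"{G}|{h}|{a}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x: int) -> tuple[int, int, int]:
    """Prove knowledge of x such that h = G^x mod P."""
    h = pow(G, x, P)
    r = secrets.randbelow(Q)   # ephemeral secret
    a = pow(G, r, P)           # commitment
    e = challenge(h, a)        # Fiat-Shamir: the hash replaces the verifier
    z = (r + e * x) % Q        # response
    return h, a, z


def verify(h: int, a: int, z: int) -> bool:
    """Accept iff G^z == a * h^e (mod P) for the recomputed challenge e."""
    e = challenge(h, a)
    return pow(G, z, P) == (a * pow(h, e, P)) % P


h, a, z = prove(7)
accepted = verify(h, a, z)
```

Extracting the witness from such a proof requires rewinding the prover to obtain two responses for the same commitment, which is the limitation that motivates a straight-line-extractable alternative in the UC setting.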
Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, A. Moradi, François-Xavier Standaert
Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency vs. security tradeoff of masked implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform most competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a higher rate per cycle for the same or lower cost as Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. 
Circumventing these problems, and enabling an independent analysis of randomness generation and masking, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating $n$ fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as $20n$ to $30n$ ASIC gate equivalents (GE) or $3n$ to $4n$ FPGA look-up tables (LUTs), where $n$ is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable. This provides further motivation to potentially move low randomness usage from a primary to a secondary design goal in hardware masking research.
Title: Randomness Generation for Secure Hardware Masking - Unrolled Trivium to the Rescue | Authors: Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, A. Moradi, François-Xavier Standaert | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/akdkp2fgx
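The masking abstract's remark that LFSRs "generate only linear outputs" can be illustrated with a small sketch. The 16-bit register width and the tap positions below (a commonly cited maximal-length Fibonacci configuration) are assumptions for illustration and are not taken from the paper; `lfsr_bits` is a hypothetical helper name.

```python
def lfsr_bits(seed: int, count: int) -> list[int]:
    """Emit `count` output bits from a 16-bit Fibonacci LFSR."""
    state = seed & 0xFFFF
    out = []
    for _ in range(count):
        out.append(state & 1)
        # Feedback is the XOR of four state bits (taps 16, 14, 13, 11).
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)
    return out


seed_a, seed_b = 0xACE1, 0x1234
xs = lfsr_bits(seed_a, 64)
ys = lfsr_bits(seed_b, 64)
zs = lfsr_bits(seed_a ^ seed_b, 64)

# Linearity over GF(2): XORing two streams gives the stream of the XORed seeds.
is_linear = all(x ^ y == z for x, y, z in zip(xs, ys, zs))
```

Because every output bit is a fixed XOR of seed bits, distinct output bits can share or cancel seed bits when combined inside a masked circuit, which is one way to read the abstract's warning about randomness cancellations.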
Thomas Attema, Aron van Baarsen, Stefan van den Berg, Pedro Capitão, Vincent Dunning, Lisa Kohl
Despite much progress, general-purpose secure multi-party computation (MPC) with active security may still be prohibitively expensive in settings with large input datasets. This particularly applies to the secure evaluation of graph algorithms, where each party holds a subset of a large graph. Recently, Araki et al. (ACM CCS '21) showed that dedicated solutions may provide significantly better efficiency if the input graph is sparse. In particular, they provide an efficient protocol for the secure evaluation of "message passing" algorithms, such as the PageRank algorithm. Their protocol's computation and communication complexity are both $\tilde{O}(M \cdot B)$ instead of the $O(M^2)$ complexity achieved by general-purpose MPC protocols, where $M$ denotes the number of nodes and $B$ the (average) number of incoming edges per node. On the downside, their approach achieves only a relatively weak security notion: 1-out-of-3 malicious security with selective abort. In this work, we show that PageRank can instead be captured efficiently as a restricted multiplication straight-line (RMS) program, and present a new actively secure MPC protocol tailored to handle RMS programs. In particular, we show that the local knowledge of the participants can be leveraged towards the first maliciously-secure protocol with communication complexity linear in $M$, independently of the sparsity of the graph. We present two variants of our protocol. In our communication-optimized protocol, going from semi-honest to malicious security only introduces a small communication overhead, but results in quadratic computation complexity $O(M^2)$. In our balanced protocol, we still achieve a linear communication complexity $O(M)$, although with worse constants, but a significantly better computational complexity scaling with $O(M \cdot B)$. Additionally, our protocols achieve security with identifiable abort and can tolerate up to $n - 1$ corruptions.
Title: Communication-Efficient Multi-Party Computation for RMS Programs | Authors: Thomas Attema, Aron van Baarsen, Stefan van den Berg, Pedro Capitão, Vincent Dunning, Lisa Kohl | Journal: IACR Cryptol. ePrint Arch. | Published: 2024-07-08 | DOI: https://doi.org/10.62056/ab0lmp-3y
Benoît Cogliati, Jérémy Jean, Thomas Peyrin, Y. Seurin
We analyze the multi-user (mu) security of a family of nonce-based authenticated encryption (nAE) schemes based on a tweakable block cipher (TBC). The starting point of our work is an analysis of the mu security of the SCT-II mode, which underlies the nAE scheme Deoxys-II, winner of the CAESAR competition in the defense-in-depth category. We extend this analysis in two directions. First, we investigate the mu security of several TBC-based variants of the counter encryption mode (including CTRT, the encryption mode used within SCT-II) that differ in the way a nonce, a random value, and a counter are combined as tweak and plaintext inputs to the TBC to produce the keystream blocks that mask the plaintext blocks. Then, we consider the authentication part of SCT-II and study the mu security of the nonce-based MAC Nonce-as-Tweak (NaT), built from a TBC and an almost universal (AU) hash function. We also observe that the standard construction of an AU hash function from a (T)BC can be proven secure under the assumption that the underlying TBC is unpredictable rather than pseudorandom, which allows much better conjectures on the concrete AU advantage. This lets us derive the mu security of the family of nAE modes obtained by combining these encryption/MAC building blocks through the NSIV composition method. Some of these modes require an underlying TBC with a larger tweak length than existing TBCs usually offer. We then show the practicality of our modes by instantiating them with two new TBC constructions, Deoxys-TBC-512 and Deoxys-TBC-640, which can be seen as natural extensions of the Deoxys-TBC family to larger tweak input sizes. Designing such TBCs with unusually large tweaks is prone to pitfalls: indeed, we show that a large-tweak proposal for SKINNY published at EUROCRYPT 2020 has an inherent construction flaw. We therefore provide a sound design strategy for constructing large-tweak TBCs within the Superposition Tweakey (STK) framework, leading to new Deoxys-TBC and SKINNY variants. We provide software benchmarks indicating that, while ensuring a very high security level, the performance of our proposals remains very competitive.
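The idea of feeding a nonce and a counter into the tweak of a TBC to produce keystream blocks can be sketched as follows. This is only an illustration under loud assumptions: `toy_tbc` is a hypothetical stand-in built from keyed BLAKE2b (it is a pseudorandom function, not an invertible tweakable block cipher, and is not Deoxys-TBC), and the particular arrangement used here (tweak = nonce followed by counter, all-zero block input) is one arbitrary choice, not CTRT or any mode analyzed in the paper.

```python
import hashlib

BLOCK = 16  # keystream block size in bytes


def toy_tbc(key: bytes, tweak: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a TBC call E_K(tweak, block).

    Keyed BLAKE2b is used as a PRF; unlike a real TBC it is not a
    permutation, which is acceptable here because counter mode only
    evaluates the primitive in the forward direction.
    """
    h = hashlib.blake2b(key=key, digest_size=BLOCK)
    # Length-prefix the tweak so (tweak, block) pairs encode unambiguously.
    h.update(len(tweak).to_bytes(2, "big") + tweak + block)
    return h.digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Counter-mode sketch: XOR data with per-block keystream.

    Assumed arrangement: tweak = nonce || 64-bit counter, input = zero block.
    Encryption and decryption are the same operation.
    """
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        counter = (offset // BLOCK).to_bytes(8, "big")
        keystream = toy_tbc(key, nonce + counter, bytes(BLOCK))
        chunk = data[offset:offset + BLOCK]
        out.extend(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)


# Hypothetical usage: a nonce must never repeat under the same key.
demo_key = b"k" * 16
demo_nonce = b"n" * 8
demo_message = b"attack at dawn, then retreat at dusk"
demo_ciphertext = ctr_xor(demo_key, demo_nonce, demo_message)
```

Because the nonce and counter occupy the tweak rather than the block input, a longer tweak leaves room for longer nonces or additional per-user values, which is the kind of design space the abstract refers to when it mentions large-tweak TBCs.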
{"title":"A Long Tweak Goes a Long Way: High Multi-user Security Authenticated Encryption from Tweakable Block Ciphers","authors":"Benoît Cogliati, Jérémy Jean, Thomas Peyrin, Y. Seurin","doi":"10.62056/a3qjp2fgx","DOIUrl":"https://doi.org/10.62056/a3qjp2fgx","journal":{"name":"IACR Cryptol. ePrint Arch.","volume":"119 19","pages":"846"},"publicationDate":"2024-07-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141667706"}
Pub Date: 2024-07-01 DOI: 10.56553/popets-2024-0100
Sajin Sasy, Adithya Vadapalli, Ian Goldberg
We present Private Random Access Computations (PRAC), a 3-party Secure Multi-Party Computation (MPC) framework that supports random-access data structure algorithms for MPC with efficient communication in terms of rounds and bandwidth. PRAC extends the state-of-the-art Distributed Oblivious RAM (DORAM) protocol Duoram with a new implementation, more flexibility in how the DORAM memory is shared, and support for Incremental and Wide distributed point functions (DPFs). We then use these DPF extensions to achieve algorithmic improvements in three novel oblivious data structure protocols for MPC. PRAC exploits the observation that a secure protocol for an algorithm can gain efficiency if the protocol explicitly reveals information that the algorithm inherently leaks. We first present an optimized binary search protocol that reduces the bandwidth from O(lg² n) to O(lg n) for obliviously searching over n items. We then present an oblivious heap protocol with rounds reduced from O(lg n) to O(lg lg n) for insertions, and bandwidth reduced from O(lg² n) to O(lg n) for extractions. Finally, we present the first oblivious AVL tree protocol for MPC in which no party learns the data or the structure of the AVL tree, and which supports arbitrary insertions and deletions with O(lg n) rounds and bandwidth. We experimentally evaluate our protocols under realistic network settings for a wide range of memory sizes to demonstrate their efficiency. For instance, our binary search protocol provides >27× and >3× improvements in wall-clock time and bandwidth, respectively, over other approaches for a memory with 2^26 items; in the same setting, our heap's extract-min protocol achieves a >31× speedup in wall-clock time and a >13× reduction in bandwidth.
{"title":"PRAC: Round-Efficient 3-Party MPC for Dynamic Data Structures","authors":"Sajin Sasy, Adithya Vadapalli, Ian Goldberg","doi":"10.56553/popets-2024-0100","DOIUrl":"https://doi.org/10.56553/popets-2024-0100","journal":{"name":"IACR Cryptol. ePrint Arch.","volume":"19 1","pages":"1897"},"publicationDate":"2024-07-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141704091"}
Pub Date: 2024-07-01 DOI: 10.56553/popets-2024-0076
Matthew Gregoire, Rachel Thomas, Saba Eskandarian
To resist the regimes of ubiquitous surveillance imposed upon us in every facet of modern life, we need technological tools that subvert surveillance systems. Unfortunately, while cryptographic tools frequently demonstrate how we can construct systems that safeguard user privacy, corporate entities engaged in surveillance have little motivation to adopt these tools, as they often clash with profit incentives. This paper demonstrates how, in one particular aspect of everyday life -- customer loyalty programs -- users can subvert surveillance and attain anonymity without requiring any cooperation from, or change in the behavior of, those surveilling them. We present the CheckOut system, which allows users to coordinate large anonymity sets of shoppers to hide the identity and purchasing habits of each particular user in the crowd. CheckOut scales up and systematizes past efforts to subvert loyalty surveillance, which have been primarily ad hoc and manual affairs in which customers physically swap loyalty cards to mask their real identities. CheckOut allows increased scale while ensuring that the necessary computing infrastructure does not itself become a new centralized point of privacy failure. Of particular importance to our scheme is a protocol for loyalty programs that offer reward points, where we demonstrate how CheckOut can help users pay each other back for loyalty points accrued while using each other's loyalty accounts. We present two different mechanisms for redistributing reward points, offering trade-offs in functionality, performance, and security.
{"title":"CheckOut: User-Controlled Anonymization for Customer Loyalty Programs","authors":"Matthew Gregoire, Rachel Thomas, Saba Eskandarian","doi":"10.56553/popets-2024-0076","DOIUrl":"https://doi.org/10.56553/popets-2024-0076","journal":{"name":"IACR Cryptol. ePrint Arch.","volume":"95 1","pages":"475"},"publicationDate":"2024-07-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"141699157"}