The complementary strengths of Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs) have spurred interest in hybrid ANN/SNN computation. While most existing efforts focus on ANN-SNN conversion for pure SNN inference, hybrid ANN/SNN inference presents unique challenges in which complexity and performance matter in both domains. Key limitations include achieving ultra-low latency, maintaining unified training parameters for resource sharing, and developing efficient neural and encoding models for hybrid data interactions. To address these challenges, we introduce the Adaptive Clip-Floor-Shift (ACFS) activation, which bridges the ANN-SNN gap with unified parameters and balances inference accuracy and complexity across both domains. Our Hybrid Neuro-Encoding Bridge (HNEB) integrates Clipped-ReLU for ANNs, the proposed Selective Integrate-and-Fire (SIF) model for enhanced SNN sparsity, and a Stateless Spike Encoding (SSE) mechanism for resource-efficient activation-spike conversion. Experimental results on VGG16 and ResNet demonstrate that the SNNs achieve accuracy competitive with their ANN counterparts ($\leq 0.89\%$ loss) at ultra-low latency (e.g., $T \leq 4$ for CIFAR10, $T \leq 8$ for CIFAR100). Further analysis reveals that Hybrid Neural Networks (HNNs) provide superior energy-accuracy trade-offs, improving energy efficiency by up to 84.13% over pure SNNs while maintaining accuracy through layer-wise ANN/SNN partitioning and minimized encoding overhead.
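The abstract does not define ACFS itself, but the general clip-floor-shift family it extends is easy to sketch: quantize a clipped ReLU into $L$ discrete levels so that an ANN activation equals the average firing rate an SNN can emit in $L$ timesteps. A minimal NumPy sketch, with illustrative parameter names (lam, L, phi) that are not taken from the paper:

```python
import numpy as np

def clip_floor_shift(x, lam=1.0, L=4, phi=0.5):
    # Quantize a ReLU-like response to L levels so the ANN activation
    # matches the average firing rate an SNN can produce in L timesteps:
    # lam * clip(floor(x*L/lam + phi)/L, 0, 1). The shift phi centers
    # the quantization error; lam plays the role of the firing threshold.
    return lam * np.clip(np.floor(x * L / lam + phi) / L, 0.0, 1.0)

x = np.linspace(-0.5, 1.5, 9)
print(np.round(clip_floor_shift(x), 3))  # staircase approximation of clipped ReLU
```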
{"title":"Efficient Training and Neuro-Encoding for Bridging Hybrid ANN and SNN Computation","authors":"Musheer Abdullah;De Xu;Zhaoqi Miao;Yuhao Tai;Sawsan Alhabashi;Chen Zhao;Wu Gao","doi":"10.1109/TETC.2025.3607104","DOIUrl":"https://doi.org/10.1109/TETC.2025.3607104","url":null,"abstract":"The complementary strengths of Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs) have promoted interest in leveraging hybrid ANN/SNN computation. While most existing efforts focus on ANN-SNN conversion for pure SNN inference, hybrid ANN/SNN inference present unique challenges where complexity and performance in both domains are critical. Key limitations include achieving ultra-low latency, maintaining unified training parameters for resource sharing, and developing efficient neural and encoding models for hybrid data interactions. To address these challenges, We introduce the Adaptive Clip-Floor-Shift (ACFS) activation to bridge the ANN-SNN gap with unified parameters, balancing inference accuracy and complexity across both domains. Our Hybrid Neuro-Encoding Bridge (HNEB) integrating Clipped-ReLU for ANNs, proposed Selective Integrate-and-Fire (SIF) model for enhanced SNN sparsity, and a Stateless Spike Encoding (SSE) mechanism for resource-efficient activation-spike conversion. Experimental results on VGG16 and ResNet demonstrate SNNs achieving competitive accuracy (<inline-formula><tex-math>$leq ! 0.89%$</tex-math></inline-formula> loss) versus ANNs at ultra-low latency (e.g., <inline-formula><tex-math>$T leq 4$</tex-math></inline-formula> for CIFAR10, <inline-formula><tex-math>$T leq 8$</tex-math></inline-formula> for CIFAR100). Experimental analysis reveals Hybrid Neural Netwroks (HNNs) provide superior energy-accuracy trade-offs, improving energy efficiency by up to 84.13% over pure SNNs while maintaining accuracy through layer-wise ANN/SNN partitioning and minimized encoding overhead.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1591-1604"},"PeriodicalIF":5.4,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-09-11 DOI: 10.1109/TETC.2025.3607300
{"title":"IEEE Transactions on Emerging Topics in Computing Publication Information","authors":"","doi":"10.1109/TETC.2025.3607300","DOIUrl":"https://doi.org/10.1109/TETC.2025.3607300","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"C2-C2"},"PeriodicalIF":5.4,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11159605","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-19 DOI: 10.1109/TETC.2025.3598369
"Breakout Local Search Solution to the Offloading Decision Problem in a Multi-Access Edge Computing Cloud-Enabled Network"
Mina Kato;Tiago Koketsu Rodrigues;Nei Kato
Cloud offloading is an important technique for Internet of Things systems, as it allows devices with limited capabilities to access powerful cloud resources when executing their applications. However, relying solely on the remote cloud is problematic: the long access latency to a distant server makes real-time applications impossible to execute. Multi-access edge computing addresses this by deploying cloud servers near the devices. The issue then becomes how to allocate devices between the remote cloud and multi-access edge computing, based on each device's requirements. In this paper, we propose a Breakout Local Search-based solution that, given our binary integer linear programming model of the offloading problem, finds a near-optimal allocation of devices between the two cloud types. The proposal iterates between exploiting the local optimum found so far and perturbing the current solution to explore more of the search space. A comparison study shows that our proposal outperforms baseline and conventional algorithms, reducing the total service delay of tasks by at least 30 ms.
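Breakout local search alternates greedy descent with perturbations whose strength grows while the search stagnates. A minimal sketch on a toy edge-versus-cloud assignment; the delays, capacity, and 1-flip moves are illustrative stand-ins, not the paper's binary integer linear programming model:

```python
import random

# Toy instance: x[i] = 1 places device i on the edge, 0 on the remote cloud.
delay_cloud = [40, 35, 50, 45, 30]   # per-device service delay on the cloud (ms)
delay_edge = [12, 15, 10, 20, 14]    # per-device service delay on the edge (ms)
EDGE_CAP = 3                         # the edge server hosts at most 3 devices

def objective(x):                    # maximizing this minimizes total delay
    if sum(x) > EDGE_CAP:
        return float("-inf")         # infeasible allocation
    return -sum(delay_edge[i] if b else delay_cloud[i] for i, b in enumerate(x))

def breakout_local_search(x0, iters=100, seed=0):
    rng = random.Random(seed)
    x, cur = x0[:], objective(x0)
    best, best_val = x[:], cur
    strength = 1                     # perturbation strength, grows on stagnation
    for _ in range(iters):
        improved = True
        while improved:              # greedy 1-flip descent to a local optimum
            improved = False
            for i in range(len(x)):
                x[i] ^= 1
                v = objective(x)
                if v > cur:
                    cur, improved = v, True
                else:
                    x[i] ^= 1        # undo non-improving flip
        if cur > best_val:
            best, best_val, strength = x[:], cur, 1
        else:
            strength += 1            # stagnating: "break out" with a bigger jump
        for i in rng.sample(range(len(x)), min(strength, len(x))):
            x[i] ^= 1
        cur = objective(x)
    return best, -best_val

print(breakout_local_search([0, 0, 0, 0, 0]))  # (allocation, total delay in ms)
```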
{"title":"Breakout Local Search Solution to the Offloading Decision Problem in a Multi-Access Edge Computing Cloud-Enabled Network","authors":"Mina Kato;Tiago Koketsu Rodrigues;Nei Kato","doi":"10.1109/TETC.2025.3598369","DOIUrl":"https://doi.org/10.1109/TETC.2025.3598369","url":null,"abstract":"Cloud offloading is an important technique for Internet of Things systems, as it allows devices with limited capabilities to access the powerful resources in the cloud when executing their applications. However, relying solely on the remote cloud is problematic, as the long access time from the far distance to the server makes real-time applications impossible to be executed. Multi-access edge computing addresses this by deploying cloud servers near the devices. The issue then becomes how to allocate devices between either remote cloud and multi-access edge computing, based on the device requirements. In this paper, we propose a Breakout Local Search-based solution that, given our designed binary integer linear programming model of the offloading problem, finds a near-optimal configuration for allocating devices between the two cloud types. The proposal is based on iterating between exploiting the local optimum found so far and perturbation of the current solution to explore more the search space. A comparison study shows that our proposal is better than baseline and conventional algorithms, speeding up the total service delay of tasks by at least 30 ms.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1328-1338"},"PeriodicalIF":5.4,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-29 DOI: 10.1109/TETC.2025.3562336
"Incentive Mechanism Design for Hierarchical Federated Learning With Selfishness Queue Stability"
Zhuo Li;Fangxing Geng
The potential privacy breaches in centralized artificial intelligence model training have raised significant public concern. Hierarchical federated learning, a technology that addresses privacy and network-efficiency issues, coordinates local devices through edge servers for model training and parameter updates, thereby reducing communication with central cloud servers and diminishing the risk of privacy leaks. In this context, however, node selfishness presents a significant challenge: it undermines training efficiency and the quality of local models, and thereby the overall system's performance. This paper addresses the issue by introducing a virtual selfishness queue to characterize dynamic node selfishness, considering both training costs and rewards, and formulating the problem of maximizing model quality while keeping node selfishness bounded. Using Lyapunov optimization, this problem is divided into two subproblems: controlling the quantity of node data and optimizing node associations. To solve these, we propose the Data Quantity Control and Client Association (DCCA) algorithm, based on the Hungarian method. The algorithm is shown to ensure boundedness, stability, and optimality of the system. Experimental results demonstrate that the DCCA algorithm enhances model quality by 8.43% and 13.83% compared with the Fmore and FedAvg algorithms, respectively.
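The abstract names the two building blocks — a virtual selfishness queue driven by Lyapunov drift-plus-penalty, and Hungarian-method client association — but not the exact update rules. A minimal per-slot sketch under assumed quantities; the quality/selfishness matrices, budget, and weight V are all illustrative, and the function is not the paper's DCCA:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def association_step(Q, quality, selfishness, budget=1.0, V=5.0):
    # Drift-plus-penalty: for each client x edge-server pairing, trade off
    # the model-quality gain (penalty term, weight V) against the
    # selfishness the pairing would push into the virtual queue Q.
    score = -V * quality + Q * selfishness          # lower is better
    rows, cols = linear_sum_assignment(score)       # Hungarian association
    added = selfishness[rows, cols].sum()
    Q_next = max(Q + added - budget, 0.0)           # virtual-queue update;
    return list(zip(rows, cols)), Q_next            # stability keeps Q bounded

quality = np.array([[0.9, 0.4, 0.6], [0.5, 0.8, 0.3], [0.2, 0.6, 0.7]])
selfishness = np.array([[0.3, 0.1, 0.2], [0.2, 0.4, 0.1], [0.1, 0.2, 0.3]])
pairs, Q = association_step(Q=0.0, quality=quality, selfishness=selfishness)
print(pairs, round(Q, 2))
```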
{"title":"Incentive Mechanism Design for Hierarchical Federated Learning With Selfishness Queue Stability","authors":"Zhuo Li;Fangxing Geng","doi":"10.1109/TETC.2025.3562336","DOIUrl":"https://doi.org/10.1109/TETC.2025.3562336","url":null,"abstract":"The potential privacy breaches in centralized artificial intelligence model training have raised significant public concern. Hierarchical federated learning, as a technology addressing privacy and network efficiency issues, coordinates local devices using edge servers for model training and parameter updates, thereby reducing communication with central cloud servers and diminishing the risk of privacy leaks. However, in this context, the rise of node selfishness presents a significant challenge, undermining training efficiency and the quality of local models, thereby impacting the overall system’s performance. This paper addresses the issue by introducing a virtual node selfish queue to characterize dynamic selfishness, considering both training costs and rewards, and formulating the problem of maximizing model quality within the bounds of controlled node selfishness. Utilizing Lyapunov optimization, this issue is divided into two subproblems: controlling the quantity of node data and optimizing node associations. To solve these, we propose the Data Quantity Control and Client Association (DCCA) algorithm, based on the Hungarian method. This algorithm is shown to ensure boundedness, stability, and optimality in the system. Experimental results demonstrate that the DCCA algorithm enhances model quality by 8.43% and 13.83% compared to the Fmore and FedAvg algorithms, respectively.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1316-1327"},"PeriodicalIF":5.4,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-10 DOI: 10.1109/TETC.2025.3583872
"GLAMP: Generative Learning for Adversarially-Robust Malware Prediction"
Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian
We propose a novel Generative Malware Defense strategy. When an antivirus company detects a malware sample $m$, it should: (i) generate a set ${Var}(m)$ of several variants of $m$ and then (ii) train its malware classifiers on the usual training set augmented with ${Var}(m)$. We believe this leads to a more proactive defense by making the classifiers more robust to future malware developed by the attacker. We formally define the malware generation problem as a non-traditional optimization problem. Our novel GLAMP (Generative Learning for Adversarially-robust Malware Prediction) framework analyzes the complexity of the malware generation problem and includes, for step (i), novel malware variant generation algorithms that leverage the complexity results. Our experiments show that a sufficiently large percentage of samples generated by GLAMP evade both commercial anti-virus engines and machine learning classifiers, with evasion rates up to 83.81% and 50.54%, respectively. GLAMP also proposes an adversarial training model. Our experiments show that GLAMP generates running malware that can evade 11 white-box classifiers and 4 commercial (i.e., black-box) detectors, and that GLAMP's best adversarial training engine improves recall by 16.1% and the F1 score by 2.4%-5.4%, depending on the test set used.
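GLAMP's actual generators produce running executables, which the abstract does not detail; as a harmless stand-in, the sketch below produces feature-space "variants" of a detected sample by only adding static features, then augments the training set with them. All names are hypothetical and this is not GLAMP's method:

```python
import random

def feature_variants(x, n_variants=5, flip_budget=3, seed=0):
    # x is a binary static-feature vector of a detected malware sample.
    # Each variant sets a few extra features to 1 (e.g., padding bytes,
    # added imports); features are never removed, mimicking
    # functionality-preserving edits. A toy stand-in for Var(m) only.
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        v = x[:]
        for i in rng.sample(range(len(v)), flip_budget):
            v[i] = 1
        variants.append(v)
    return variants

m = [1, 0, 0, 1, 0, 0, 0, 1]
train_X = [m] + feature_variants(m)   # augmented set {m} U Var(m)
train_y = [1] * len(train_X)          # all labeled malicious for retraining
print(len(train_X), train_X[1])
```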
{"title":"GLAMP: Generative Learning for Adversarially-Robust Malware Prediction","authors":"Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian","doi":"10.1109/TETC.2025.3583872","DOIUrl":"https://doi.org/10.1109/TETC.2025.3583872","url":null,"abstract":"We propose a novel <i>Generative Malware Defense</i> strategy. When an antivirus company detects a malware sample <inline-formula><tex-math>$m$</tex-math></inline-formula>, they should: (i) generate a set <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula> of several variants of <inline-formula><tex-math>$m$</tex-math></inline-formula> and then (ii) train their malware classifiers on their usual training set augmented with <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula>. We believe this leads to a more proactive defense by making the classifiers more robust to future malware developed by the attacker. We formally define the malware generation problem as a non-traditional optimization problem. Our novel GLAMP (Generative Learning for Adversarially-robust Malware Prediction) framework analyzes the complexity of the malware generation problem and includes novel malware variant generation algorithms for (i) that leverage the complexity results. Our experiments show that a sufficiently large percentage of samples generated by GLAMP are able to evade both commercial anti-virus and machine learning classifiers with evasion rates up to 83.81% and 50.54%, respectively. GLAMP then proposes an adversarial training model as well. Our experiments show that GLAMP generates running malware that can evade 11 white boxclassifiers and 4 commercial (i.e., black box) detectors. Our experiments show GLAMP’s best adversarial training engine improves the recall by 16.1% and the F1 score by 2.4%-5.4% depending on the test set used.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1299-1315"},"PeriodicalIF":5.4,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-10 DOI: 10.1109/TETC.2025.3584354
"Privacy-Preserving Publicly Verifiable Outsourced Distributed Computation Scheme for Matrix Multiplication"
Qiang Wang;Yiheng Chen;Fucai Zhou;Jian Xu
Publicly verifiable outsourced computation (PVC) lets a data owner outsource computation-intensive tasks to a powerful but untrusted cloud server, while enabling any client to check the integrity of the results at little cost. Matrix multiplication is a fundamental mathematical operation that is widely used in real-world applications. In this paper, we focus on PVC for matrix multiplication (PVC2M) and propose a new primitive called a privacy-preserving publicly verifiable outsourced distributed computation scheme (PPVDC) for matrix multiplication. Unlike existing PVC2M solutions, our scheme offers higher efficiency and reliability: the computation is jointly performed by multiple workers, and in this distributed setting the result can be recovered as long as the number of workers who perform the computation honestly is no less than a threshold. Another technical highlight is enhanced privacy: even if all workers are corrupted and collude, they cannot learn anything about the matrix $M$ outsourced by the data owner or the vector $x$ issued by the client by the end of the protocol. Security analysis demonstrates that the proposed PPVDC scheme meets the desired security requirements under the computational Diffie-Hellman assumption. Detailed performance analysis and experimental evaluation further validate the efficiency of our scheme.
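The abstract does not spell out the construction, but the threshold-recovery property can be illustrated with plain Shamir sharing: the client shares $x$, each worker applies $M$ to its share, and linearity lets any $t$ honest replies be interpolated into $Mx$. Note this toy hides $x$ but not $M$, and omits the paper's public verifiability entirely:

```python
import random

P = 2**61 - 1                         # prime field (illustrative choice)

def share_vector(x, t, n, rng):
    # One degree-(t-1) Shamir polynomial per coordinate of x.
    polys = [[xi] + [rng.randrange(P) for _ in range(t - 1)] for xi in x]
    return [[sum(c * pow(w, k, P) for k, c in enumerate(poly)) % P
             for poly in polys] for w in range(1, n + 1)]   # share for worker w

def interpolate_at_zero(points):
    # Lagrange interpolation over GF(P), evaluated at 0.
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

rng = random.Random(0)
M = [[2, 3], [5, 7]]
x = [11, 13]
t, n = 3, 5
shares = share_vector(x, t, n, rng)
# Each worker applies M to its share; linearity makes the reply a share of M@x.
replies = {w + 1: [sum(M[i][j] * s[j] for j in range(2)) % P for i in range(2)]
           for w, s in enumerate(shares)}
honest = [1, 3, 5]                    # any t honest workers suffice
y = [interpolate_at_zero([(w, replies[w][i]) for w in honest]) for i in range(2)]
print(y)                              # [61, 146] == M @ x
```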
{"title":"Privacy-Preserving Publicly Verifiable Outsourced Distributed Computation Scheme for Matrix Multiplication","authors":"Qiang Wang;Yiheng Chen;Fucai Zhou;Jian Xu","doi":"10.1109/TETC.2025.3584354","DOIUrl":"https://doi.org/10.1109/TETC.2025.3584354","url":null,"abstract":"Publicly verifiable outsourced computation (PVC) facilitates the data owner to outsource some computation-intensive tasks to the powerful but untrusted cloud server, while enabling any client to check the integrity of results with little cost. Matrix multiplication is a fundamental operation in mathematics, which is widely used in many real-world applications. In this paper, we focus on PVC for matrix multiplication (PVC2M) and propose a new primitive called privacy-preserving publicly verifiable outsourced distributed computation scheme (PPVDC) for matrix multiplication. Different from the existing PVC2M solutions, our proposed scheme offers higher efficiency and reliability, where the computation is jointly calculated by multiple workers. In such a distributed setting, the computation result can be recovered if the number of workers who perform the computation honestly is no less than threshold. Besides, another technical highlight is to enhance privacy. Even though all workers are corrupted and may collude, they are unable to obtain any knowledge about the matrix <inline-formula><tex-math>$M$</tex-math></inline-formula> outsourced by the data owner and the vector <inline-formula><tex-math>$x$</tex-math></inline-formula> issued by the client at the end of the protocol. Security analysis demonstrates that our proposed PPVDC scheme can meet the desired security requirements under the computational Diffie-Hellman assumption. The detailed performance analysis and experimental evaluation further validate the efficiency of our scheme.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1285-1298"},"PeriodicalIF":5.4,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-01 DOI: 10.1109/TETC.2025.3583224
"Path Integral Quantum Annealing Optimizations Validated on 0-1 Multidimensional Knapsack Problem"
Evelina Forno;Riccardo Pignari;Vittorio Fra;Enrico Macii;Gianvito Urgese
Quantum Annealing (QA) is a metaheuristic designed to enhance Simulated Annealing by leveraging concepts from quantum mechanics, improving parallelization on classical computers. Studies have shown promising results for this technique on NP-hard problems and constrained optimization. In this article, we examine Path Integral Quantum Annealing (PIQA), a well-known technique for simulating QA on conventional computers. We then propose optimizations to the algorithm, offering hardware and software developers a suite of parallelization techniques evaluated for their effectiveness in enhancing quality and speed. The proposed approach encompasses four distinct degrees of optimization, leveraging techniques based on multiple-trial parallelism and a novel pre-optimization method. The article further proposes a methodology for handling multiple instances within the search space, whereby problem data is replicated into slices and allocated to concurrent processes during the simulation. Through empirical trials, we evaluate the impact of our optimization techniques on the convergence speed of the algorithm compared to unoptimized PIQA, using the Multidimensional Knapsack Problem as a benchmark. Our findings show that these optimizations, applied individually or collectively, enable the algorithm to achieve equal or superior results with fewer simulation steps. Overall, the results highlight the potential for future implementations of optimized PIQA on dedicated hardware.
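For reference, an unoptimized PIQA baseline fits in a few lines: $P$ coupled Trotter slices of the spin system, a transverse field $\Gamma$ annealed toward zero, and the standard slice coupling $J_{\perp} = -\frac{PT}{2}\ln\tanh\frac{\Gamma}{PT}$. Every parameter below is an illustrative choice, not taken from the paper:

```python
import math
import random

def piqa(J, h, P=8, T=0.05, gamma0=3.0, sweeps=300, seed=0):
    # P replicas ("Trotter slices") of the spin vector, coupled in a ring.
    rng = random.Random(seed)
    n = len(h)
    s = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(P)]
    for step in range(sweeps):
        gamma = gamma0 * (1 - step / sweeps) + 1e-12   # anneal the field
        j_perp = -0.5 * P * T * math.log(math.tanh(gamma / (P * T)))
        for k in range(P):
            for i in range(n):
                # Local field: intra-slice couplings plus neighbouring slices.
                field = h[i] + sum(J[i][j] * s[k][j] for j in range(n) if j != i)
                field += j_perp * (s[(k - 1) % P][i] + s[(k + 1) % P][i])
                dE = 2 * s[k][i] * field
                if dE <= 0 or rng.random() < math.exp(-dE / (P * T)):
                    s[k][i] = -s[k][i]                 # Metropolis flip
    def energy(z):
        return (-sum(J[i][j] * z[i] * z[j]
                     for i in range(n) for j in range(i + 1, n))
                - sum(h[i] * z[i] for i in range(n)))
    return min(s, key=energy)          # best slice found

# Tiny ferromagnetic chain: the ground state is all spins aligned at +1.
n = 6
J = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    J[i][i + 1] = J[i + 1][i] = 1.0
h = [0.1] * n
print(piqa(J, h))   # expect [1, 1, 1, 1, 1, 1]
```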
{"title":"Path Integral Quantum Annealing Optimizations Validated on 0-1 Multidimensional Knapsack Problem","authors":"Evelina Forno;Riccardo Pignari;Vittorio Fra;Enrico Macii;Gianvito Urgese","doi":"10.1109/TETC.2025.3583224","DOIUrl":"https://doi.org/10.1109/TETC.2025.3583224","url":null,"abstract":"Quantum Annealing (QA) is a metaheuristic designed to enhance Simulated Annealing by leveraging concepts from quantum mechanics, improving parallelization on classical computers. Studies have shown promising results for this technique in the field of NP-hard problems and constrained optimization. In this article, we examine Path Integral Quantum Annealing (PIQA), a well-known technique for simulating QA on conventional computers. We then propose optimizations to the algorithm, offering hardware software developers a suite of parallelization techniques evaluated for their effectiveness in enhancing quality and speed. The proposed approach encompasses four distinct degrees of optimization, leveraging techniques based on multiple-trial parallelism and a novel pre-optimization method. The article further proposes a methodology for handling multiple instances within the search space, whereby problem data is replicated into slices and allocated to concurrent processes during the simulation. Through empirical trials, we evaluate the impact of our optimization techniques on the convergence speed of the algorithm compared to unoptimized PIQA, using the Multidimensional Knapsack Problem as a benchmark. Our findings show that these optimizations, applied individually or collectively, enable the algorithm to achieve equal or superior results with fewer simulation steps. Overall, the results highlight the potential for future implementations of optimized PIQA on dedicated hardware.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1272-1284"},"PeriodicalIF":5.4,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-06-30 DOI: 10.1109/TETC.2025.3582551
"Improved Modular Multiplication Algorithms Using Solely IEEE 754 Binary Floating-Point Operations"
Yukimasa Sugizaki;Daisuke Takahashi
In this paper, we propose three modular multiplication algorithms that use only IEEE 754 binary floating-point operations. Several previous studies have used floating-point operations to perform modular multiplication; however, they considered only positive integers and did not utilize the dedicated sign bit in the floating-point representation. Our first algorithm is an extension of these studies, which are based on Shoup multiplication. By allowing operands to be negative, we increase the maximum supported modulus size by a factor of approximately 1.21. Our remaining two algorithms are based on Montgomery multiplication for positive and signed integers, respectively. Although these algorithms require more round-to-integral operations, they support moduli up to twice as large as those for Shoup multiplication on positive integers. For processors with relatively low round-to-integral performance, we propose versions of the three algorithms without the round-to-integral operation. Evaluations on four CPUs with different levels of instruction performance show that floating-point-based algorithms, including the proposed ones, are viable alternatives to integer-based algorithms for mid-sized moduli, especially on processors where floating-point operations are faster.
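The common core of such algorithms — estimate the quotient with a precomputed floating-point reciprocal, then correct the remainder once — can be sketched in doubles. This toy is capped at p < 2^26 so all products stay exactly representable in one double, whereas the paper's Shoup- and Montgomery-based algorithms support far larger moduli:

```python
import math
import random

def fp_modmul(a, b, p, pinv):
    # All operands are doubles; with p < 2**26, both a*b and q*p are
    # integers below 2**52, hence exactly representable, so the
    # remainder r is computed without rounding error.
    q = math.floor(a * b * pinv)   # quotient estimate (may be off by one)
    r = a * b - q * p
    if r >= p:
        r -= p                     # a single conditional correction suffices
    elif r < 0.0:
        r += p
    return r

p = float(40_000_003)              # modulus below 2**26
pinv = 1.0 / p                     # precomputed once per modulus
rng = random.Random(0)
for _ in range(5):
    a, b = float(rng.randrange(int(p))), float(rng.randrange(int(p)))
    assert fp_modmul(a, b, p, pinv) == (int(a) * int(b)) % int(p)
print("ok")
```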
{"title":"Improved Modular Multiplication Algorithms Using Solely IEEE 754 Binary Floating-Point Operations","authors":"Yukimasa Sugizaki;Daisuke Takahashi","doi":"10.1109/TETC.2025.3582551","DOIUrl":"https://doi.org/10.1109/TETC.2025.3582551","url":null,"abstract":"In this paper, we propose three modular multiplication algorithms that use only the IEEE 754 binary floating-point operations. Several previous studies have used floating-point operations to perform modular multiplication. However, they considered only positive integers and did not utilize the dedicated sign bit in the floating-point representation. Our first algorithm is an extension of these studies, which are based on Shoup multiplication. By allowing operands to be negative, we increased the maximum supported modulus size by approximately 1.21 times. Our remaining two algorithms are based on Montgomery multiplication for positive and signed integers, respectively. Although these algorithms require more round-to-integral operations, they support a modulus size of up to twice as large as that for Shoup multiplication for positive integers. For processors with relatively low round-to-integral performance, we propose versions of the three algorithms without the round-to-integral operation. Evaluations on four CPUs with different levels of instruction performance show that floating-point-based algorithms, including the proposed algorithms, can be regarded as alternatives to integer-based algorithms for mid-sized moduli, especially when floating-point operations are faster on the processors.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1259-1271"},"PeriodicalIF":5.4,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-06-20 DOI: 10.1109/TETC.2025.3575787
"LSTable: A New White-Box Cipher for Embedded Devices in IoT Against Side-Channel Attacks"
Yang Shi;Yimin Li;Qiaoliang Ouyang;Jiayao Gao;Shengjie Zhao
Embedded devices such as sensors and surveillance cameras play a critical role in the Internet of Things (IoT). However, their unattended operation and wireless connectivity expose them to a high risk of side-channel attacks. These attacks exploit information leakage through side channels to deduce secret keys or even extract implementations of cryptographic algorithms, empowering attackers to decrypt sensitive information transmitted among IoT devices and posing a significant threat to data confidentiality. To address this issue, we propose LSTable, a new white-box cipher inspired by LS-designs. Instead of directly using secret keys for encryption and decryption, LSTable transforms secret keys into key-dependent lookup tables to mitigate side-channel attacks, and the size of these tables is designed to fit the hardware constraints of embedded devices. Security analysis shows that LSTable is secure in both the black-box and white-box models. Furthermore, experimental evaluations on different devices show that even the slowest instances of LSTable run 2.2 to 14.8 times faster than existing space-hard white-box ciphers with IoT-friendly table sizes, while consuming only around 1/13 to 1/3 of the energy.
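The core white-box move — shipping a key-dependent table instead of the key — takes only a few lines. The affine toy S-box and single-byte table below are illustrative only; LSTable's real tables are larger and structured for space hardness:

```python
import secrets

# Toy 8-bit S-box: an affine permutation of 0..255 (cryptographically
# weak; it stands in for a real cipher's S-box purely for illustration).
SBOX = bytes((7 * i + 3) % 256 for i in range(256))

def make_round_table(k):
    # Fold the round key into the S-box: T[x] = SBOX[x XOR k].
    # The device ships T and never stores k, so a memory dump or a
    # simple side channel reveals table entries rather than the key.
    return bytes(SBOX[x ^ k] for x in range(256))

k = secrets.randbelow(256)
T = make_round_table(k)
x = 0x5A
assert T[x] == SBOX[x ^ k]     # same result, no key material at runtime
print(hex(T[x]))
```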
{"title":"LSTable: A New White-Box Cipher for Embedded Devices in IoT Against Side-Channel Attacks","authors":"Yang Shi;Yimin Li;Qiaoliang Ouyang;Jiayao Gao;Shengjie Zhao","doi":"10.1109/TETC.2025.3575787","DOIUrl":"https://doi.org/10.1109/TETC.2025.3575787","url":null,"abstract":"Embedded devices such as sensors and surveillance cameras play a critical role in the Internet of Things (IoT). However, their unattended and wireless features expose them to a high risk of side-channel attacks. These attacks exploit information leakage through side channels to deduce secret keys or even extract implementations of cryptographic algorithms. The possession of such knowledge empowers attackers to decrypt sensitive information transmitted among IoT devices, posing a significant threat to data confidentiality. To address this issue, we propose LSTable, a new white-box cipher enlightened by LS-Design. Instead of directly using secret keys for encryption and decryption, LSTable transforms secret keys into key-dependent lookup tables to mitigate side-channel attacks, and the size of these tables is designed to fit the hardware constraints of embedded devices. The security analysis of LSTable shows its security in both the black-box and white-box models. Furthermore, experimental evaluations on different devices exhibit that even the efficiency of the slowest instances of LSTable is 2.2 to 14.8 times that of existing space-hard white-box ciphers with IoT-friendly table sizes, while the energy consumption is only around 1/13 to 1/3.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1242-1258"},"PeriodicalIF":5.4,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-06-19 DOI: 10.1109/TETC.2025.3572317
{"title":"IEEE Transactions on Emerging Topics in Computing Publication Information","authors":"","doi":"10.1109/TETC.2025.3572317","DOIUrl":"https://doi.org/10.1109/TETC.2025.3572317","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"C2-C2"},"PeriodicalIF":5.1,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11045261","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144323052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}