Pub Date : 2025-04-25 DOI: 10.1109/TETC.2025.3562620
Asmaa Abbas;Mohamed Medhat Gaber;Mohammed M. Abdelsamea
Curriculum learning strategies have proven effective in a variety of applications and have attracted significant interest in machine learning, as they can improve a model's final performance and accelerate training. In the medical imaging domain, however, data irregularities can make recognition tasks more challenging and often lead to misclassification between the classes of a dataset. Class-decomposition approaches have shown promising results on this problem by learning the boundaries within the classes of the dataset. In this paper, we present a novel convolutional neural network (CNN) training method, called CLOG-CD, that combines a curriculum learning strategy with class decomposition to improve the performance of medical image classification. We evaluated our method on four imbalanced medical image datasets: chest X-ray (CXR), brain tumour, digital knee X-ray, and histopathology colorectal cancer (CRC). CLOG-CD utilises the weights learnt at each decomposition granularity of the classes, and training proceeds from descending to ascending order of granularity (i.e., an anti-curriculum technique). We also investigated the classification performance of the proposed method under different acceleration factors and pace-function curricula. We used two pre-trained networks, ResNet-50 and DenseNet-121, as the backbone for CLOG-CD. With ResNet-50, CLOG-CD improved classification performance over other training strategies, achieving an accuracy of 96.08% on the CXR dataset, 96.91% on the brain tumour dataset, 79.76% on the digital knee X-ray dataset, and 99.17% on the CRC dataset. With DenseNet-121, CLOG-CD achieved 94.86%, 94.63%, 76.19%, and 99.45% on the CXR, brain tumour, digital knee X-ray, and CRC datasets, respectively.
{"title":"CLOG-CD: Curriculum Learning Based on Oscillating Granularity of Class Decomposed Medical Image Classification","authors":"Asmaa Abbas;Mohamed Medhat Gaber;Mohammed M. Abdelsamea","doi":"10.1109/TETC.2025.3562620","DOIUrl":"https://doi.org/10.1109/TETC.2025.3562620","url":null,"abstract":"Curriculum learning strategies have been proven to be effective in various applications and have gained significant interest in the field of machine learning. It has the ability to improve the final model’s performance and accelerate the training process. However, in the medical imaging domain, data irregularities can make the recognition task more challenging and usually result in misclassification between the different classes in the dataset. Class-decomposition approaches have shown promising results in solving such a problem by learning the boundaries within the classes of the data set. In this paper, we present a novel convolutional neural network (CNN) training method based on the curriculum learning strategy and the class decomposition approach, which we call <italic>CLOG-CD</i>, to improve the performance of medical image classification. We evaluated our method on four different imbalanced medical image datasets, such as Chest X-ray (CXR), brain tumour, digital knee x-ray, and histopathology colorectal cancer (CRC). <italic>CLOG-CD</i> utilises the learnt weights from the decomposition granularity of the classes, and the training is accomplished from descending to ascending order (i.e. anti-curriculum technique). We also investigated the classification performance of our proposed method based on different acceleration factors and pace function curricula. We used two pre-trained networks, ResNet-50 and DenseNet-121, as the backbone for <italic>CLOG-CD</i>. The results with ResNet-50 show that <italic>CLOG-CD</i> has the ability to improve classification performance with an accuracy of 96.08% for the CXR dataset, 96.91% for the brain tumour dataset, 79.76% for the digital knee x-ray, and 99.17% for the CRC dataset, compared to other training strategies. In addition, with DenseNet-121, <italic>CLOG-CD</i> has achieved 94.86%, 94.63%, 76.19%, and 99.45% for CXR, brain tumour, digital knee x-ray, and CRC datasets, respectively.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1043-1054"},"PeriodicalIF":5.4,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-04-24 DOI: 10.1109/TETC.2025.3562192
Houming Qiu;Kun Zhu;Nguyen Cong Luong;Dusit Niyato
In large-scale distributed machine learning systems, coded computing has attracted widespread attention because it can effectively alleviate the impact of stragglers. However, several emerging problems greatly limit the performance of coded distributed systems. First, colluding workers who share their results with each other cause serious privacy leakage. Second, few existing works consider the security of data transmission in distributed computing or coded distributed machine learning systems. Third, the number of results that must be waited for grows with the degree of the decoding functions. In this article, we design a secure and private approximated coded distributed computing (SPACDC) scheme that addresses all of these problems simultaneously. Our SPACDC scheme guarantees data security during transmission using a new encryption algorithm based on elliptic curve cryptography. Notably, the SPACDC scheme does not impose strict constraints on the minimum number of results that must be waited for. An extensive performance analysis demonstrates the effectiveness of the SPACDC scheme. Furthermore, we present a secure and private distributed learning algorithm based on SPACDC that provides information-theoretic privacy protection for training data. Our experiments show that the SPACDC-based deep learning algorithm achieves a significant speedup over baseline approaches.
Title: Approximated Coded Computing: Towards Fast, Private and Secure Distributed Machine Learning. IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 1030-1042.
Pub Date : 2025-04-24 DOI: 10.1109/TETC.2025.3562050
Qi Sun;Yang Lu;Yinxia Sun;Jiguo Li
As a new type of digital signature, a sanitizable signature enables a semi-trusted entity to alter a signed document and re-create a signature for the altered document in the name of the original signer. This approach offers an effective way to sanitize sensitive information in signed documents while preserving the authenticity of the sanitized documents. Most current sanitizable signature schemes suffer either from complex certificate management or from the key escrow limitation. Recently, two certificateless sanitizable signature schemes were proposed to address these issues. However, both rely on costly bilinear pairings, which incur high computation costs for signature creation, sanitization, and verification. In this work, we design a pairing-free certificateless sanitizable signature scheme with a designated verifier. The proposed scheme performs signature verification through a designated verifier, thereby preventing malicious propagation and illegal abuse of signatures. By eliminating pairing operations, the scheme offers substantial improvements in computational efficiency. Security proofs demonstrate that it satisfies existential unforgeability and immutability against adaptive chosen-message attacks. In addition, simulation experiments indicate that our approach reduces the computation costs of signature generation, sanitization, and verification by approximately 88.15%/88.48%, 99.98%/99.01%, and 71.22%/78.64%, respectively, compared with the two most recent certificateless sanitizable signature schemes.
{"title":"Certificateless Sanitizable Signature With Designated Verifier","authors":"Qi Sun;Yang Lu;Yinxia Sun;Jiguo Li","doi":"10.1109/TETC.2025.3562050","DOIUrl":"https://doi.org/10.1109/TETC.2025.3562050","url":null,"abstract":"As a new type of digital signature, sanitizable signature enables a semi-trusted entity to alter a signed document and re-create a signature of the altered document in the name of original signer. This approach offers an effective solution to sanitize sensitive information in signed documents while ensuring the authenticity of sanitized documents. Most of current sanitizable signature schemes have the complex certificate management issue or the key escrow limitation. Recently, two certificateless sanitizable signature schemes have been proposed to address the above issues. However, they both rely on costly bilinear pairings, which incur high computation costs to create signature, make sanitization and perform verification. In the work, we design a pairing-free certificateless sanitizable signature scheme with a designated verifier. The proposed scheme achieves signature verification through a designated verifier, thereby preventing malicious propagation and illegal abuse of signatures. By eliminating the need for pairing operations, the scheme offers substantial improvements in computational efficiency. Security proofs demonstrate that it satisfies existential unforgeability and immutability against adaptive chosen message attacks. In addition, simulation experiments indicate that our approach reduces the computation costs of signature generation, sanitization, and verification by approximately 88.15%/88.48%, 99.98%/99.01%, and 71.22%/78.64%, respectively, when compared to the most recent two certificateless sanitizable signature schemes.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1019-1029"},"PeriodicalIF":5.4,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-04-23 DOI: 10.1109/TETC.2025.3562136
Jack Cai;Mostafa Rahimi Azghadi;Roman Genov;Amirali Amirsoleimani
Encryption on large-scale memristor crossbars is challenging due to spatial and temporal signal fluctuations arising from numerous non-idealities. To address this, we utilize Hyperlock, a memristive vector-matrix multiplication accelerator that employs hyperdimensional computing for encryption. We demonstrate that the stochasticity generated on TiOx memristor crossbars with a passive 0T1R arrangement can be decrypted given appropriate training of a neural network. We present HyperXArray, an architecture for Hyperlock's encryption scheme that is capable of weight regeneration and of analog/digital encryption without high-resolution analog-to-digital converters (ADCs) and digital-to-analog converters (DACs). We demonstrate 100% decryption accuracy for digital encryption and show that HyperXArray can encrypt during analog-to-digital conversion, reducing ADC power consumption by 50×. For digital encryption, we show that HyperXArray reduces energy consumption by up to 10× and footprint by 10-100× compared with field-programmable gate array (FPGA) implementations of the Advanced Encryption Standard (AES), while maintaining the same throughput. Overall, HyperXArray fills the niche for lightweight, noise-resilient encryption at the edge, with only a 0.1 mm² footprint and 60 pJ/bit energy efficiency.
{"title":"HyperXArray: Low-Power and Compact Memristive Architecture for In-Memory Encryption on Edge","authors":"Jack Cai;Mostafa Rahimi Azghadi;Roman Genov;Amirali Amirsoleimani","doi":"10.1109/TETC.2025.3562136","DOIUrl":"https://doi.org/10.1109/TETC.2025.3562136","url":null,"abstract":"Encryption on large-scale memristor crossbars proves to be challenging due to the spatial and temporal fluctuations of the signals coming from numerous non-idealities. To address this, we utilize Hyperlock, a memristive vector-matrix multiplication accelerator employing hyperdimensional computing for encryption. We demonstrate that stochasticity generated on TiOx memristor crossbars with passive 0T1R arrangement can be decryptable under the appropriate training of a neural network. We present HyperXArray, an architecture for Hyperlock's encryption scheme, that is capable of weight regeneration, and analog/digital encryption without the need for high-resolution Analog-to-Digital Converters (ADCs) and Digital-to-Analog Converters (DACs). We demonstrate 100% decryption accuracy for digital encryption and show that HyperXArray is capable of encryption during analog to digital conversion that reduces the power consumption of ADC by <inline-formula><tex-math>$50times$</tex-math></inline-formula>. In digital encryption, we show that HyperXArray reduces energy consumption by up to <inline-formula><tex-math>$10times$</tex-math></inline-formula> and footprint by <inline-formula><tex-math>$10-100times$</tex-math></inline-formula> compared to Field Programmable Gate Array (FPGA) implementations of Advanced Encryption Standard (AES), while maintaining the same level of throughput. Overall, HyperXArray demonstrates its capability to fill the niche for lightweight, noise-resilient encryption on edge with only <inline-formula><tex-math>$ 0.1{text{ mm}}^{2}$</tex-math></inline-formula> footprint and <inline-formula><tex-math>$60 {text{ pJ/bit}}$</tex-math></inline-formula> energy efficiency.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1410-1423"},"PeriodicalIF":5.4,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-04-11 DOI: 10.1109/TETC.2025.3555869
Michele Caon;Clément Choné;Pasquale Davide Schiavone;Alexandre Levisse;Guido Masera;Maurizio Martina;David Atienza
The widespread adoption of data-centric algorithms, particularly artificial intelligence (AI) and machine learning (ML), has exposed the limitations of centralized processing infrastructures, driving a shift towards edge computing. This shift imposes stringent energy-efficiency constraints that traditional von Neumann architectures struggle to meet. The compute-in-memory (CIM) paradigm has emerged as a strong candidate thanks to its efficient exploitation of the available memory bandwidth. However, existing CIM solutions require high implementation effort and lack flexibility from a software-integration standpoint. This work proposes a novel, software-friendly, general-purpose, and low-integration-effort near-memory computing (NMC) approach, paving the way for the adoption of CIM-based systems in the next generation of edge computing nodes. Two architectural variants, NM-Caesar and NM-Carus, are proposed and characterized to target different trade-offs in area efficiency, performance, and flexibility, covering a wide range of embedded microcontrollers. Post-layout simulations show up to 28.0× and 53.9× lower execution time and 25.0× and 35.6× higher energy efficiency at the system level, respectively, compared with executing the same tasks on a state-of-the-art RISC-V CPU (RV32IMC). NM-Carus achieves a peak energy efficiency of 306.7 GOPS/W in 8-bit matrix multiplications, surpassing recent state-of-the-art in- and near-memory circuits.
{"title":"Scalable and RISC-V Programmable Near-Memory Computing Architectures for Edge Nodes","authors":"Michele Caon;Clément Choné;Pasquale Davide Schiavone;Alexandre Levisse;Guido Masera;Maurizio Martina;David Atienza","doi":"10.1109/TETC.2025.3555869","DOIUrl":"https://doi.org/10.1109/TETC.2025.3555869","url":null,"abstract":"The widespread adoption of data-centric algorithms, particularly artificial intelligence (AI) and machine learning (ML), has exposed the limitations of centralized processing infrastructures, driving a shift towards edge computing. This necessitates stringent constraints on energy efficiency, which traditional von Neumann architectures struggle to meet. The compute-in-memory (CIM) paradigm has emerged as a better candidate due to its efficient exploitation of the available memory bandwidth. However, existing CIM solutions require a high implementation effort and lack flexibility from a software integration standpoint. This work proposes a novel, software-friendly, general-purpose, and low-integration-effort near-memory computing (NMC) approach, paving the way for the adoption of CIM-based systems in the next generation of edge computing nodes. Two architectural variants, NM-Caesar and NM-Carus, are proposed and characterized to target different trade-offs in area efficiency, performance, and flexibility, covering a wide range of embedded microcontrollers. Post-layout simulations show up to 28.0 × and 53.9 × lower execution time and 25.0 × and 35.6 × higher energy efficiency at system level, respectively, compared to the execution of the same tasks on a state-of-the-art RISC-V CPU (RV32IMC). NM-Carus achieves a peak energy efficiency of 306.7 GOPS/W in 8-bit matrix multiplications, surpassing recent state-of-the-art in- and near-memory circuits.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1003-1018"},"PeriodicalIF":5.4,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-04-01 DOI: 10.1109/TETC.2025.3546244
Yi-Cheng Chen;Mu-Ping Chang;Wang-Chien Lee
Recent research has focused on integrating smart manufacturing with deep learning, owing to the widespread application of neural computation. In deep learning, how to construct the architecture of a neural network is a critical issue. For defect prediction and detection in particular, a proper neural architecture can effectively extract features from the given manufacturing data to accomplish the targeted task. In this paper, we introduce a Virtual Space concept that effectively shrinks the search space of potential neural network structures, with the aim of reducing the computational complexity of learning and accuracy derivation. In addition, a novel reinforcement learning model, Virtual Proximal Policy Optimization (Virtu-PPO), is developed to efficiently and effectively discover the optimal neural network structure. We also propose an optimization strategy to enhance the neural-architecture search process for defect prediction. Finally, the proposed model is applied to several real-world manufacturing datasets to demonstrate its performance and practicality for defect prediction.
{"title":"Virtual Reinforcement Learning for Defect Prediction in Smart Manufacturing","authors":"Yi-Cheng Chen;Mu-Ping Chang;Wang-Chien Lee","doi":"10.1109/TETC.2025.3546244","DOIUrl":"https://doi.org/10.1109/TETC.2025.3546244","url":null,"abstract":"Recent research has focused on the integration of smart manufacturing and deep learning owing to the widespread application of neural computation. For deep learning, how to construct the architecture of a neural network is a critical issue. Especially on defect prediction or detection, a proper neural architecture could effectively extract features from the given manufacturing data to accomplish the targeted task. In this paper, we introduce a <italic>Virtual Space</i> concept to effectively shrink the search space of potential neural network structures, with the aim of downgrading the computation complexity for learning and accuracy derivation. In addition, a novel reinforcement learning model, namely, <italic>Virtual Proximal Policy Optimization (Virtu-PPO)</i>, is developed to efficiently and effectively discover the optimal neural network structure. We also propose an optimization strategy to enhance the searching process of neural architecture for defect prediction. In addition, the proposed model is applied on several real-world manufacturing datasets to show the performance and practicability of defect prediction.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"990-1002"},"PeriodicalIF":5.4,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-30 DOI: 10.1109/TETC.2025.3572935
Seungwoo Choi;Enhyeok Jang;Youngmin Kim;Sungwoo Ahn;Won Woo Ro
Quantum circuit simulation plays a critical role in the current era of quantum computing. However, it suffers from huge memory requirements that scale exponentially with the number of qubits. Our observation reveals that the conventional complex-number representation using real and imaginary values adds memory overhead beyond the intrinsic cost of simulating quantum states. Using the radius and phase of a complex number instead better reflects the properties of the complex values used in quantum circuit simulation, providing better memory efficiency. This paper proposes q-Point, a compact numeric format for quantum circuit simulation that stores complex numbers in polar form rather than rectangular form. The proposed q-Point format consists of three fields: i) exponent bits for the radius value, ii) mantissa bits for the radius value, and iii) mantissa bits for the phase value. However, a naive application of the q-Point format can harm both simulation accuracy and simulation speed. To preserve accuracy with fewer bits, we use a multi-level encoding scheme that employs different mantissa bits depending on the exponent range. Additionally, to prevent a possible slowdown due to addition on polar-form complex numbers, we use a technique that adaptively applies both polar and rectangular forms. Equipped with these optimizations, the proposed q-Point format achieves reasonable simulation accuracy while using only half the memory of the baseline format. It also enables simulation that is on average 1.37× and 1.16× faster for QAOA and VQE benchmark circuits, respectively.
{"title":"q-Point: A Numeric Format for Quantum Circuit Simulation Using Polar Form Complex Numbers","authors":"Seungwoo Choi;Enhyeok Jang;Youngmin Kim;Sungwoo Ahn;Won Woo Ro","doi":"10.1109/TETC.2025.3572935","DOIUrl":"https://doi.org/10.1109/TETC.2025.3572935","url":null,"abstract":"Quantum circuit simulation is playing a critical role in the current era of quantum computing. However, quantum circuit simulation suffers from huge memory requirements that scale exponentially according to the number of qubits. Our observation reveals that the conventional complex number representation using real and imaginary values adds to the memory overhead beyond the intrinsic cost of simulating quantum states. Instead, using the radius and phase value of a complex number better reflects the properties of the complex values used in the quantum circuit simulation providing better memory efficiency. This paper proposes q-Point, a compact numeric format for quantum circuit simulation that utilizes polar form representation instead of rectangular form representation to store complex numbers. The proposed q-Point format consists of three fields: i) exponent bits for radius value ii) mantissa bits for radius value iii) mantissa bits for phase value. However, a naive application of the q-Point format has the potential to cause issues with both simulation accuracy and simulation speed. To preserve simulation accuracy with fewer bits, we use a multi-level encoding scheme that employs different mantissa bits depending on the exponent range. Additionally, to prevent possible slowdown due to the add operation in polar form complex numbers, we use a technique that adaptively applies both polar and rectangular forms. Equipped with these optimizations, the proposed q-Point format demonstrates reasonable simulation accuracy while using only half of the memory requirement using the baseline format. Additionally, the q-Point format enables an average of 1.37× and 1.16× faster simulation for QAOA and VQE benchmark circuits.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1142-1155"},"PeriodicalIF":5.4,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-28 DOI: 10.1109/TETC.2025.3572396
Chaoquan Cai;Dan Lin;Kannappan Palaniappan;Chris Clifton
In today's digital age, the proliferation of social networks and advanced camera technology has led to countless images being shared on online social platforms daily, potentially resulting in significant breaches of personal privacy. In recent years, many methods have been proposed to protect image privacy, allowing users to be notified of potential privacy leaks before their photos are published. However, most existing research addresses only the privacy of image owners or co-owners, neglecting the privacy of people who appear in the background of others' images or who co-occur with others in the same image. In this paper, we propose a system capable of real-time access control that protects the privacy of every individual appearing in a photo, including people who co-occur in the same image. Specifically, we first detect all faces in the image, then use a facial recognition algorithm to identify the corresponding users' privacy policies, and finally determine whether the image violates any user's privacy policy. To provide real-time access control, we designed a facial-attribute index tree to speed up user identification. Experimental results show that, compared with the method without our proposed index tree, our approach improves time efficiency by almost two orders of magnitude while maintaining an accuracy of more than 97%.
{"title":"Real-Time Access Control for Background and Co-Occurrence Image Privacy Protection","authors":"Chaoquan Cai;Dan Lin;Kannappan Palaniappan;Chris Clifton","doi":"10.1109/TETC.2025.3572396","DOIUrl":"https://doi.org/10.1109/TETC.2025.3572396","url":null,"abstract":"In today’s digital age, the proliferation of social networks and advanced camera technology has led to countless images being shared on online social platforms daily, potentially resulting in significant breaches of personal privacy. In recent years, many methods have been proposed to protect image privacy, allowing users to be notified of potential privacy leaks before publishing their photos. However, most existing research primarily addresses the privacy protection of image owners or co-owners, while neglecting the privacy of people who appear in the background of others’ images or who are co-occurring with others in the same image. In this paper, we propose a system capable of conducting real-time access control for protecting privacy of every individual appearing in a photo, as well as the privacy of people who co-occur in the same image. Specifically, we first detect all the faces in the image, then use a facial recognition algorithm to identify the corresponding users’ privacy policies, and finally determine whether the image violates any user’s privacy policy. In order to provide real-time access control, we have designed a facial attribute index tree to speed up the process of user identification. The experimental results show that compared with the method without our proposed index tree, our approach improves the time efficiency by almost two orders of magnitude while maintaining the accuracy of more than 97%.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1130-1141"},"PeriodicalIF":5.4,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-28 DOI: 10.1109/TETC.2025.3572277
Hossein Hosseini;Mohsen Ansari;Jörg Henkel
Task replication is a common technique for achieving fault tolerance. However, its effectiveness is limited by the accuracy of the fault detection mechanism; imperfect detection imposes a ceiling on achievable reliability. While perfect fault detection mechanisms offer higher reliability, they introduce significant overhead. To address this, we introduce Dynamic Task Replication, a fault tolerance technique that dynamically determines the number of replicas at runtime to overcome the limitations of imperfect fault detection. Our primary contribution, Reliability-Aware Replica-Efficient Dynamic Task Replication, optimizes this approach by minimizing the expected number of replicas while achieving the desired reliability target. We incorporate actual execution times into the reliability assessment. Additionally, we propose the Energy-Aware Reliability-Guaranteeing scheduling technique, which integrates our optimized replication method into hard real-time systems and leverages Dynamic Voltage and Frequency Scaling to minimize energy consumption while ensuring reliability and system schedulability. Experimental results demonstrate that our method requires 24% fewer replicas on average than the N-Modular Redundancy technique, with the advantage increasing to 58% for tasks with low base reliabilities. Furthermore, our scheduling technique significantly conserves energy and enhances feasibility compared to existing methods across diverse system workloads.
{"title":"Dynamic Task Replication With Imperfect Fault Detection in Multicore Cyber-Physical Systems","authors":"Hossein Hosseini;Mohsen Ansari;Jörg Henkel","doi":"10.1109/TETC.2025.3572277","DOIUrl":"https://doi.org/10.1109/TETC.2025.3572277","url":null,"abstract":"Task replication is a common technique for achieving fault tolerance. However, its effectiveness is limited by the accuracy of the fault detection mechanism; imperfect detection imposes a ceiling on achievable reliability. While perfect fault detection mechanisms offer higher reliability, they introduce significant overhead. To address this, we introduce Dynamic Task Replication, a fault tolerance technique that dynamically determines the number of replicas at runtime to overcome the limitations of imperfect fault detection. Our primary contribution, Reliability-Aware Replica-Efficient Dynamic Task Replication, optimizes this approach by minimizing the expected number of replicas while achieving the desired reliability target. We incorporate actual execution times into the reliability assessment. Additionally, we propose the Energy-Aware Reliability-Guaranteeing scheduling technique, which integrates our optimized replication method into hard real-time systems and leverages Dynamic Voltage and Frequency Scaling to minimize energy consumption while ensuring reliability and system schedulability. Experimental results demonstrate that our method requires 24% fewer replicas on average than the N-Modular Redundancy technique, with the advantage increasing to 58% for tasks with low base reliabilities. Furthermore, our scheduling technique significantly conserves energy and enhances feasibility compared to existing methods across diverse system workloads.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1113-1129"},"PeriodicalIF":5.4,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-27 DOI: 10.1109/TETC.2025.3552941
Yijun Cui;Jiatong Tian;Chuanchao Lu;Yang Li;Ziying Ni;Chenghua Wang;Weiqiang Liu
Lattice-based cryptography is considered secure against quantum computing attacks. However, naive implementations on embedded devices are vulnerable to side-channel attacks (SCAs), with full key recovery possible through power and electromagnetic leakage analysis. This article presents two protection schemes, masking and shuffling, for the baseline radix-2 multi-path delay commutator (R2MDC) number theoretic transform (NTT) architecture. The proposed masking NTT scheme introduces a random number to protect the secret key during the decryption phase, leveraging the linearity of the arithmetic transform in NTT polynomial multiplication. By adjusting the decoding comparison threshold, the masking method greatly reduces the ratio of t-test values exceeding the threshold, from 77.38% for the unprotected NTT scheme to 3.91%. An ingenious shuffling transform process is also proposed to disturb the calculation sequence of the butterfly transformation, adapted to the high-throughput R2MDC-NTT architecture. The shuffling NTT scheme requires neither de-shuffling operations nor additional operation cycles, reducing the leakage ratio to 13.49% with minimal extra hardware resources and wide applicability. The proposed masking and shuffling techniques effectively suppress side-channel leakage, improving the security of the hardware architecture while balancing overall performance against additional hardware resources.
{"title":"Two Low-Cost and Security-Enhanced Implementations Against Side-Channel Attacks of NTT for Lattice-Based Cryptography","authors":"Yijun Cui;Jiatong Tian;Chuanchao Lu;Yang Li;Ziying Ni;Chenghua Wang;Weiqiang Liu","doi":"10.1109/TETC.2025.3552941","DOIUrl":"https://doi.org/10.1109/TETC.2025.3552941","url":null,"abstract":"Lattice-based cryptography is considered secure against quantum computing attacks. However, naive implementations on embedded devices are vulnerable to side-channel attacks (SCAs) with full key recovery possible through power and electromagnetic leakage analysis. This article presents two protection schemes, masking and shuffling, for the baseline Radix-2 multi-path delay commutator (R2MDC) number theoretic transform (NTT) architecture. The proposed masking NTT scheme introduces a random number to protect the secret key during the decryption phase and leverages the linear property of arithmetic transform in NTT polynomial multiplication. By adjusting the comparing decoding threshold, the masking method greatly reduces the ratio of <inline-formula><tex-math>$t$</tex-math></inline-formula>-<inline-formula><tex-math>$test$</tex-math></inline-formula> value exceeding the threshold of unprotected NTT scheme from 77.38% to 3.91%. An ingenious shuffling transform process is also proposed to disturb the calculation sequence of butterfly transformation, adapting to the high-throughput architecture of R2MDC-NTT. This shuffling NTT scheme does not require operations to remove shuffle or additional operation cycles, reducing the leakage ratio to 13.49% with minimal extra hardware resources and wide applicability. The proposed masking and shuffling techniques effectively suppress side-channel leakage, improving the security of hardware architecture while maintaining a balance between overall performance and additional hardware resources.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"977-989"},"PeriodicalIF":5.4,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}