Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary computations to be performed on encrypted data without decryption, making it ideal for secure computation outsourcing. However, computation on FHE-encrypted data is significantly slower than computation on plain data, primarily due to the explosive increases in data size and computational complexity after encryption. To enable real-world FHE applications, recent research has proposed several custom hardware accelerators that provide orders-of-magnitude speedups over conventional systems. However, the performance of existing FHE accelerators is severely bounded by memory bandwidth, even with expensive on-chip buffers. Processing In-Memory (PIM) is a promising technology that can accelerate data-intensive workloads with extensive internal bandwidth. Unfortunately, existing PIM accelerators cannot efficiently support FHE because their throughput is too limited for FHE's complex computation and data-movement operations. To tackle these challenges, we propose FHEmem, an FHE accelerator using a novel PIM architecture for high-throughput FHE acceleration. Furthermore, we present an optimized end-to-end processing flow with an automated mapping framework to maximize the hardware utilization of FHEmem. Our evaluation shows that FHEmem achieves at least 4.0× speedup and 6.9× energy-delay-area efficiency improvement over state-of-the-art FHE accelerators on popular FHE applications.
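The core idea the abstract relies on — computing on data while it stays encrypted — can be illustrated with a toy homomorphic scheme. The sketch below uses Paillier, which is only *additively* homomorphic (not fully homomorphic like the schemes FHEmem targets), with tiny fixed primes and fixed randomness; it is an insecure illustration of the concept, not the paper's cryptosystem.

```python
from math import gcd

def keygen(p=101, q=113):
    # Toy Paillier keypair from tiny fixed primes -- insecure, illustration only.
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                          # valid because g = n + 1
    return n, lam, mu

def encrypt(n, m, r):
    # Enc(m) = (1+n)^m * r^n mod n^2, with r coprime to n.
    n2 = n * n
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def decrypt(n, lam, mu, c):
    # Dec(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n.
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n

# Multiplying ciphertexts adds the underlying plaintexts -- no decryption needed.
n, lam, mu = keygen()
c_sum = encrypt(n, 5, r=7) * encrypt(n, 9, r=11) % (n * n)
assert decrypt(n, lam, mu, c_sum) == 5 + 9
```

A fully homomorphic scheme additionally supports multiplication of plaintexts under encryption, which is what makes arbitrary computation (and the data-size blow-up the abstract describes) possible.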
Minxuan Zhou; Yujin Nam; Pranav Gangwar; Weihong Xu; Arpan Dutta; Chris Wilkerson; Rosario Cammarota; Saransh Gupta; Tajana Rosing, "FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 4, pp. 1367-1382. Pub Date: 2025-01-17. DOI: 10.1109/TETC.2025.3528862
Scatter-gather direct memory access (SG-DMA) is used in applications that require high-bandwidth, low-latency data transfers between memory and peripherals, where data blocks, described using buffer descriptors (BDs), are distributed throughout the memory system. The data transfer organization and requirements of a Trapped-Ion Quantum Computer (TIQC) possess characteristics similar to those targeted by SG-DMA. In particular, the ion qubits in a TIQC are manipulated by applying control sequences consisting primarily of modulated laser pulses. These optical pulses are defined by parameters that are (re)configured by the electrical control system. Variations in the operating environment and equipment make it necessary to create and run a wide range of control-sequence permutations, which are well represented as BD regions distributed across main memory. In this article, we experimentally evaluate the latency and throughput of SG-DMA on Xilinx radio-frequency SoC (RFSoC) devices under a variety of BD and payload sizes to determine the benefits and limitations of an RFSoC system architecture for TIQC applications.
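The buffer-descriptor chains described above can be pictured as linked records in memory, each pointing at one payload block and at the next descriptor. A minimal `ctypes` sketch of one plausible BD layout follows; the field set is hypothetical, loosely modeled on AXI-DMA-style descriptors, and is not the exact Xilinx register map used in the paper.

```python
import ctypes

class BufferDescriptor(ctypes.Structure):
    # Hypothetical scatter-gather descriptor: each BD describes one payload
    # block and links to the next BD, so the DMA engine can walk the chain.
    _fields_ = [
        ("next_desc", ctypes.c_uint64),  # address of the next BD (0 = end of chain)
        ("buf_addr",  ctypes.c_uint64),  # physical address of this payload block
        ("length",    ctypes.c_uint32),  # payload size in bytes
        ("control",   ctypes.c_uint32),  # start/end-of-frame flags, status bits
    ]

def chain_bytes(bds):
    # Total payload a single DMA transfer over this chain would move.
    return sum(bd.length for bd in bds)

chain = [BufferDescriptor(length=4096), BufferDescriptor(length=1024)]
print(chain_bytes(chain))  # -> 5120
```

Because each descriptor is small relative to the payload it describes, BD size and chain length are natural parameters to sweep when measuring latency and throughput, which is what the article does.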
Tiamike Dudley; Jim Plusquellic; Eirini Eleni Tsiropoulou; Joshua Goldberg; Daniel Stick; Daniel Lobser, "Scatter-Gather DMA Performance Analysis Within an SoC-Based Control System for Trapped-Ion Quantum Computing," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 841-852. Pub Date: 2025-01-17. DOI: 10.1109/TETC.2025.3528899
Pub Date: 2025-01-10. DOI: 10.1109/TETC.2024.3520672
Fernando Fernandes dos Santos;Niccolò Cavagnero;Marco Ciccone;Giuseppe Averta;Angeliki Kritikakou;Olivier Sentieys;Paolo Rech;Tatiana Tommasi
Deep Neural Networks (DNNs) have revolutionized several fields, including safety- and mission-critical applications such as autonomous driving and space exploration. However, recent studies have highlighted that transient hardware faults can corrupt a model's output, leading to high misprediction probabilities. Since traditional reliability strategies based on modular hardware, software replication, or matrix-multiplication checksums impose high overhead, there is a pressing need for efficient and effective hardening solutions tailored to DNNs. In this article, we present several network design choices and a training procedure that increase the robustness of standard deep models, and we thoroughly evaluate these strategies with experimental analyses on vision classification tasks. We name DieHardNet the specialized DNN obtained by applying all our hardening techniques, which combine knowledge from experimental hardware-fault characterization and machine learning studies. We conduct extensive ablation studies to quantify the reliability gain of each hardening component in DieHardNet. We perform over 10,000 instruction-level fault injections to validate our approach and expose DieHardNet, executed on GPUs, to an accelerated neutron beam equivalent to more than 570,000 years of natural radiation. Our evaluation demonstrates that DieHardNet can reduce the critical error rate (i.e., errors that modify the inference) by up to 100 times compared to the unprotected baseline model, without any increase in inference time.
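The transient faults studied above are single bit-flips in hardware state. A minimal sketch of how one flip corrupts a float32 weight is below; this is a simplified stand-in for the paper's instruction-level injection tooling, which is not shown here.

```python
import math
import struct

def flip_bit(x, bit):
    # Reinterpret a float32 as its 32-bit pattern, flip one bit, reinterpret back.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return y

# A single flip in an exponent bit can be catastrophic:
# 1.0 has pattern 0x3F800000; flipping bit 30 gives 0x7F800000, i.e. +inf.
assert math.isinf(flip_bit(1.0, 30))

# The same flip applied twice restores the original value (it is an XOR):
assert flip_bit(flip_bit(2.5, 13), 13) == 2.5
```

Which bit is hit determines severity: mantissa flips perturb a weight slightly, while exponent or sign flips can swing it by orders of magnitude, which is why hardened architectures and training matter for critical errors.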
"Improving Deep Neural Network Reliability via Transient-Fault-Aware Design and Training," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 829-840.
Pub Date: 2025-01-02. DOI: 10.1109/TETC.2024.3522307
Pengfei Huang;Ke Chen;Chenghua Wang;Weiqiang Liu
Approximate computing (AxC) has recently emerged as a successful approach for optimizing energy consumption in error-tolerant applications, such as deep neural networks (DNNs). The enormous model size and high computation cost of DNNs present significant challenges for deployment in energy-efficient and resource-constrained computing systems. Emerging DNN hardware accelerators based on AxC designs selectively approximate the non-critical segments of computation to address these challenges. However, a systematic and principled approach that incorporates domain knowledge and approximate hardware for optimal approximation is still lacking. In this paper, we propose a probabilistic-oriented AxC (PAxC) framework that provides high energy savings with acceptable quality by considering the overall probability effect of approximation. To achieve aggressive approximate designs, we utilize the minimum likelihood error to determine the AxC synergy profile at both the application and circuit levels. This enables effective coordination of the trade-off between energy and accuracy. Compared with a baseline design, the power-delay product (PDP) is reduced by up to 83.66% with an acceptable accuracy reduction. Simulations and an image-processing case study validate the effectiveness of the proposed framework.
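The power-delay product quoted above is simply energy per operation, so the 83.66% figure is a ratio of two such products. A small check with hypothetical power and delay numbers (the values below are made up to reproduce the quoted percentage and are not taken from the paper):

```python
def pdp(power_w, delay_s):
    # Power-delay product: average energy per operation, in joules.
    return power_w * delay_s

def reduction_pct(baseline, approximate):
    # Percentage reduction of the approximate design relative to the baseline.
    return 100.0 * (1.0 - approximate / baseline)

# Hypothetical numbers: approximation cuts both power and delay.
baseline = pdp(2.0e-3, 10.0e-9)   # 2 mW at 10 ns
approx = pdp(0.5e-3, 6.536e-9)    # 0.5 mW at 6.536 ns
print(round(reduction_pct(baseline, approx), 2))  # -> 83.66
```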
"Energy Efficient Approximate Computing Framework for DNN Acceleration Using a Probabilistic-Oriented Method," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 816-828.
Pub Date: 2024-12-13. DOI: 10.1109/TETC.2024.3513392
Mingfu Xue;Can He;Yushu Zhang;Zhe Liu;Weiqiang Liu
In this article, we propose a novel physical stealth attack against person detectors in the real world. For the first time, we consider the impacts of complex and challenging 3D physical constraints (e.g., curvature, wrinkles, occlusion, viewing angle) on person stealth attacks, and propose 3D transformations to generate a robust 3D invisible cloak. We launch person stealth attacks in 3D physical space instead of the 2D plane by printing the adversarial patches on real clothes. Anyone wearing the cloak can evade detection by person detectors and achieve stealth under challenging and complex 3D physical scenarios. Experimental results in various indoor and outdoor physical scenarios show that the proposed person stealth attack method is robust and effective even under complex and challenging physical conditions, such as when the cloak is wrinkled, obscured, curved, or viewed from different or large angles. The attack success rate of the generated adversarial patch in the digital domain (Inria dataset) is 86.56% against YOLO v2 and 80.32% against YOLO v5, while the static and dynamic stealth attack success rates of the generated 3D invisible cloak in the physical world are 100% and 77% against YOLO v2 and 100% and 83.95% against YOLO v5, respectively, which are significantly better than state-of-the-art works.
"3D Invisible Cloak: A Robust Person Stealth Attack Against Object Detector in Complex 3D Physical Scenarios," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 799-815.
Pub Date: 2024-12-11. DOI: 10.1109/TETC.2024.3511676
Alessio Carpegna;Alessandro Savino;Stefano Di Carlo
Including Artificial Neural Networks in embedded systems at the edge allows applications to exploit Artificial Intelligence capabilities directly within devices operating at the network periphery, containing sensitive data within the boundaries of the edge device. This facilitates real-time decision-making, reduces latency and power consumption, and enhances privacy and security. Spiking Neural Networks (SNNs) offer a promising computing paradigm in these environments. However, deploying efficient SNNs in resource-constrained edge devices requires highly parallel and reconfigurable hardware implementations. We introduce Spiker+, a comprehensive framework for generating efficient, low-power, and low-area SNN accelerators on Field Programmable Gate Arrays for inference at the edge. Spiker+ presents a configurable multi-layer SNN hardware architecture, a library of highly efficient neuron architectures, and a design framework to enable easy, Python-based customization of accelerators. Spiker+ is tested on three benchmark datasets: MNIST, Spiking Heidelberg Dataset (SHD), and AudioMNIST. On MNIST, it outperforms state-of-the-art SNN accelerators in terms of resource allocation, requiring 7,612 logic cells and 18 Block RAMs (BRAMs), and power consumption, consuming only 180 mW, with comparable latency (780 µs/img) and accuracy (97%). On SHD and AudioMNIST, Spiker+ requires 18,268 and 10,124 logic cells and 51 and 16 BRAMs, respectively, consuming 430 mW and 290 mW, with an accuracy of 75% and 95%. These results underscore the significance of Spiker+ in the hardware-accelerated SNN landscape, making it an excellent solution for deploying configurable and tunable SNN architectures in resource- and power-constrained edge applications.
"Spiker+: A Framework for the Generation of Efficient Spiking Neural Networks FPGA Accelerators for Inference at the Edge," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 784-798.
Pub Date: 2024-12-05. DOI: 10.1109/TETC.2024.3488452
Ke Chen;Shanshan Liu;Weiqiang Liu;Fabrizio Lombardi;Nader Bagherzadeh
"Guest Editorial: Special Section on “Approximate Data Processing: Computing, Storage and Applications”," IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 4, pp. 954-955.
Pub Date: 2024-12-05. DOI: 10.1109/TETC.2024.3499715
"IEEE Transactions on Emerging Topics in Computing Information for Authors," IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 4, pp. C2-C2.
Pub Date: 2024-11-18. DOI: 10.1109/TETC.2024.3496835
Yang Zhang;Ruohan Zong;Lanyu Shang;Dong Wang
This paper focuses on a public health policy-adherence assessment (PHPA) application that aims to automatically assess people's public health policy adherence during emergent global health crisis events (e.g., COVID-19, MonkeyPox) by leveraging massive public health policy adherence imagery data from social media. In particular, we study an optimal AI model design problem in the PHPA application, where the goal is to leverage crowdsourced human intelligence to accurately identify the optimal AI model design (i.e., the combination of network architecture and hyperparameter configuration) without the need for AI experts. However, two critical challenges exist in our problem: 1) it is challenging to effectively optimize the AI model design given the interdependence between network architecture and hyperparameter configuration; 2) it is non-trivial to leverage the human intelligence queried from ordinary crowd workers to identify the optimal AI model design in the PHPA application. To address these challenges, we develop CrowdDesign, a subjective logic-driven human-AI collaborative learning framework that explores the complementary strengths of AI and human intelligence to jointly identify the optimal network architecture and hyperparameter configuration of an AI model in the PHPA application. The experimental results from two real-world PHPA applications demonstrate that CrowdDesign consistently outperforms the state-of-the-art baseline methods by achieving the best PHPA performance.
"A Crowdsourcing-Driven AI Model Design Framework to Public Health Policy-Adherence Assessment," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 768-783.
Pub Date: 2024-11-18. DOI: 10.1109/TETC.2024.3496195
Luigi De Simone;Mario Di Mauro;Roberto Natella;Fabio Postiglione
Network Function Virtualization (NFV) converts legacy telecommunication systems into modular software appliances, known as service chains, running on the cloud. To address potential software aging-related issues, rejuvenation is often employed to clean up their state and maximize performance and availability. In this work, we propose a framework to model the performability of service chains with rejuvenation. Performance modeling uses queueing theory, specifically adopting an M/G/m model with the Allen-Cunneen approximation, to capture real-world aspects related to service times. Availability modeling is addressed through the Multidimensional Universal Generating Function (MUGF), a recent technique that achieves computational efficiency when dealing with systems with many sub-elements, particularly useful for multi-provider service chains. Additionally, we deploy an experimental testbed based on the Open5GS service chain to estimate key performance and availability parameters. Supported by experimental results, we evaluate the impact of rejuvenation on the performability of the Open5GS service chain. The numerical analysis shows that i) the configuration of replicas across nodes is important to meet availability goals; ii) rejuvenation can bring one additional “nine” of availability, depending on the time to recovery; and iii) MUGF can significantly reduce computational complexity through straightforward algebraic manipulations.
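The Allen-Cunneen approximation the authors adopt scales the exact M/M/m waiting time by the squared coefficients of variation of the interarrival and service times. A minimal sketch of the standard textbook form (not the paper's own code):

```python
from math import factorial

def erlang_c(m, a):
    # Probability that an arriving job waits in an M/M/m queue,
    # with offered load a = lam/mu (requires a < m for stability).
    top = a**m / factorial(m)
    series = sum(a**k / factorial(k) for k in range(m))
    return top / ((1.0 - a / m) * series + top)

def wq_allen_cunneen(lam, mu, m, ca2, cs2):
    # Wq(G/G/m) ~= ((Ca^2 + Cs^2) / 2) * Wq(M/M/m).
    # With Poisson arrivals (Ca^2 = 1) this is the M/G/m case used in the paper.
    a = lam / mu
    wq_mmm = erlang_c(m, a) / (m * mu - lam)
    return (ca2 + cs2) / 2.0 * wq_mmm

# Sanity check: exponential service (Cs^2 = 1) recovers the exact M/M/m value.
assert wq_allen_cunneen(8.0, 1.0, 10, 1.0, 1.0) == erlang_c(10, 8.0) / 2.0
```

The correction factor makes the waiting time grow with service-time variability, which is exactly the "real-world aspects related to service times" the abstract refers to; deterministic service (Cs² = 0, Ca² = 1) halves the M/M/m waiting time.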
"Performability of Service Chains With Rejuvenation: A Multidimensional Universal Generating Function Approach," IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 2, pp. 341-353.