
Journal of Systems Architecture: Latest Publications

VM-PHRs: Efficient and verifiable multi-delegated PHRs search scheme for cloud–edge collaborative services
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-05. DOI: 10.1016/j.sysarc.2026.103689
Shiwen Zhang, Wenrui Zhu, Wei Liang, Arthur Sandor Voundi Koe, Neal N. Xiong
With the proliferation of smart healthcare services, many hospitals delegate PHRs processing to cloud-based resources. Despite its effectiveness for bounded search and selective record sharing over encrypted data, key-aggregate searchable encryption still suffers from significant drawbacks in current constructions. First, the existing trapdoor matching algorithms fail to achieve accurate matching and exhibit poor robustness against guessing attacks. Second, current works lack efficient mechanisms to enable fine-grained verification of search results. Third, there is currently no efficient mechanism to delegate user privileges. In this paper, we design an efficient and verifiable multi-delegated PHRs search scheme for cloud–edge collaborative services (VM-PHRs). To enable exact trapdoor matching and resist guessing attacks, we develop a new algorithm, EDAsearch. To achieve fine-grained verification of data integrity and correctness, we design a novel distributed protocol that operates over a network of edge servers. To accommodate real-world emergency scenarios, we develop a novel threshold mechanism that supports privilege delegation based on user attributes and hash commitments. Extensive security analysis and performance evaluation of VM-PHRs demonstrate that it is scalable, secure, and practical.
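The abstract only names its building blocks, so as a rough illustration of what exact trapdoor matching over encrypted keywords means, here is a minimal Python sketch using HMAC tags. It is not the paper's EDAsearch algorithm; the key, the `build_index`/`search` helpers, and the record layout are all hypothetical.

```python
# Toy exact-match searchable-encryption index: the cloud stores deterministic
# keyword tags and tests equality against a submitted trapdoor without ever
# seeing the plaintext keywords. Illustrative only; NOT the EDAsearch scheme.
import hmac, hashlib

def tag(key: bytes, keyword: str) -> bytes:
    """Deterministic keyword tag (trapdoor) under the data owner's secret key."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).digest()

def build_index(key: bytes, records: dict) -> dict:
    """record id -> set of keyword tags; this is what the cloud/edge stores."""
    return {rid: {tag(key, w) for w in words} for rid, words in records.items()}

def search(index: dict, trapdoor: bytes) -> list:
    """Cloud-side exact matching: ids whose tag set contains the trapdoor."""
    return [rid for rid, tags in index.items() if trapdoor in tags]

key = b"data-owner-secret-key"
index = build_index(key, {"phr-001": ["diabetes", "insulin"], "phr-002": ["asthma"]})
print(search(index, tag(key, "insulin")))   # ['phr-001']
```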
Citations: 0
Group theory-based differential evolution algorithm for efficient DAG scheduling on heterogeneous clustered multi-core system
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-05. DOI: 10.1016/j.sysarc.2026.103695
Yaodong Guo, Shuangshuang Chang, Dong Ji, Shiyue Qin, Te Xu
Efficient parallel application scheduling algorithms are crucial for optimizing performance on heterogeneous clustered multi-core systems. The primary objective of scheduling is to reduce the makespan of parallel applications, typically represented as Directed Acyclic Graphs (DAGs). This paper introduces a Group Theory-based Differential Evolution (GTDE) algorithm to address the NP-complete DAG scheduling problem, minimizing makespan and computation time. The GTDE algorithm leverages group theory to explore the inherent symmetry in system architectures, enabling the classification of scheduling schemes and thus reducing redundant computations while maintaining population diversity. To further enhance performance, the algorithm employs an Opposition-Based Learning (OBL) strategy to improve the initial population and integrates a hybrid mutation strategy for more efficient exploration of the solution space. Experimental results demonstrate that the GTDE algorithm consistently outperforms state-of-the-art DAG scheduling algorithms in terms of performance metrics, such as makespan and computation time, with average improvements of 36% and 73%, respectively, achieving superior performance across various scenarios.
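As a concrete reference point for the opposition-based learning step mentioned above, the following sketch shows a generic OBL population initialization as commonly used in differential evolution. The fitness function, bounds, and population size are placeholders, and none of GTDE's group-theoretic classification or hybrid mutation is reproduced.

```python
# Generic opposition-based learning (OBL) initialization: generate a random
# population, form the element-wise "opposites", evaluate both sets, and keep
# the best half. This shows only the OBL idea from the abstract, not GTDE itself.
import numpy as np

def obl_init(pop_size, dim, low, high, fitness, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    pop = rng.uniform(low, high, size=(pop_size, dim))   # random candidates
    opposite = low + high - pop                          # opposition-based candidates
    both = np.vstack([pop, opposite])
    scores = np.apply_along_axis(fitness, 1, both)
    return both[np.argsort(scores)[:pop_size]]           # keep the fittest half

# Toy fitness standing in for a makespan estimate of a decoded schedule.
init_pop = obl_init(20, dim=8, low=0.0, high=1.0,
                    fitness=lambda x: float(np.sum(x ** 2)))
print(init_pop.shape)   # (20, 8)
```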
Citations: 0
A co-optimization framework toward energy-efficient cloud–edge inference with stochastic computing and precision-compensating NAS
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-05. DOI: 10.1016/j.sysarc.2026.103693
Jihe Wang, Mengchao Zhang, Huijuan Duan, Kuizhi Mei, Danghui Wang, Meikang Qiu
The advancement of edge and cloud computing technologies presents a significant challenge in intelligent computing: efficiently allocating computational tasks between edge devices and the cloud to leverage their respective resource advantages and optimize overall system performance. A core difficulty lies in balancing accuracy, energy efficiency, and latency, particularly given the resource-constrained nature of edge devices compared to the powerful computational capabilities of the cloud. To address this challenge, we propose a co-optimization framework for energy-efficient cloud–edge inference based on stochastic computing. In this framework, the frontend employs stochastic computing (SC) alongside a search for optimal bit-width and layer count to achieve a lightweight design and reduce power consumption. The backend utilizes neural architecture search (NAS) to optimize accuracy. A joint optimization framework holistically balances power consumption, latency, and accuracy to enhance overall system performance. Experimental results indicate that the frontend power consumption is reduced by approximately 35% compared to conventional binary computing methods. The co-optimization framework maintains near-baseline accuracy with only 0.2% degradation, while achieving an energy efficiency ratio more than 1.55 times greater and a power-delay product (PDP) between 0.77 and 0.92 times that of the original binary computing.
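For readers unfamiliar with the frontend technique, the sketch below shows the textbook stochastic-computing primitive the abstract builds on: a value in [0, 1] is encoded as the fraction of 1s in a random bitstream, so multiplication reduces to a bitwise AND. Stream length and seed are arbitrary; the paper's bit-width/layer search and NAS backend are not shown.

```python
# Textbook unipolar stochastic computing: encode x as P(bit = 1) and multiply
# two values with a bitwise AND of independent bitstreams. Accuracy grows with
# stream length, which relates to the bit-width trade-off the paper searches over.
import numpy as np

def to_stream(x: float, n_bits: int, rng) -> np.ndarray:
    return (rng.random(n_bits) < x).astype(np.uint8)   # each bit is 1 with probability x

def sc_multiply(x: float, y: float, n_bits: int = 4096, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    sx, sy = to_stream(x, n_bits, rng), to_stream(y, n_bits, rng)
    return float(np.mean(sx & sy))                      # AND gate estimates x * y

print(sc_multiply(0.6, 0.5))   # close to 0.30; longer streams reduce the error
```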
Citations: 0
Accelerating language giants: A survey of optimization strategies for LLM inference on hardware platforms
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-03. DOI: 10.1016/j.sysarc.2026.103690
Young Chan Kim, Seok Kyu Yoon, Sung Soo Han, Chae Won Park, Jun Oh Park, Jun Ha Ko, Hyun Kim
With the emergence of transformer-based models that have demonstrated remarkable performance in natural language processing tasks, large language models (LLMs) built upon the transformer architecture and trained on massive datasets have achieved outstanding results in various tasks such as translation and summarization. Among these, decoder-only LLMs have garnered significant attention due to their superior few-shot and zero-shot capabilities compared to other architectures. Motivated by their exceptional performance, numerous efforts have been made to deploy decoder-only LLMs on diverse hardware platforms. However, the substantial computational and memory demands during both training and inference pose considerable challenges for resource-constrained hardware. Although efficient architectural designs have been proposed to address these issues, LLM inference continues to require excessive computational and memory resources. Consequently, extensive research has been conducted to compress model components and enhance inference efficiency across different hardware platforms. To further accelerate the inherently repetitive computations of LLMs, a variety of approaches have been introduced, integrating operator-level optimizations within Transformer blocks and system-level optimizations at the granularity of repeated Transformer block execution. This paper surveys recent research on decoder-only LLM inference acceleration, categorizing existing approaches based on optimization levels specific to each hardware platform. Building on this classification, we provide a comprehensive analysis of prior decoder-only LLM acceleration techniques from multiple perspectives.
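As one concrete example of the system-level optimizations this kind of survey covers, the sketch below implements single-head key/value caching for decoder-only inference, where past keys and values are stored so each decode step only projects the newest token. Shapes, weights, and names are illustrative and not taken from the paper.

```python
# Minimal single-head KV cache: a representative decoder-only inference
# optimization in the space this survey covers (illustrative, not from the paper).
import numpy as np

def attend(q, K, V):
    """One query attending over all cached keys/values."""
    scores = q @ K.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

def decode_step(x_t, Wq, Wk, Wv, cache):
    """Project only the newest token; reuse cached K/V for the history."""
    q, k, v = x_t @ Wq, x_t @ Wk, x_t @ Wv
    cache["K"] = k[None] if cache["K"] is None else np.vstack([cache["K"], k])
    cache["V"] = v[None] if cache["V"] is None else np.vstack([cache["V"], v])
    return attend(q, cache["K"], cache["V"])

d = 16
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
cache = {"K": None, "V": None}
for _ in range(5):                                  # five autoregressive decode steps
    out = decode_step(rng.standard_normal(d), Wq, Wk, Wv, cache)
print(cache["K"].shape)                             # (5, 16): history kept in the cache
```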
Citations: 0
MetaDTS: Distribution difference-based adaptive test input selection for Deep Neural Networks
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-03. DOI: 10.1016/j.sysarc.2025.103682
Xiang Su, Zhibin Yang, Qu Liu, Hao Liu, Yong Zhou, Zhiqiu Huang
Deep Neural Networks (DNNs) are widely used in various safety-critical domains. Due to data distribution shifts, DNNs may exhibit unforeseen faults that lead to serious safety risks. To reveal potential faults within DNNs, a vast number of labeled test data are required, but the labeling process is time-consuming. To solve this problem, test input selection methods improve efficiency by selecting a subset of test inputs more likely to reveal DNN model faults. However, existing methods often focus solely on single data distribution characteristics and overlook the complex differences and diversity among test inputs. In this paper, we propose MetaDTS, a distribution difference-based adaptive test input selection method that comprehensively assesses the feature and probability distribution differences of test inputs. MetaDTS employs a meta-model to evaluate the probability of misclassification of test inputs and select them. To effectively capture the complex differences among test inputs, we introduce two novel uncertainty metrics: Feature Distribution Difference (FDD) and Probability Distribution Difference (PDD). By integrating these metrics, MetaDTS adaptively selects test inputs that can reveal diverse faults of DNN models. We conducted extensive experiments on five datasets and seven models, comparing MetaDTS with nine baseline methods. The results demonstrate that MetaDTS significantly outperforms the baseline methods in selecting test inputs with high fault-revealing capability, model optimization, and enhancing test inputs diversity.
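To make the selection loop concrete, the sketch below ranks unlabeled test inputs by a simple distribution-difference-style uncertainty score (the gap between the top two predicted probabilities) and picks the most ambiguous ones for labeling first. The score is a stand-in; the paper's FDD and PDD metrics and its meta-model are not reproduced here.

```python
# Generic uncertainty-driven test-input selection: label the inputs whose output
# distributions are least decisive first. The margin score is only a placeholder
# for the paper's FDD/PDD metrics.
import numpy as np

def margin_uncertainty(probs: np.ndarray) -> np.ndarray:
    """probs: (n, classes) softmax outputs; smaller top1-top2 margin = more uncertain."""
    srt = np.sort(probs, axis=1)
    return 1.0 - (srt[:, -1] - srt[:, -2])

def select_inputs(probs: np.ndarray, budget: int) -> np.ndarray:
    """Indices of the `budget` most uncertain (likely fault-revealing) inputs."""
    return np.argsort(-margin_uncertainty(probs))[:budget]

probs = np.array([[0.98, 0.01, 0.01],    # confident -> low labeling priority
                  [0.40, 0.35, 0.25],    # ambiguous -> high labeling priority
                  [0.55, 0.44, 0.01]])
print(select_inputs(probs, budget=2))    # [1 2]
```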
Citations: 0
A survey on data management for Out-of-Core GNN systems
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-02. DOI: 10.1016/j.sysarc.2026.103684
Junchi Ren, Chao Li, Xiaowei Chen, Yao Chen, Zehao Chen, Qian Wei, Zhaoyan Shen
Graph neural networks (GNNs) have emerged as a powerful model for their effectiveness in learning over graphs, with broad applications in domains such as biology, e-commerce, and materials science. With the rapid growth of real-world graphs, efficient data management in GNNs has become a formidable challenge. Out-of-core (OOC) GNN system, as a representative solution, leverages external storage (CPU memory and SSD) to enable large-scale graph training on a single machine. Based on the scale of graph data, OOC GNN systems adopt different levels of storage extension and can be classified into two categories, semi OOC and fully OOC systems. However, the optimization details for both categories of OOC GNN systems remain only preliminarily understood. To address this gap, we provide a comprehensive survey of existing optimization techniques for semi OOC and fully OOC systems from the perspective of data management. We decompose the data management mechanisms into three layers: data storage, data organization, and data transfer, where data storage refers to the placement of graph data on disk, data organization pertains to the adaptive strategy that decides where to place graph data across different memory hierarchies during training, and data transfer concerns the I/O path of data movement between these storage layers. For each layer, we discuss the key challenges, review the corresponding optimization strategies proposed in existing OOC GNN systems, and analyze their advantages and limitations. Furthermore, we outline future research directions for data management of OOC GNN systems.
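As a small illustration of the data-organization layer described above, the sketch below keeps the full node-feature matrix on disk via a memory map and caches recently used rows in CPU memory with an LRU policy. The file layout, cache policy, and class name are assumptions for illustration, not a particular system from the survey.

```python
# Out-of-core feature access: features live on disk (memory-mapped) and an LRU
# cache in CPU memory absorbs repeated accesses to hot vertices. Illustrative
# placement/organization sketch, not a specific surveyed system.
from collections import OrderedDict
import numpy as np

class FeatureCache:
    def __init__(self, path, num_nodes, dim, capacity=1024):
        self.store = np.memmap(path, dtype=np.float32, mode="r", shape=(num_nodes, dim))
        self.cache = OrderedDict()                 # node id -> feature row, LRU order
        self.capacity = capacity

    def get(self, node_id: int) -> np.ndarray:
        if node_id in self.cache:
            self.cache.move_to_end(node_id)        # hit: refresh recency
            return self.cache[node_id]
        row = np.asarray(self.store[node_id])      # miss: read the row from disk
        self.cache[node_id] = row
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)         # evict the least recently used row
        return row

feats = np.random.rand(100, 8).astype(np.float32)
feats.tofile("features.bin")                       # stand-in on-disk feature file
cache = FeatureCache("features.bin", num_nodes=100, dim=8, capacity=16)
batch = np.stack([cache.get(i) for i in [3, 7, 3, 42]])
print(batch.shape)                                 # (4, 8)
```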
Citations: 0
An efficient privacy-preserving transformer inference scheme for cloud-based intelligent decision-making in AIoT
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-02. DOI: 10.1016/j.sysarc.2026.103687
Mingshun Luo, Haolei He, Wenti Yang, Shuai Yuan, Zhitao Guan
The Artificial Intelligence of Things (AIoT) is transforming modern society by combining the data-collection capabilities of IoT devices with the inference power of cloud-based large language models (LLMs). However, transmitting sensitive data to the cloud for intelligent decision-making raises significant privacy concerns. Cryptographic techniques such as homomorphic encryption (HE) and secure multi-party computation (MPC) provide promising solutions for privacy-preserving inference. However, existing schemes primarily target small-scale models and are inefficient when applied to Transformer-based LLMs, which involve large-scale matrix multiplications and complex non-linear functions, and deep model architectures. To address these challenges, we propose an efficient privacy-preserving Transformer inference scheme for cloud-based AIoT scenarios. Our framework integrates HE and MPC to ensure data confidentiality while minimizing computational and communication overhead. We design a fast HE-based matrix multiplication protocol using an offline-online collaborative pipeline and single instruction multiple data (SIMD)-based packing rules. Furthermore, we develop an accurate and efficient MPC-based non-linear function evaluation protocol using optimized piecewise polynomial approximation and integer-fraction decomposition. Experimental results show that our approach achieves 8.3×–91.6× faster in matrix multiplication, 1.4×–19× faster in non-linear function evaluation, and 3.5×–137.9× reduction in communication overhead with the LAN network, while maintaining lossless accuracy, thus enabling secure and scalable intelligent decision-making in AIoT environments.
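The non-linear side of the protocol rests on piecewise polynomial approximation, and the plaintext version of that step can be sketched in a few lines: fit a low-degree polynomial to GELU on each sub-interval and evaluate the matching piece at runtime. Breakpoints, degree, and the tail handling are arbitrary choices here, and the secret-sharing/HE machinery itself is not shown.

```python
# Plaintext sketch of the piecewise polynomial approximation step mentioned in
# the abstract, using GELU as the non-linear target. Breakpoints, degree, and
# the fallback outside the fitted range are illustrative choices.
import numpy as np
from math import erf, sqrt

def gelu(x: float) -> float:
    return 0.5 * x * (1.0 + erf(x / sqrt(2.0)))

def fit_piecewise(breaks, degree=3, samples=200):
    """One polynomial per interval [breaks[i], breaks[i+1]]."""
    pieces = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        xs = np.linspace(lo, hi, samples)
        ys = np.array([gelu(v) for v in xs])
        pieces.append((lo, hi, np.polyfit(xs, ys, degree)))
    return pieces

def eval_piecewise(pieces, x: float) -> float:
    for lo, hi, coeffs in pieces:
        if lo <= x <= hi:
            return float(np.polyval(coeffs, x))
    return x if x > pieces[-1][1] else 0.0         # crude tails: GELU(x) ~ x or ~ 0

pieces = fit_piecewise(breaks=[-4.0, -1.0, 1.0, 4.0])
print(abs(eval_piecewise(pieces, 0.5) - gelu(0.5)))   # small approximation error
```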
Citations: 0
Towards privacy preservation in smart grids via controlled redactable signatures
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-02. DOI: 10.1016/j.sysarc.2026.103688
Siyuan Shen, Xiaoying Jia, Min Luo, Zhiyan Xu, Zhichao Zhou
The smart grid provides a flexible interactive platform for energy stakeholders by sharing energy usage data to enhance management efficiency and service accuracy. However, such data are often highly sensitive and vulnerable to eavesdropping and tampering during transmission. Ensuring data authenticity, integrity and users’ privacy is therefore critical. Redactable signatures have emerged as a promising cryptographic primitive to address these concerns. Nonetheless, most existing redactable signature schemes lack fine-grained control over the redaction process, making them susceptible to unauthorized or malicious modifications. To address this issue, we propose an identity-based Controlled Redactable Signature Scheme (CRSS), enabling users to selectively disclose information under controlled conditions without revealing private information. We define a formal security model and prove that the proposed scheme achieves unforgeability, redaction controllability, privacy, and transparency. Furthermore, theoretical analysis and experimental evaluation demonstrate that our scheme offers superior efficiency and practicality compared to existing approaches.
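To ground the idea of redaction-friendly signing, the sketch below shows a generic commit-then-sign construction: the signer authenticates hash commitments of the message blocks, so a block can later be removed while the remaining opened blocks still verify. It deliberately omits what makes CRSS novel (identity-based keys and controlled-redaction policies), and the signature over the commitment list is modeled with an HMAC purely for brevity.

```python
# Generic commit-then-sign redactable signature idea (NOT the paper's CRSS):
# sign per-block hash commitments; redaction hides a block but keeps its
# commitment, so verification of the untouched blocks still succeeds.
import hashlib, hmac, json, os

def commit(block: str, salt: bytes) -> str:
    return hashlib.sha256(salt + block.encode()).hexdigest()

def sign(key: bytes, blocks: list) -> dict:
    salts = [os.urandom(16) for _ in blocks]
    commits = [commit(b, s) for b, s in zip(blocks, salts)]
    sig = hmac.new(key, json.dumps(commits).encode(), hashlib.sha256).hexdigest()
    return {"blocks": list(blocks), "salts": salts, "commits": commits, "sig": sig}

def redact(doc: dict, i: int) -> dict:
    out = dict(doc)
    out["blocks"] = doc["blocks"][:i] + [None] + doc["blocks"][i + 1:]  # hide block i
    out["salts"] = doc["salts"][:i] + [None] + doc["salts"][i + 1:]
    return out                                      # commitments and signature unchanged

def verify(key: bytes, doc: dict) -> bool:
    for b, s, c in zip(doc["blocks"], doc["salts"], doc["commits"]):
        if b is not None and commit(b, s) != c:     # every opened block must match
            return False
    expect = hmac.new(key, json.dumps(doc["commits"]).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expect, doc["sig"])

key = b"grid-operator-key"
doc = sign(key, ["meter-id: 42", "usage: 3.1 kWh", "household size: 4"])
print(verify(key, redact(doc, 2)))                  # True: redacted yet still verifiable
```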
Citations: 0
Covert feature-space adversarial perturbation using natural evolution strategies in distributed deep learning
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2026-01-02. DOI: 10.1016/j.sysarc.2026.103691
Arash Golabi, Abdelkarim Erradi, Ahmed Bensaid, Abdulla Al-Ali, Uvais Qidwai
Distributed Deep Learning (DDL) partitions deep neural networks across multiple devices, enhancing efficiency in large-scale inference tasks. However, this segmentation exposes intermediate-layer feature maps to new security vulnerabilities, expanding the attack surface beyond traditional input-level threats. This work investigates an adaptation of Natural Evolution Strategies (NES), named NES with Random Uniform Perturbation (NES-RUP), for adversarial manipulation of intermediate-layer feature maps in horizontally distributed inference systems. Instead of Gaussian-based perturbation sampling, the proposed method utilizes uniformly distributed noise and targets only a subset of feature map channels. This design improves stealth by using uniform noise distributions, which avoid extreme outliers and limit perturbations to bounded ranges, keeping activations closer to their clean values and thereby reducing anomaly detection likelihood, while also aligning with the performance and privacy constraints of AIoT-enabled smart environments. Extensive experiments on VGG16, ResNet50 and DeiT-Tiny (a Vision Transformer) using the CIFAR-10 and Mini-ImageNet datasets demonstrate that the adapted NES method achieves high misclassification rates with minimal feature-level distortion, preserving the statistical characteristics of natural feature activations. Furthermore, it successfully bypasses common defenses such as low-pass filtering and feature map anomaly detection (e.g., PseudoNet), revealing critical vulnerabilities in collaborative inference. These findings underscore the need for dedicated defense strategies that address intermediate-layer threats in secure AIoT infrastructures.
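The core estimator can be pictured with a short NES sketch: antithetic, bounded uniform perturbations are applied only to a chosen subset of feature-map channels, and black-box loss queries yield a gradient estimate over those channels. The toy loss, shapes, and step size are assumptions; this is a simplified stand-in, not the full NES-RUP attack.

```python
# Simplified NES-style gradient estimation with uniform noise restricted to a
# subset of feature-map channels, in the spirit of the abstract (not NES-RUP itself).
import numpy as np

def nes_uniform_grad(loss_fn, fmap, channels, pop=20, eps=0.05, rng=None):
    """Estimate d(loss)/d(fmap) on `channels` from black-box loss queries only."""
    rng = np.random.default_rng(0) if rng is None else rng
    grad = np.zeros_like(fmap)
    for _ in range(pop):
        noise = np.zeros_like(fmap)
        noise[channels] = rng.uniform(-1.0, 1.0, size=fmap[channels].shape)
        delta = loss_fn(fmap + eps * noise) - loss_fn(fmap - eps * noise)
        grad += delta * noise                       # antithetic finite differences
    return grad / (2.0 * eps * pop)

fmap = np.random.default_rng(1).standard_normal((8, 4, 4))      # (C, H, W) intermediate features
toy_loss = lambda f: float(np.sum(f.mean(axis=(1, 2)) ** 2))    # stand-in attack objective
step = nes_uniform_grad(toy_loss, fmap, channels=[0, 3, 5])
fmap_adv = fmap + 0.1 * np.sign(step)               # only the chosen channels move
print(np.count_nonzero((fmap_adv != fmap).any(axis=(1, 2))))    # 3 perturbed channels
```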
Citations: 0
Unmixing gradients: Uncovering persistent leakage in Federated Learning via Independent Component Analysis
IF 4.1, CAS Zone 2 (Computer Science), Q1 in COMPUTER SCIENCE, HARDWARE & ARCHITECTURE. Pub Date: 2025-12-31. DOI: 10.1016/j.sysarc.2025.103681
Minhao Li, Le Wang, Zhaohua Li, Rongxin Hu, Tang Zhou, Binxing Fang
Federated Learning (FL), a cornerstone of privacy-preserving technology in the Artificial Intelligence of Things (AIoT) era, enables collaborative model training across edge devices via gradient sharing. However, Gradient Leakage Attacks (GLA) can reconstruct private data from these gradients, fundamentally undermining FL’s privacy guarantees. While recent Generation-based GLAs demonstrate significant promise in efficiency and quality, their core feature separation algorithms rely heavily on idealized properties of early-stage training. This dependency causes attack performance to decay rapidly as training progresses and fails in challenging scenarios, such as batches with duplicate labels. To overcome this, we reframe feature separation as a Blind Source Separation (BSS) problem, solved using Independent Component Analysis (ICA). Through a systematic analysis of the entire training lifecycle, we uncover the adversarial dynamics of key mathematical properties that govern BSS solvability. Based on these insights, we introduce a novel framework: ICA-driven Generative Attacks (ICA-GA). Extensive experiments show that ICA-GA significantly outperforms baselines throughout the training lifecycle and exhibits remarkable robustness against challenging conditions, including batches with full label duplication, FedAVG, and gradient compression. Furthermore, our incremental generator fine-tuning strategy reduces the marginal cost of continuous multi-round attacks by an order of magnitude, making such threats highly practical. Our work reveals that the privacy risk in FL is far more persistent and severe than previously understood. Our code is publicly available at https://github.com/liminhao-99/ICA-GA.
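Since the attack hinges on treating feature separation as blind source separation, the snippet below shows the unmixing tool itself: sklearn's FastICA recovering independent sources from synthetic linear mixtures. The mapping from shared gradients to such mixtures is the paper's contribution and is not reproduced; the signals here are purely synthetic.

```python
# FastICA as the blind-source-separation workhorse: recover independent source
# signals from linear mixtures (up to scale and permutation). Synthetic signals
# only; the gradient-to-mixture step of ICA-GA is not shown.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
sources = np.c_[np.sin(2 * t),                       # smooth source
                np.sign(np.sin(3 * t)),              # square-wave source
                rng.laplace(size=t.size)]            # heavy-tailed source
mixing = rng.standard_normal((3, 3))
observed = sources @ mixing.T                        # the observed linear mixtures

ica = FastICA(n_components=3, random_state=0)
recovered = ica.fit_transform(observed)              # unmixed source estimates
print(recovered.shape)                               # (2000, 3)
```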
Citations: 0