Pub Date: 2026-01-28, DOI: 10.1016/j.sysarc.2026.103722
Zhenzhou Tian, Jiale Zhao, Ming Fan, Jiaze Sun, Yanping Chen, Lingwei Chen
Deep learning (DL)-based vulnerability detection in source code is prevalent, yet detecting vulnerabilities in binary code using this paradigm remains underexplored. The few existing works typically treat input instructions as individual entities, failing to extract and leverage fine-grained information due to their inability to account for the inherent connections and correlations between code segments and the impact of compilation optimizations. To address these challenges, this paper proposes Delta, a novel approach that incorporates Dynamic contrastive lEarning with vuLnerabiliTy repair Awareness to fine-tune pre-trained models, significantly enhancing the accuracy and efficiency of vulnerability detection in binary code. Delta proceeds by standardizing assembly instructions and using function pairs that represent code before and after vulnerability repair, along with their versions compiled under different optimization settings, as contrastive learning samples. Building on these rich and diverse training signals, Delta fine-tunes CodeBERT using contrastive learning augmented with masked language modeling, resulting in a feature encoder, CMBERT, which is adept at capturing nuanced vulnerability patterns in binary code and remains resilient to the impacts of compilation optimizations. Delta is evaluated on the Juliet Test Suite dataset, achieving an average performance improvement of 8.04% in detection accuracy and 7.13% in F1 score compared to alternative methods.
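As a rough illustration of the contrastive objective described in the abstract, the sketch below pairs a binary function with a differently-optimized compilation of itself as the positive and with its post-repair counterpart as a hard negative. This pairing is one plausible reading of the abstract, and the function and parameter names are illustrative rather than Delta's actual interface.

```python
# Minimal sketch of a repair-aware contrastive loss (not Delta's actual implementation),
# assuming differently-optimized builds of the same function are positives and the
# pre/post-repair counterpart is a hard negative.
import torch
import torch.nn.functional as F

def repair_aware_contrastive_loss(anchor, opt_positive, repair_negative, temperature=0.07):
    """anchor, opt_positive, repair_negative: [batch, dim] embeddings from the encoder."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(opt_positive, dim=-1)
    n = F.normalize(repair_negative, dim=-1)
    pos_sim = (a * p).sum(dim=-1) / temperature        # similarity to own positive
    neg_sim = (a * n).sum(dim=-1) / temperature        # similarity to repair negative
    logits = torch.stack([pos_sim, neg_sim], dim=1)    # [batch, 2]
    labels = torch.zeros(a.size(0), dtype=torch.long)  # the positive is class 0
    return F.cross_entropy(logits, labels)

# Example with random embeddings standing in for CMBERT outputs.
emb = lambda: torch.randn(8, 768)
loss = repair_aware_contrastive_loss(emb(), emb(), emb())
```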
{"title":"When fixes teach: Repair-aware contrastive learning for optimization-resilient binary vulnerability detection","authors":"Zhenzhou Tian , Jiale Zhao , Ming Fan , Jiaze Sun , Yanping Chen , Lingwei Chen","doi":"10.1016/j.sysarc.2026.103722","DOIUrl":"10.1016/j.sysarc.2026.103722","url":null,"abstract":"<div><div>Deep learning (DL)-based vulnerability detection in source code are prevalent, yet detecting vulnerabilities in binary code using this paradigm remains underexplored. The few works typically treat input instructions as individual entities, failing to extract and leverage fine-grained information due to their inability to account for the inherent connections and correlations between code segments and the impact of compilation optimizations. To address these challenges, this paper proposes <strong><span>Delta</span></strong>, a novel approach that incorporates <strong>D</strong>ynamic contrastive l<strong>E</strong>arning with vu<strong>L</strong>nerabili<strong>T</strong>y repair <strong>A</strong>wareness to fine-tune pre-trained models, significantly enhancing the accuracy and efficiency of vulnerability detection in binary code. <span>Delta</span> proceeds by standardizing assembly instructions and utilizing function pairs that represent code before and after vulnerability repair along with their versions compiled under different optimization settings as contrastive learning samples. Building on these rich and diverse training signals, <span>Delta</span> fine-tunes CodeBERT using contrastive learning augmented with masked language modeling, resulting in a feature encoder CMBERT, which is adept at capturing nuanced vulnerability patterns in binary code and remain resilient to the impacts of compilation optimizations. DELTA is evaluated on the Juliet Test Suite dataset, achieving an average performance improvement of 8.04% in detection accuracy and 7.13% in F1 score compared to alternative methods.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103722"},"PeriodicalIF":4.1,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-27, DOI: 10.1016/j.sysarc.2026.103700
Rajesh Devaraj
Heterogeneous computing systems (HCSs) use different types of processors to balance performance and efficiency for complex applications such as workflows. These processors often share a communication bus. When multiple parts of an application try to send data at the same time, this shared bus becomes congested, causing delays. Despite this being a common problem, few studies have looked at how to handle this communication bottleneck. To solve it, a new method called Contention-Aware Clustering-based List scheduling (CACL) is proposed. The objective of CACL is to minimize the overall schedule length for the input workflow application, modeled as a Directed Acyclic Graph (DAG), to be executed on an HCS interconnected via shared communication buses. While solving this problem, CACL first assigns priorities to task nodes. However, this priority assignment may occasionally give a task a higher priority than one or more of its predecessor tasks in the task graph. Since tasks are selected for processor assignment in order of priority, this situation subsequently leads to violation of the precedence relationships between tasks. In this comment, we present a counterexample to highlight the design flaw in the task prioritization scheme and discuss possible ways to fix it.
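To make the flaw concrete, a minimal check like the one below (not taken from the CACL paper) flags any task whose assigned priority exceeds that of one of its DAG predecessors, which is exactly the situation the comment's counterexample exploits. Task names and priority values are hypothetical.

```python
# Illustrative check: does a priority assignment respect DAG precedence?
def violates_precedence(edges, priority):
    """edges: iterable of (pred, succ) pairs; priority: dict task -> value
    (higher value = selected for processor assignment earlier)."""
    return [(u, v) for u, v in edges if priority[v] > priority[u]]

# Toy DAG: t1 -> t3 and t2 -> t3. A flawed rule that ranks t3 above its
# predecessor t2 is flagged here, since t3 could be picked before t2 finishes.
edges = [("t1", "t3"), ("t2", "t3")]
priority = {"t1": 10, "t2": 4, "t3": 7}
print(violates_precedence(edges, priority))   # -> [('t2', 't3')]
```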
{"title":"Comments on “Contention-aware workflow scheduling on heterogeneous computing systems with shared buses”","authors":"Rajesh Devaraj","doi":"10.1016/j.sysarc.2026.103700","DOIUrl":"10.1016/j.sysarc.2026.103700","url":null,"abstract":"<div><div>Heterogeneous computing systems (HCSs) use different types of processors to balance performance and efficiency for complex applications like workflows. These processors often share a communication bus. When multiple parts of an application try to send data at the same time, this shared bus gets congested, causing delays. Despite this being a common problem, few studies have looked at how to handle this communication bottleneck. To solve this, a new method called <em>Contention-Aware Clustering-based List scheduling</em> (CACL) is proposed. The objective of <em>CACL</em> is to minimize the overall schedule length for the input workflow application modeled as a Directed Acyclic Graph (DAG) to be executed on a HCS interconnected via shared communication buses. While solving this problem, <em>CACL</em> first assigns priorities to task nodes. However, this task priority assignment may occasionally lead to situations where a task is erroneously assigned higher priority compared to one or more of its predecessor tasks in the task graph. Since tasks are selected for processor assignment in the order of their priorities, this situation subsequently leads to violation of precedence relationships between tasks. In this comment, we present a counter example to highlight the design flaw in the task prioritization scheme and discuss possible ways to fix this flaw.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103700"},"PeriodicalIF":4.1,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-27, DOI: 10.1016/j.sysarc.2026.103725
Yi-Wen Zhang, Quan-Huang Zhang
Prior work on mixed-criticality scheduling with resource synchronization based on Earliest Deadline First with Virtual Deadlines (EDF-VD) immediately abandons all low-criticality (LO) tasks when the system enters high-criticality (HI) mode, which is not reasonable in practical systems. In this paper, we address the scheduling problem of the imprecise mixed-criticality task model with shared resources, in which LO tasks continue to execute with a reduced time budget in HI mode. We then propose a new resource access protocol called IMC-SRP and outline some of its properties. Moreover, we present sufficient conditions for the schedulability analysis of IMC-SRP. To save energy, we propose a new algorithm called EAS-IMC-SRP. Finally, we use synthetic tasksets to evaluate the proposed algorithm.
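For context, the sketch below shows the classical EDF-VD virtual-deadline scaling that this line of work builds on, not the paper's IMC-SRP analysis itself; variable names and the example utilizations are illustrative.

```python
# Background sketch of standard EDF-VD: HI-criticality tasks run with shortened
# virtual deadlines x * D_i in LO mode, with x derived from LO/HI utilizations.
def edf_vd_scale(u_lo_lo, u_hi_lo, u_hi_hi):
    """u_lo_lo: LO-mode utilization of LO tasks; u_hi_lo / u_hi_hi: LO- and
    HI-mode utilization of HI tasks. Returns a feasible scaling factor x, or None."""
    if u_lo_lo + u_hi_lo > 1.0:
        return None                        # not schedulable even in LO mode
    x = u_hi_lo / (1.0 - u_lo_lo)          # smallest x keeping LO mode schedulable
    # Classical sufficient test: x*U_LO(LO) + U_HI(HI) <= 1 also covers HI mode.
    return x if x * u_lo_lo + u_hi_hi <= 1.0 else None

print(edf_vd_scale(u_lo_lo=0.3, u_hi_lo=0.2, u_hi_hi=0.5))   # ~0.286
```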
{"title":"EDF-VD-based energy efficient scheduling for imprecise mixed-criticality task with resource synchronization","authors":"Yi-Wen Zhang, Quan-Huang Zhang","doi":"10.1016/j.sysarc.2026.103725","DOIUrl":"10.1016/j.sysarc.2026.103725","url":null,"abstract":"<div><div>Prior work on mixed-criticality scheduling with resource synchronization based on Earliest Deadline First with Virtual Deadlines immediately abandons all low-criticality (LO) tasks when the system enters high-criticality (HI) mode, which is not reasonable in practical systems. In this paper, we address the scheduling problem of the imprecise mixed-criticality task model with shared resources in which LO tasks continue to execute with a reduced time budget in HI mode. Thereafter, we propose a new resource access protocol called IMC-SRP, and outline some properties of the IMC-SRP. Moreover, we present sufficient conditions for the schedulability analysis of the IMC-SRP. To save energy, we propose a new algorithm called EAS-IMC-SRP. Furthermore, we use synthetic tasksets to evaluate the proposed algorithm.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103725"},"PeriodicalIF":4.1,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-24, DOI: 10.1016/j.sysarc.2026.103719
Chenlu Xie, Xiaolin Gui
Internet of Medical Things (IoMT) devices generate huge amounts of real-time data on a daily basis, which can be analyzed by medical practitioners to optimize diagnosis and treatment. Due to the complexity of devices and users in IoMT systems, robust measures are needed to ensure security and quality of service or information. However, most existing schemes require a trusted central authority to generate secret keys for users, which is often impractical in real-world scenarios. Although many multi-authority access control schemes have been proposed to address this issue, they still lack stronger defense and supervision mechanisms to effectively regulate user access. In this paper, we propose a privacy-preserving multi-authority access control scheme that enables policy hiding and efficiently prevents malicious access attacks. Specifically, multiple untrusted authorities independently generate attribute keys through secure two-party computation and zero-knowledge proofs. Even if multiple authorities collude, they cannot trace secret keys. Furthermore, the scheme enhances privacy by breaking the mapping between attributes and the access matrix. Moreover, we construct a dynamic access control mechanism based on trust management, which can effectively curb persistent access attacks by malicious data users. Our security analysis and experimental results show that the scheme achieves semantic security, resists collusion attacks, and constrains the malicious behavior of data users with minimal online encryption computational costs compared to other schemes.
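A hypothetical sketch of the kind of trust-management rule such a dynamic access control mechanism could apply is shown below; the class name, update rule, and parameters are assumptions for illustration, not the paper's actual construction.

```python
# Illustrative trust-based dynamic access control: trust drops sharply on flagged
# (malicious-looking) requests, recovers slowly on benign ones, and access is
# denied once trust falls below a threshold.
class TrustManager:
    def __init__(self, threshold=0.4, penalty=0.3, recovery=0.02):
        self.trust = {}                      # user id -> score in [0, 1]
        self.threshold, self.penalty, self.recovery = threshold, penalty, recovery

    def record(self, user, flagged):
        t = self.trust.get(user, 1.0)
        t = max(0.0, t - self.penalty) if flagged else min(1.0, t + self.recovery)
        self.trust[user] = t

    def allowed(self, user):
        return self.trust.get(user, 1.0) >= self.threshold

tm = TrustManager()
for _ in range(3):                           # three suspicious requests in a row
    tm.record("user42", flagged=True)
print(tm.allowed("user42"))                  # False: persistent malicious access is curbed
```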
{"title":"Privacy-preserving access control and trust management for multi-authority in IoMT systems","authors":"Chenlu Xie, Xiaolin Gui","doi":"10.1016/j.sysarc.2026.103719","DOIUrl":"10.1016/j.sysarc.2026.103719","url":null,"abstract":"<div><div>Internet of Medical Things (IoMT) devices generate a huge amount of real-time data on a daily basis, which can be analyzed by medical practitioners to optimize diagnosis and treatment. Due to the complexity of devices and users in IoMT systems, robust measures are needed to ensure security and quality of service or information. However, most existing schemes require a trusted central authority to generate secret keys for users, which is often impractical in real-world scenarios. Although many multi-authority access control schemes have been proposed to address this issue, they still lack stronger defense and supervision mechanisms to effectively regulate user access. In this paper, we propose a privacy-preserving multi-authority access control scheme that enables policy hiding and efficiently prevents malicious access attacks. Specifically, multiple untrusted authorities independently generate attribute keys through secure two-party computation and zero-knowledge proofs. Even if multiple authorities collude, they cannot trace secret keys. Furthermore, the scheme enhances privacy by breaking the mapping between attributes and the access matrix. What is more, we also construct a dynamic access control mechanism based on trust management, which can effectively curb persistent access attacks by malicious data users. Our security analysis and experimental results show that the scheme achieves semantic security, resists collusion attacks, and constrains the malicious behavior of data users with minimal online encryption computational costs compared to other schemes.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103719"},"PeriodicalIF":4.1,"publicationDate":"2026-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-24, DOI: 10.1016/j.sysarc.2026.103698
Rui Wang, Shichun Yang, Yuyi Chen, Zhuoyang Li, Jiayi Lu, Zexiang Tong, Jianyi Xu, Bin Sun, Xinjie Feng, Yaoguang Cao
Road terrain conditions are vital for ensuring the driving safety of autonomous vehicles (AVs). However, traditional sensors such as cameras and LiDARs are sensitive to changes in lighting and weather, posing challenges for real-time road condition perception. In this paper, we propose an illumination-aware visual–tactile fusion system (IVTF) for terrain perception, integrating visual and tactile data while optimizing the fusion process based on illumination characteristics. The system employs a camera and an intelligent tire to capture visual and tactile data across various lighting conditions and vehicle speeds. Additionally, we design a visual–tactile fusion module that dynamically adjusts the weights of different modalities according to illumination features. Comparative results with single-modality perception methods demonstrate the superior ability of visual–tactile fusion to accurately perceive road terrains under diverse lighting conditions. This approach significantly advances the robustness and reliability of terrain perception in AVs, contributing to enhanced driving safety.
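As an illustration of illumination-conditioned fusion (not the actual IVTF module), the sketch below gates visual and tactile feature vectors with weights derived from a scalar illumination estimate; the gating function and feature dimensions are assumed for the example.

```python
# Illustrative illumination-gated fusion of visual and tactile feature vectors.
import numpy as np

def fuse(visual_feat, tactile_feat, illumination):
    """illumination in [0, 1]: 0 = dark, 1 = bright. Dim scenes shift weight to tactile."""
    logits = np.array([4.0 * (illumination - 0.5), 4.0 * (0.5 - illumination)])
    w_vis, w_tac = np.exp(logits) / np.exp(logits).sum()   # softmax gate over modalities
    return w_vis * visual_feat + w_tac * tactile_feat

v, t = np.random.rand(128), np.random.rand(128)
fused_night = fuse(v, t, illumination=0.1)   # tactile-dominated at night
fused_day = fuse(v, t, illumination=0.9)     # vision-dominated in daylight
```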
{"title":"A visual–tactile fusion system for terrain perception under varying illumination conditions","authors":"Rui Wang , Shichun Yang , Yuyi Chen , Zhuoyang Li , Jiayi Lu , Zexiang Tong , Jianyi Xu , Bin Sun , Xinjie Feng , Yaoguang Cao","doi":"10.1016/j.sysarc.2026.103698","DOIUrl":"10.1016/j.sysarc.2026.103698","url":null,"abstract":"<div><div>Road terrain conditions are vital for ensuring the driving safety of autonomous vehicles (AVs). However, traditional sensors like cameras and LiDARs are sensitive to changes in lighting and weather, posing challenges for real-time road condition perception. In this paper, we propose an illumination-aware visual–tactile fusion system (IVTF) for terrain perception, integrating visual and tactile data while optimizing the fusion process based on illumination characteristics. The system employs a camera and an intelligent tire to capture visual and tactile data across various lighting conditions and vehicle speeds. Additionally, we also design a visual–tactile fusion module that dynamically adjusts the weights of different modalities according to illumination features. Comparative results with single-modality perception methods demonstrate the superior ability of visual–tactile fusion to accurately perceive road terrains under diverse lighting conditions. This approach significantly advances the robustness and reliability of terrain perception in AVs, contributing to enhanced driving safety.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"174 ","pages":"Article 103698"},"PeriodicalIF":4.1,"publicationDate":"2026-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-22, DOI: 10.1016/j.sysarc.2026.103685
Kanghua Mo, Zhengxin Zhang, Yuanzhi Zhang, Yucheng Long, Zhengdao Li
Recent studies have demonstrated that policy manipulation attacks on deep reinforcement learning (DRL) systems can lead to the learning of abnormal policies by victim agents. However, existing work typically assumes that the attacker can manipulate multiple components of the training process, such as reward functions, environment dynamics, or state information. In IoT-enabled smart societies, where AI-driven systems operate in interconnected and data-sensitive environments, such assumptions raise serious concerns regarding security and privacy. This paper investigates a novel policy manipulation attack in competitive multi-agent reinforcement learning under significantly weaker assumptions, where the attacker only requires access to the victim’s training settings and, in some cases, the learned policy outputs during training. We propose the honeypot policy attack (HPA), in which an adversarial agent induces the victim to learn an attacker-specified target policy by deliberately taking suboptimal actions. To this end, we introduce a honeypot reward estimation mechanism that quantifies the amount of reward sacrifice required by the adversarial agent to influence the victim’s learning process, and adapts this sacrifice according to the degree of policy manipulation. Extensive experiments on three representative competitive games demonstrate that HPA is both effective and stealthy, exposing previously unexplored vulnerabilities in DRL-based systems deployed in IoT-driven smart environments. To the best of our knowledge, this work presents the first policy manipulation attack that does not rely on explicit tampering with internal components of DRL systems, but instead operates solely through admissible adversarial interactions, offering new insights into security challenges faced by emerging AIoT ecosystems.
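A hypothetical sketch of the adaptive reward-sacrifice idea follows: the adversary scales how much reward it is willing to give up by how far the victim's current policy still is from the attacker-specified target policy. The distance measure and scaling rule are illustrative assumptions, not HPA's actual estimator.

```python
# Illustrative reward-sacrifice budget: larger policy gap -> larger sacrifice.
import numpy as np

def sacrifice_budget(victim_policy, target_policy, max_sacrifice=1.0):
    """Both policies are probability distributions over actions for a given state.
    The gap is the total-variation distance between them."""
    gap = 0.5 * np.abs(victim_policy - target_policy).sum()
    return max_sacrifice * gap

victim = np.array([0.7, 0.2, 0.1])
target = np.array([0.1, 0.1, 0.8])
print(sacrifice_budget(victim, target))   # 0.7 of the maximum budget
```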
{"title":"HPA: Manipulating deep reinforcement learning via adversarial interaction","authors":"Kanghua Mo , Zhengxin Zhang , Yuanzhi Zhang , Yucheng Long , Zhengdao Li","doi":"10.1016/j.sysarc.2026.103685","DOIUrl":"10.1016/j.sysarc.2026.103685","url":null,"abstract":"<div><div>Recent studies have demonstrated that policy manipulation attacks on deep reinforcement learning (DRL) systems can lead to the learning of abnormal policies by victim agents. However, existing work typically assumes that the attacker can manipulate multiple components of the training process, such as reward functions, environment dynamics, or state information. In IoT-enabled smart societies, where AI-driven systems operate in interconnected and data-sensitive environments, such assumptions raise serious concerns regarding security and privacy. This paper investigates a novel policy manipulation attack in competitive multi-agent reinforcement learning under significantly weaker assumptions, where the attacker only requires access to the victim’s training settings and, in some cases, the learned policy outputs during training. We propose the honeypot policy attack (HPA), in which an adversarial agent induces the victim to learn an attacker-specified target policy by deliberately taking suboptimal actions. To this end, we introduce a honeypot reward estimation mechanism that quantifies the amount of reward sacrifice required by the adversarial agent to influence the victim’s learning process, and adapts this sacrifice according to the degree of policy manipulation. Extensive experiments on three representative competitive games demonstrate that HPA is both effective and stealthy, exposing previously unexplored vulnerabilities in DRL-based systems deployed in IoT-driven smart environments. To the best of our knowledge, this work presents the first policy manipulation attack that does not rely on explicit tampering with internal components of DRL systems, but instead operates solely through admissible adversarial interactions, offering new insights into security challenges faced by emerging AIoT ecosystems.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103685"},"PeriodicalIF":4.1,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-21, DOI: 10.1016/j.sysarc.2026.103721
Yonghua Hu, Anxing Xie, Yaohua Wang, Zhe Li, Zenghua Cheng, Junyang Tang
Automatically generating high-performance tensor programs has become a promising approach for deploying deep neural networks. A key challenge lies in designing an effective cost model to navigate the vast scheduling search space. Existing approaches typically fall into two categories, each with limitations: offline learning cost models rely on large pre-collected datasets, which may be incomplete or device-specific, while online learning cost models depend on handcrafted features, requiring substantial manual effort and expertise.
We propose GAS, a lightweight framework for generating tensor programs for deep learning applications. GAS reformulates feature extraction as a sequence-dependent analysis of scheduling primitives. Our cost model integrates three key factors to uncover performance-critical insights within scheduling sequences: (1) decision factors allocation, quantifying the entropy and skewness of scheduling primitive factors to capture their dominance; (2) primitive contribution weights, measuring the relative impact of primitives on overall performance; and (3) structural semantic alignment, capturing correlations between scheduling primitive factors and hardware parallelism mechanisms. This approach reduces the need for handcrafted feature engineering and extensive pre-training datasets, significantly improving both efficiency and scalability. Experimental results on NVIDIA GPUs demonstrate that GAS achieves average speedups of 3.79× over AMOS and 2.22× over Ansor, while also consistently outperforming other state-of-the-art tensor compilers.
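As a toy illustration of the first ingredient, decision factors allocation, the snippet below computes the entropy and skewness of a made-up set of split factors; the values and any resemblance to GAS's real feature pipeline are assumptions.

```python
# Entropy and skewness of a scheduling primitive's split factors (illustrative values).
import numpy as np
from scipy.stats import entropy, skew

split_factors = np.array([1, 2, 4, 4, 8, 32])      # e.g. tile sizes chosen by a split primitive
p = split_factors / split_factors.sum()            # normalize to a distribution
print("entropy:", entropy(p), "skewness:", skew(split_factors.astype(float)))
```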
{"title":"GAS: A scheduling primitive dependency analysis-based cost model for tensor program optimization","authors":"Yonghua Hu , Anxing Xie , Yaohua Wang , Zhe Li , Zenghua Cheng , Junyang Tang","doi":"10.1016/j.sysarc.2026.103721","DOIUrl":"10.1016/j.sysarc.2026.103721","url":null,"abstract":"<div><div>Automatically generating high-performance tensor programs has become a promising approach for deploying deep neural networks. A key challenge lies in designing an effective cost model to navigate the vast scheduling search space. Existing approaches typically fall into two categories, each with limitations: offline learning cost models rely on large pre-collected datasets, which may be incomplete or device-specific, and online learning cost models depend on handcrafted features, requiring substantial manual effort and expertise.</div><div>We propose GAS, a lightweight framework for generating tensor programs for deep learning applications. GAS reformulates feature extraction as a sequence-dependent analysis of scheduling primitives. Our cost model integrates three key factors to uncover performance-critical insights within scheduling sequences: (1) decision factors allocation, quantifying entropy and skewness of scheduling primitive factors to capture their dominance; (2) primitive contribution weights, measuring the relative impact of primitives on overall performance; and (3) structural semantic alignment, capturing correlations between scheduling primitive factors and hardware parallelism mechanisms. This approach reduces the complexity of handcrafted feature engineering and extensive pre-training datasets, significantly improving both efficiency and scalability. Experimental results on NVIDIA GPUs demonstrate that GAS achieves average speedups of 3.79<span><math><mo>×</mo></math></span> over AMOS and 2.22<span><math><mo>×</mo></math></span> over Ansor, while also consistently outperforming other state-of-the-art tensor compilers.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103721"},"PeriodicalIF":4.1,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146039223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-21, DOI: 10.1016/j.sysarc.2026.103701
Tuan-Dung Tran, Phuong-Dai Bui, Van-Hau Pham
Enabling AI-driven real-time distributed computing on the edge-cloud continuum requires overcoming a critical dependability challenge: resource-constrained IoT devices cannot participate in Byzantine-resilient federated learning due to a 1940-fold memory gap, with robust aggregation methods demanding 512MB–2GB while microcontrollers offer only 264KB SRAM. We present EdgeTrust-Shard, a novel system architecture designed for dependability, security, and scalability in edge AI. It enables real-time Byzantine-resilient federated learning on commodity microcontrollers by distributing computational complexity across the network topology. The framework's contributions include optimal M = √N clustering for O(N) communication, a Multi-Factor Proof-of-Performance consensus mechanism providing quadratic Byzantine suppression with proven O(T^(-1/2)) convergence, and platform-optimized cryptography delivering a 3.4-fold speedup for real-time processing. A case study using a hybrid physical-simulation deployment demonstrates the system's efficacy, achieving 93.9–94.7% accuracy across Byzantine attack scenarios at 30% adversary presence within a 140KB memory footprint on Raspberry Pi Pico nodes. By outperforming adapted state-of-the-art blockchain-FL systems like FedChain and BlockFL by up to 9.3 percentage points, EdgeTrust-Shard provides a critical security enhancement for the edge-cloud continuum, transforming passive IoT data sources into dependable participants in distributed trust computations for next-generation applications such as smart cities and industrial automation.
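A small back-of-the-envelope sketch of why M = √N clustering keeps per-round communication linear in N is given below; the message-counting model is a simplification for illustration, not the paper's analysis.

```python
# With N devices grouped into M ~ sqrt(N) clusters, each device sends one update to its
# cluster head (N messages) and each head forwards one aggregate (M messages), so the
# total per round stays O(N).
import math

def message_count(n_devices):
    m = round(math.sqrt(n_devices))   # number of clusters ~ sqrt(N)
    intra = n_devices                 # every device sends one update to its head
    inter = m                         # every head forwards one aggregate
    return intra + inter

for n in (100, 1024, 10000):
    print(n, "devices ->", message_count(n), "messages per round")
```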
{"title":"EdgeTrust-Shard: Hierarchical blockchain architecture for federated learning in cross-chain IoT ecosystems","authors":"Tuan-Dung Tran, Phuong-Dai Bui, Van-Hau Pham","doi":"10.1016/j.sysarc.2026.103701","DOIUrl":"10.1016/j.sysarc.2026.103701","url":null,"abstract":"<div><div>Enabling AI-driven real-time distributed computing on the edge-cloud continuum requires overcoming a critical dependability challenge: resource-constrained IoT devices cannot participate in Byzantine-resilient federated learning due to a 1940-fold memory gap, with robust aggregation methods demanding 512MB–2GB while microcontrollers offer only 264KB SRAM. We present EdgeTrust-Shard, a novel system architecture designed for dependability, security, and scalability in edge AI. It enables real-time Byzantine-resilient federated learning on commodity microcontrollers by distributing computational complexity across the network topology. The framework’s contributions include optimal <span><math><mrow><mi>M</mi><mo>=</mo><msqrt><mrow><mi>N</mi></mrow></msqrt></mrow></math></span> clustering for <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mi>N</mi><mo>)</mo></mrow></mrow></math></span> communication, a Multi-Factor Proof-of-Performance consensus mechanism providing quadratic Byzantine suppression with proven <span><math><mrow><mi>O</mi><mrow><mo>(</mo><msup><mrow><mi>T</mi></mrow><mrow><mo>−</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msup><mo>)</mo></mrow></mrow></math></span> convergence, and platform-optimized cryptography delivering a 3.4-fold speedup for real-time processing. A case study using a hybrid physical-simulation deployment demonstrates the system’s efficacy, achieving 93.9–94.7% accuracy across Byzantine attack scenarios at 30% adversary presence within a 140KB memory footprint on Raspberry Pi Pico nodes. By outperforming adapted state-of-the-art blockchain-FL systems like FedChain and BlockFL by up to 9.3 percentage points, EdgeTrust-Shard provides a critical security enhancement for the edge-cloud continuum, transforming passive IoT data sources into dependable participants in distributed trust computations for next-generation applications such as smart cities and industrial automation.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103701"},"PeriodicalIF":4.1,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146090346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-17, DOI: 10.1016/j.sysarc.2026.103702
Daria Bromot, Yehuda Kra, Zuher Jahshan, Esteban Garzón, Adam Teman, Leonid Yavits
We propose GenMClass, a genome classification system-on-chip (SoC) implementing two different classification approaches and comprising two separate classification engines: a DNN accelerator, GenDNN, which classifies DNA reads converted to images using a classification neural network, and a similarity-search-capable Error Tolerant Content Addressable Memory (ETCAM), which classifies genomes by k-mer matching. Classification operations are controlled by an embedded RISC-V processor. The GenMClass classification platform was designed and manufactured in a commercial 65 nm process. We conduct a comparative analysis of ETCAM and GenDNN classification efficiency as well as their performance, silicon area, and power consumption using silicon measurements. The GenMClass SoC occupies 3.4 mm², and its total power consumption (assuming both GenDNN and ETCAM perform classification at the same time) is 144 mW. This allows GenMClass to be used as a portable classifier for pathogen surveillance during pandemics, food safety and environmental monitoring, and agricultural pathogen and antimicrobial resistance control, in the field or at points of care.
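As a software stand-in for the k-mer matching performed by the ETCAM engine (the hardware does this with error-tolerant associative search), the sketch below assigns a read to the reference genome whose k-mer set it overlaps most; the sequences and parameters are illustrative.

```python
# Illustrative k-mer matching classification (software approximation, not the CAM hardware).
def kmers(seq, k=8):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def classify(read, references, k=8):
    """references: dict name -> sequence. Returns the best-matching reference name."""
    rk = kmers(read, k)
    return max(references, key=lambda name: len(rk & kmers(references[name], k)))

refs = {"phageA": "ACGTACGTTGCAACGTTAGC" * 3, "phageB": "TTGCCATGGCAATCCGGATT" * 3}
print(classify("ACGTACGTTGCAACG", refs, k=8))   # -> phageA
```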
{"title":"GenMClass: Design and comparative analysis of genome classifier-on-chip platform","authors":"Daria Bromot , Yehuda Kra , Zuher Jahshan , Esteban Garzón , Adam Teman , Leonid Yavits","doi":"10.1016/j.sysarc.2026.103702","DOIUrl":"10.1016/j.sysarc.2026.103702","url":null,"abstract":"<div><div>We propose GenMClass, a genome classification system-on-chip (SoC) implementing two different classification approaches and comprising two separate classification engines: a DNN accelerator GenDNN, that classifies DNA reads converted to images using a classification neural network, and a similarity search-capable Error Tolerant Content Addressable Memory ETCAM, that classifies genomes by k-mer matching. Classification operations are controlled by an embedded RISCV processor. GenMClass classification platform was designed and manufactured in a commercial 65 nm process. We conduct a comparative analysis of ETCAM and GenDNN classification efficiency as well as their performance, silicon area and power consumption using silicon measurements. The size of GenMClass SoC is 3.4 mm<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span> and its total power consumption (assuming both GenDNN and ETCAM perform classification at the same time) is 144 mW. This allows using GenMClass as a portable classifier for pathogen surveillance during pandemics, food safety and environmental monitoring, agriculture pathogen and antimicrobial resistance control, in the field or at points of care.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103702"},"PeriodicalIF":4.1,"publicationDate":"2026-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-16, DOI: 10.1016/j.sysarc.2026.103707
Santosh K. Smmarwar, Rahul Priyadarshi, Pratik Angaitkar, Subodh Mishra, Rajkumar Singh Rathore
The rapid evolution of Artificial Intelligence of Things (AIoT) is accelerating the development of smart societies, where interconnected consumer electronics such as smartphones, IoT devices, smart meters, and surveillance systems play a crucial role in optimizing operational efficiency and service delivery. However, this hyper-connected digital ecosystem is increasingly vulnerable to sophisticated Android malware attacks that exploit system weaknesses, disrupt services, and compromise data privacy and integrity. These malware variants leverage advanced evasion techniques, including permission abuse, dynamic runtime manipulation, and memory-based obfuscation, rendering traditional detection methods ineffective. The key challenges in securing AIoT-driven smart societies include managing high-dimensional feature spaces, detecting dynamically evolving malware behaviours, and ensuring real-time classification performance. To address these issues, this paper proposes an AI-powered Android Malware Detection (AIMD) framework designed for AIoT-enabled smart society environments. The framework extracts multi-level features (permissions, intents, API calls, and obfuscated memory patterns) from Android APK files and employs graph embedding techniques (DeepWalk and Node2Vec) for dimensionality reduction. Feature selection is optimized using the Red Deer Algorithm (RDA), a metaheuristic approach, while classification is performed through an ensemble of machine learning models (Support Vector Machine, Decision Tree, Random Forest, Extra Trees) enhanced by bagging, boosting, stacking, and soft voting techniques. Experimental evaluations on the CICInvesAndMal2019 and CICMalMem2022 datasets demonstrate the effectiveness of the proposed system, achieving malware detection accuracies of 98.78% and 99.99%, respectively. By integrating AI-driven malware detection into AIoT infrastructures, this research advances cybersecurity resilience, safeguarding smart societies against emerging threats in an increasingly connected world.
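A minimal sketch of the soft-voting ensemble stage on synthetic data is shown below; it assumes the graph-embedding and Red Deer feature-selection steps have already produced the numeric feature matrix, and it uses standard scikit-learn components rather than the paper's tuned configuration.

```python
# Soft-voting ensemble over the four base learners named in the abstract (illustrative setup).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the selected feature matrix of benign/malware samples.
X, y = make_classification(n_samples=400, n_features=32, random_state=0)

vote = VotingClassifier(
    estimators=[("svm", SVC(probability=True)),        # probability=True enables soft voting
                ("dt", DecisionTreeClassifier()),
                ("rf", RandomForestClassifier()),
                ("et", ExtraTreesClassifier())],
    voting="soft")
vote.fit(X, y)
print("training accuracy:", vote.score(X, y))
```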
{"title":"AIMD: AI-powered android malware detection for securing AIoT devices and networks using graph embedding and ensemble learning","authors":"Santosh K. Smmarwar , Rahul Priyadarshi , Pratik Angaitkar , Subodh Mishra , Rajkumar Singh Rathore","doi":"10.1016/j.sysarc.2026.103707","DOIUrl":"10.1016/j.sysarc.2026.103707","url":null,"abstract":"<div><div>The rapid evolution of Artificial Intelligence of Things (AIoT) is accelerating the development of smart societies, where interconnected consumer electronics such as smartphones, IoT devices, smart meters, and surveillance systems play a crucial role in optimizing operational efficiency and service delivery. However, this hyper-connected digital ecosystem is increasingly vulnerable to sophisticated Android malware attacks that exploit system weaknesses, disrupt services, and compromise data privacy and integrity. These malware variants leverage advanced evasion techniques, including permission abuse, dynamic runtime manipulation, and memory-based obfuscation, rendering traditional detection methods ineffective. The key challenges in securing AIoT-driven smart societies include managing high-dimensional feature spaces, detecting dynamically evolving malware behaviours, and ensuring real-time classification performance. To address these issues, this paper proposed an AI-powered Android Malware Detection (AIMD) framework designed for AIoT-enabled smart society environments. The framework extracts multi-level features (permissions, intents, API calls, and obfuscated memory patterns) from Android APK files and employs graph embedding techniques (DeepWalk and Node2Vec) for dimensionality reduction. Feature selection is optimized using the Red Deer Algorithm (RDA), a metaheuristic approach, while classification is performed through an ensemble of machine learning models (Support Vector Machine, Decision Tree, Random Forest, Extra Trees) enhanced by bagging, boosting, stacking, and soft voting techniques. Experimental evaluations on CICInvesAndMal2019 and CICMalMem2022 datasets demonstrate the effectiveness of the proposed system, achieving malware detection accuracies of 98.78% and 99.99%, respectively. By integrating AI-driven malware detection into AIoT infrastructures, this research advances cybersecurity resilience, safeguarding smart societies against emerging threats in an increasingly connected world.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"173 ","pages":"Article 103707"},"PeriodicalIF":4.1,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146039168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}