Pub Date: 2026-06-01 | Epub Date: 2026-03-10 | DOI: 10.1016/j.sysarc.2026.103756
Binqi Li, Yuan Zhu, Yingqi Zhao, Ke Cui, Xu Zhong, Ke Lu
The integration of the Data Distribution Service (DDS) with Time-Sensitive Networking (TSN) enables deterministic communication for distributed cyber–physical systems. However, existing studies mainly address real-time performance, with limited attention to reliability. This paper presents a Software-Defined Networking (SDN)-based DDS–TSN integration framework incorporating the IEEE 802.1CB Frame Replication and Elimination for Reliability (FRER) mechanism. The framework automatically extracts redundancy requirements from DDS publishers and applies a multicast transformation mechanism for CB flow identification. A heuristic redundancy-aware routing and scheduling (Ra-RaS) algorithm is developed to jointly optimize redundant path selection and time-aware scheduling, reducing path length deviation and improving load balance. Simulation results show that the proposed Ra-RaS algorithm maintains high schedulability and scalability even under high traffic load. Physical experiments further demonstrate that the integrated DDS–TSN framework ensures deterministic and fault-tolerant communication by leveraging the joint operation of FRER and the Time-Aware Shaper (TAS), significantly enhancing the reliability of DDS under link failure and packet loss.
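The replicate-then-eliminate idea behind FRER can be illustrated with a minimal sketch. It is not the sequence recovery algorithm specified in IEEE 802.1CB and not the authors' implementation: the `DuplicateEliminator` class, its `history_length` parameter, and the sample traffic are all hypothetical, and the sketch assumes a short history of accepted sequence numbers is enough to spot duplicates.

```python
from collections import deque


class DuplicateEliminator:
    """Simplified stand-in for an FRER sequence recovery function.

    Assumption: a fixed-length history of recently accepted sequence
    numbers is enough to recognise duplicates in this toy setting.
    """

    def __init__(self, history_length: int = 8) -> None:
        self.history = deque(maxlen=history_length)

    def accept(self, seq: int) -> bool:
        """Return True the first time a sequence number is seen."""
        if seq in self.history:
            return False  # duplicate copy from the other path: discard
        self.history.append(seq)
        return True


def eliminate(arrivals, history_length: int = 8):
    """Pass each sequence number on once (within the history window)."""
    eliminator = DuplicateEliminator(history_length)
    return [seq for seq in arrivals if eliminator.accept(seq)]


# Hypothetical traffic: five frames (0..4) replicated onto two paths.
path_a = [0, 1, 3, 4]  # frame 2 lost on path A
path_b = [0, 1, 2, 3]  # frame 4 lost on path B
# Interleave the two paths to mimic arrival at the merging point.
arrivals = [seq for pair in zip(path_a, path_b) for seq in pair]
delivered = eliminate(arrivals)
```

Every frame reaches the listener as long as at least one path carries it, which is the property the abstract relies on under link failure and packet loss. Frames may arrive out of order in this sketch, since it makes no attempt at reordering.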
Title: Fault tolerance DDS communications in Time-Sensitive Networking using IEEE 802.1CB redundancy mechanisms
Source: Journal of Systems Architecture, vol. 175, Article 103756
Pub Date: 2026-06-01 | Epub Date: 2026-02-23 | DOI: 10.1016/j.sysarc.2026.103736
Hector Gerardo Muñoz Hernandez, Mahdi Taheri, Muhammad Ali, Keyvan Shahin, Alireza Syavashi, Diana Göhringer, Marc Reichenbach, Christian Herglotz, Michael Hübner
Image and signal processing workloads are widely deployed on Graphics Processing Units (GPUs) for high throughput and on Field-Programmable Gate Arrays (FPGAs) for hardware specialization and energy efficiency. Soft GPU overlays on FPGAs aim to combine these advantages, yet existing solutions often depend on fixed hard processors or impose platform constraints that limit portability. This work extends a popular open-source soft GPGPU overlay to integrate a soft RISC-V control plane and enable compatibility with High-Bandwidth Memory (HBM2). The resulting system can be instantiated on FPGA boards without a hard ARM processor, improving portability, simplifying system integration, and broadening deployability. Across representative image and signal processing kernels, the soft GPGPU achieves geometric-mean speedups of 114.60× over a scalar soft RISC-V core and 19.72× over a hard ARM core, demonstrating substantial performance benefits while retaining FPGA reconfigurability. HBM2 integration further benefits bandwidth-sensitive workloads by increasing sustained throughput and reducing the performance bottlenecks associated with off-chip memory access. Collectively, these results indicate that GPU-like programmability and performance can be delivered on reconfigurable platforms without reliance on hard CPU subsystems, providing a portable and scalable foundation for embedded vision and DSP acceleration.
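For readers unfamiliar with the metric, a geometric-mean speedup over a set of kernels can be computed as in the short sketch below. The function name and the sample timings are hypothetical. The abstract does not list per-kernel runtimes, so nothing here reproduces the reported figures.

```python
import math


def geometric_mean_speedup(baseline_times, accelerated_times):
    """Geometric mean of per-kernel speedups (baseline / accelerated)."""
    ratios = [b / a for b, a in zip(baseline_times, accelerated_times)]
    # Average in log space, then exponentiate: exp(mean(log(r_i))).
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical runtimes in seconds for two kernels.
example = geometric_mean_speedup([8.0, 2.0], [1.0, 1.0])
```

The geometric mean is the usual choice for averaging ratios because a single outlier kernel influences it less than it would an arithmetic mean.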
Title: Integrating an open-source soft-GPU overlay with RISC-V control and high-bandwidth memory
Source: Journal of Systems Architecture, vol. 175, Article 103736
Pub Date: 2026-05-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.sysarc.2026.103734
Yonglin Zhou, Shi Wang, Qiu Fang, Yaonan Wang
UAV-assisted offloading presents an effective solution for areas with limited infrastructure or temporary congestion. Yet jointly optimising task allocation and trajectory control is difficult due to heterogeneous GU demands and UAV energy/kinematic limits. We propose a two-layer architecture that decouples discrete allocation from continuous control. In the allocation layer, a responder-only external incentive gates accept/reject decisions in an auction–predation–trade scheme, without altering the potential, learning targets, or metrics. This mechanism induces monotone ascent of the potential and terminates in finitely many steps at a pairwise-stable (Nash) allocation. The control layer adopts centralised training with decentralised execution (MADDPG) with a centralised critic and task-specific actor heads to generate 3D trajectories subject to flight and safety constraints. This partition confines the combinatorial complexity to the outer allocation, resulting in a lighter on-board inference than a monolithic policy that jointly handles both allocation and planning. In a 3-UAV/7-GU simulator with obstacles and heterogeneous task sizes, the proposed method achieves higher global reward (2.02 vs 1.91) and shorter completion time (92 s vs 109 s) than a reproduced game-theoretic baseline. Compared with end-to-end MADDPG without prior allocation, it also converges faster and more stably.
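The claim that monotone ascent of a potential forces termination in finitely many steps can be seen in a generic better-response sketch. This is not the authors' auction–predation–trade scheme: the swap rule, the `value` table, and all names are assumptions made for illustration. Each accepted swap strictly increases the potential, and there are only finitely many allocations, so the loop must stop at an allocation that no pairwise swap improves.

```python
from itertools import combinations


def potential(assign, value):
    """Sum of value[uav][gu] over the current assignment."""
    return sum(value[uav][gu] for gu, uav in enumerate(assign))


def improve_by_swaps(assign, value):
    """Apply strictly improving pairwise swaps until none remains."""
    assign = list(assign)
    improved = True
    while improved:
        improved = False
        for g, h in combinations(range(len(assign)), 2):
            if assign[g] == assign[h]:
                continue  # same UAV: swapping changes nothing
            trial = list(assign)
            trial[g], trial[h] = trial[h], trial[g]
            if potential(trial, value) > potential(assign, value):
                assign = trial
                improved = True
                break  # restart the scan from the new allocation
    return assign


# Hypothetical payoff table: value[uav][gu] for 2 UAVs and 4 ground users.
value = [[5, 1, 3, 2],
         [2, 4, 1, 6]]
start = [1, 0, 0, 1]  # start[gu] = index of the UAV serving that user
final = improve_by_swaps(start, value)
```

Swapping keeps each UAV's load fixed, which loosely stands in for capacity limits. A real scheme would also have to handle energy and kinematic constraints.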
Title: A two-layer system architecture for Unmanned Aerial Vehicle (UAV)-assisted offloading: Game-theoretic allocation and CTDE-MADDPG control
Source: Journal of Systems Architecture, vol. 174, Article 103734
Pub Date: 2026-05-01 | Epub Date: 2026-02-05 | DOI: 10.1016/j.sysarc.2026.103730
Wenbin He, Shifeng Wang, Zhaohui Liu, Zhenying Li, Zhijun Fu, Dengfeng Zhao, Junjian Hou, Wuyi Ming
Distracted driving remains a leading cause of traffic fatalities worldwide, yet existing detection methods face significant challenges in real-time performance and complex driving scenarios. This review presents the first comprehensive analysis of deep learning approaches for distracted driving recognition, systematically addressing the fragmented research landscape. The study evaluates state-of-the-art architectures across three critical dimensions: (1) neural network designs ranging from CNN to Transformer, analyzing their effectiveness in capturing driver behavior patterns; (2) diverse data modalities including visual, physiological, and vehicle dynamics, with empirical comparisons of their discriminative power; and (3) fusion strategies that integrate heterogeneous information sources to enhance detection accuracy. Extensive empirical analysis reveals that hybrid approaches combining attention mechanisms with multimodal fusion achieve accuracy rates of up to 99.8%. Nevertheless, critical challenges persist, from inherent failure modes across modalities to deployment hurdles involving efficiency, energy, and privacy. Through a comprehensive examination of model-based versus deep learning approaches, various neural architectures, data modalities, fusion strategies, failure modes, and deployment challenges, this survey identifies current methodological limitations, uncovers underexplored research opportunities, and delineates future directions for advancing distracted driving recognition systems toward real-world deployment.
Title: Deep learning for distracted driving recognition with multisource data: A comprehensive review
Source: Journal of Systems Architecture, vol. 174, Article 103730
Pub Date: 2026-05-01 | Epub Date: 2026-02-07 | DOI: 10.1016/j.sysarc.2026.103735
Zijian Zhou, Xiaoheng Deng, Xinjun Pei, Yunlong Zhao, Yurong Qian, Shaohua Wan, Kaiping Xue
Achieving efficient repair with minimal redundancy remains a long-standing challenge for distributed cloud storage. Existing erasure codes such as Reed–Solomon (RS) and Local Reconstruction Codes (LRC) rely on global repair, which causes heavy repair bandwidth consumption and high node construction overhead. This paper presents ZJ Codes (ZJC), a novel erasure coding scheme that achieves fully local repair while maintaining strong availability. ZJC introduces an interleaved local-group structure in which each data block participates in two local parity groups, thus providing two independent recovery paths without any global parity. A new storage overhead model is further proposed, incorporating node construction cost into redundancy evaluation. We analytically prove that ZJC achieves lower storage overhead and constant repair bandwidth with only two helper blocks per repair. Both RS-based and XOR-based implementations are developed and experimentally deployed in a distributed system. Results show that ZJC improves encoding efficiency by up to 87.5% and recovery throughput by up to 96.5% compared with Reed–Solomon codes at equivalent redundancy. Compared with LRC, ZJC achieves encoding improvements of 62.3% and 86.5%, respectively. RS-based and XOR-based ZJC show recovery improvements of 53.9% and 96.5% compared with RS, respectively. Additionally, XOR-based ZJC outperforms LRC and RS-based ZJC by 77.5% and 90.8%, respectively. These results demonstrate that ZJC effectively reconciles the long-standing trade-off between low redundancy and high repair efficiency in distributed cloud storage.
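To make the "two local parity groups, two helper blocks" idea concrete, the sketch below uses a deliberately simple XOR ring. It is an assumption for illustration only and not the ZJC layout, whose group structure and storage overhead are defined in the paper. Here parity `i` covers data blocks `i` and `i + 1` (wrapping around), so every data block sits in exactly two groups and can be rebuilt from either one with two helper blocks.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data):
    """Hypothetical ring layout: parity[i] = data[i] XOR data[i + 1]."""
    n = len(data)
    return [xor_blocks(data[i], data[(i + 1) % n]) for i in range(n)]


def repair_via_right(data, parity, i):
    """Rebuild data[i] from parity[i] and the right-hand neighbour."""
    n = len(data)
    return xor_blocks(parity[i], data[(i + 1) % n])


def repair_via_left(data, parity, i):
    """Rebuild data[i] from parity[i - 1] and the left-hand neighbour."""
    n = len(data)
    return xor_blocks(parity[(i - 1) % n], data[(i - 1) % n])


# Hypothetical stripe of four 2-byte data blocks.
blocks = [b"ab", b"cd", b"ef", b"gh"]
parities = encode(blocks)
damaged = list(blocks)
damaged[2] = None  # simulate losing data block 2
recovered_right = repair_via_right(damaged, parities, 2)
recovered_left = repair_via_left(damaged, parities, 2)
```

Both recovery paths read exactly two surviving blocks and never touch a global parity. The toy ring stores one parity per data block, so it says nothing about the redundancy levels the paper evaluates.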
Title: ZJC: Constructing fully local repair in erasure codes for distributed cloud storage
Source: Journal of Systems Architecture, vol. 174, Article 103735
Pub Date: 2026-05-01 | Epub Date: 2026-02-04 | DOI: 10.1016/j.sysarc.2026.103726
Esteban Garzón, Benjamin Zambrano, David Sheinenzon, Marco Lanuzza, Adam Teman, Leonid Yavits
Sparse general matrix multiplication (SpGEMM) is a fundamental kernel in many scientific and engineering fields, including Artificial Intelligence (AI). However, its intrinsic computational complexity presents substantial challenges, making efficient hardware implementation particularly difficult. This paper proposes SPARCAM, a novel SpGEMM accelerator developed and optimized for highly energy-efficient AI edge applications. SPARCAM is designed using low-power, dense Gain Cell embedded DRAM (GC-eDRAM) technology, a processing-near-memory paradigm, and a modified outer-product matrix multiplication algorithm. Despite its limited peak theoretical performance, SPARCAM achieves very high energy efficiency due to its low-power architecture and almost 100% utilization of its computing resources. Designed in a commercial 28 nm FDSOI technology, SPARCAM achieves a 13.9× speedup over a high-performance embedded CPU when processing large-scale sparse matrices. When multiplying limited-size sparse matrices, SPARCAM obtains a 193× speedup over a high-performance GPU. SPARCAM reaches about 4.3 orders-of-magnitude, on average, higher energy benefits, and 1892×, 181×, 2×, and 3471× higher energy efficiency (over CPU) compared with state-of-the-art SpGEMM accelerators SpArch, OuterSPACE, MatRaptor, and high-performance GPU, respectively.
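The plain outer-product formulation of SpGEMM, on which the abstract says SPARCAM builds a modified variant, can be sketched as follows. The modification itself is not described in the abstract and is not reproduced here. The dictionary-based storage and the function name are assumptions chosen for readability rather than performance.

```python
from collections import defaultdict


def spgemm_outer(a_cols, b_rows):
    """Outer-product SpGEMM: C = sum over k of A[:, k] (outer) B[k, :].

    a_cols[k] lists (row, value) pairs for the nonzeros in column k of A.
    b_rows[k] lists (col, value) pairs for the nonzeros in row k of B.
    Returns C as a dict mapping (row, col) to the accumulated value.
    """
    c = defaultdict(float)
    # Only indices k present in both operands contribute partial products.
    for k in a_cols.keys() & b_rows.keys():
        for i, a_val in a_cols[k]:
            for j, b_val in b_rows[k]:
                c[(i, j)] += a_val * b_val  # merge partial products
    return dict(c)


# Hypothetical 2x2 example: A = [[1, 0], [0, 2]], B = [[0, 3], [4, 0]].
a_cols = {0: [(0, 1.0)], 1: [(1, 2.0)]}
b_rows = {0: [(1, 3.0)], 1: [(0, 4.0)]}
c = spgemm_outer(a_cols, b_rows)
```

Every multiplication in this formulation involves two nonzeros, so no work is spent on index matching inside the inner loops. The cost moves to merging partial products that land on the same output coordinate.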
Title: SPARCAM: Sparse matrix multiplication accelerator using multi-port dynamic CAM
Source: Journal of Systems Architecture, vol. 174, Article 103726
Data sharing plays a significant role in C2V communication in VANETs. To achieve data confidentiality, access control, data authentication, and efficiency in data sharing simultaneously, this paper proposes a post-quantum secure and efficient data sharing scheme for VANETs built on attribute-based encryption (ABE) over lattices. The proposed scheme is secure against selective-attribute chosen-plaintext attacks (IND-sAtt-CPA), and its security is proven under the hardness of the Learning With Errors (LWE) problem in the random oracle model. Because the scheme is built on ABE, it also provides fine-grained access control over the shared data. Furthermore, a signature on the shared ciphertext is generated before the data is shared, so the proposed scheme can resist common network attacks such as replay and impersonation attacks. The space and computation analysis shows that the proposed scheme has a shorter ciphertext and lower encryption/decryption costs, as measured with a Java implementation on a personal computer with an Intel(R) Core(TM) i9-14900HX processor (2.20 GHz) and 32 GB RAM. Finally, a network simulation in NS3 is given to verify the communication behavior. Simulation results show that the response latency of the proposed scheme is lower than that of several known ABE schemes in VANETs, and that its average message loss rate stays within an acceptable interval (much less than 5%) even in high-intensity communication scenarios.
Title: Post quantum secure and efficient data sharing scheme from attribute based encryption for VANETs over lattices
Authors: Fenghe Wang, Meijiao Wang, Junquan Wang, Mengqi Gu
Pub Date: 2026-05-01 | DOI: 10.1016/j.sysarc.2026.103718
Source: Journal of Systems Architecture, vol. 174, Article 103718
Pub Date: 2026-05-01 | Epub Date: 2026-02-03 | DOI: 10.1016/j.sysarc.2026.103732
Xuran Cai, Amir Kafshdar Goharshady, S. Hitarth, Chun Kit Lam
Control-flow graphs (CFGs) of structured programs are well known to exhibit strong sparsity properties. Traditionally, this sparsity has been modeled using graph parameters such as treewidth and pathwidth, enabling the development of faster parameterized algorithms for tasks in compiler optimization, model checking, and program analysis. However, these parameters only approximate the structural constraints of CFGs: although every structured CFG has treewidth at most 7, many graphs with treewidth at most 7 cannot arise as CFGs. As a result, existing parameterized techniques are optimized for a substantially broader class of graphs than those encountered in practice.
In this work, we introduce a new grammar-based decomposition framework that characterizes exactly the class of control-flow graphs generated by structured programs. Our decomposition is intuitive, mirrors the syntactic structure of programs, and remains fully compatible with the dynamic-programming paradigm of treewidth-based methods. Using this framework, we design improved algorithms for two classical compiler optimization problems: Register Allocation and Lifetime-Optimal Speculative Partial Redundancy Elimination (LOSPRE). Extensive experimental evaluation demonstrates significant performance improvements over previous state-of-the-art approaches, highlighting the benefits of using decompositions tailored specifically to CFGs.
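The intuition that a structured program's CFG is assembled by series, parallel, and loop compositions can be sketched with a toy builder. The four node types and the exact wiring below are assumptions for illustration and do not reproduce the paper's grammar or its decomposition. Sequencing is treated as series composition, branching as parallel composition, and `While` as a loop around its body.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Atom:
    name: str


@dataclass
class Seq:
    first: object
    second: object


@dataclass
class If:
    then: object
    orelse: object


@dataclass
class While:
    body: object


def build_cfg(program):
    """Return (entry, exit, edges) for a toy structured program."""
    ids = count()
    edges = []

    def go(node):
        if isinstance(node, Atom):
            s, t = next(ids), next(ids)
            edges.append((s, t))  # a basic block as a single edge
            return s, t
        if isinstance(node, Seq):  # series composition
            s1, t1 = go(node.first)
            s2, t2 = go(node.second)
            edges.append((t1, s2))
            return s1, t2
        if isinstance(node, If):  # parallel composition
            s, t = next(ids), next(ids)
            for branch in (node.then, node.orelse):
                bs, bt = go(branch)
                edges.extend([(s, bs), (bt, t)])
            return s, t
        if isinstance(node, While):  # loop composition
            s, t = next(ids), next(ids)
            bs, bt = go(node.body)
            edges.extend([(s, bs), (bt, s), (s, t)])  # enter, back edge, exit
            return s, t
        raise TypeError(f"unknown node type: {type(node).__name__}")

    entry, exit_ = go(program)
    return entry, exit_, edges


# Hypothetical program: while (...) { if (...) a else b }
entry, exit_, edges = build_cfg(While(If(Atom("a"), Atom("b"))))
```

Because each construct returns a single entry and a single exit, the recursion tree of `go` is itself a decomposition of the graph that follows the program's syntax, which is the property that dynamic programming over such a decomposition would rely on.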
Title: Series–parallel-loop decompositions of control-flow graphs
Source: Journal of Systems Architecture, vol. 174, Article 103732
Pub Date: 2026-05-01; Epub Date: 2026-02-07; DOI: 10.1016/j.sysarc.2026.103705
Guoyu Wang, Juncheng Hu, Chenju Pei, Tengfei Li, Kedi Lyu, Xilong Che
Non-volatile memory (NVM), as a byte-addressable persistent device, offers capacity, performance, and cost characteristics that lie between DRAM and disk. Its emergence adds a new tier to the computer memory architecture and introduces new design challenges. Among various hybrid storage designs, we find that the Sync-triggered Selective Absorption (SA) model is both high-performing and broadly applicable when aiming to fully leverage NVM’s characteristics in conjunction with DRAM and disk. However, this model introduces three key crash consistency challenges: the Write Ordering, Write Granularity, and Parallel Write problems.
In this paper, we define a consistency model based on synchronization semantics and conduct an in-depth analysis of the three consistency challenges faced by SA systems. We then present targeted solutions for each problem and formally verify our design using model checking. We hope our exploration and resolution of these issues will offer guidance for the design of efficient and robust hybrid storage systems.
{"title":"Crash consistency in an NVM-enabled hybrid storage system: Problems, solutions, and verification","authors":"Guoyu Wang, Juncheng Hu, Chenju Pei, Tengfei Li, Kedi Lyu, Xilong Che","doi":"10.1016/j.sysarc.2026.103705","journal":"Journal of Systems Architecture","volume":"174","pages":"Article 103705","publicationDate":"2026-05-01"}
Pub Date: 2026-05-01; Epub Date: 2026-01-24; DOI: 10.1016/j.sysarc.2026.103698
Rui Wang, Shichun Yang, Yuyi Chen, Zhuoyang Li, Jiayi Lu, Zexiang Tong, Jianyi Xu, Bin Sun, Xinjie Feng, Yaoguang Cao
Road terrain conditions are vital for ensuring the driving safety of autonomous vehicles (AVs). However, traditional sensors such as cameras and LiDARs are sensitive to changes in lighting and weather, posing challenges for real-time road condition perception. In this paper, we propose an illumination-aware visual–tactile fusion system (IVTF) for terrain perception, integrating visual and tactile data while optimizing the fusion process based on illumination characteristics. The system employs a camera and an intelligent tire to capture visual and tactile data across various lighting conditions and vehicle speeds. We also design a visual–tactile fusion module that dynamically adjusts the weights of the two modalities according to illumination features. Comparisons with single-modality perception methods show that visual–tactile fusion perceives road terrain more accurately under diverse lighting conditions. This approach significantly advances the robustness and reliability of terrain perception in AVs, contributing to enhanced driving safety.
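The idea of weighting modalities by illumination can be sketched as follows. This is a hand-written stand-in, not the IVTF fusion module: the abstract does not specify how the weights are computed, so the sigmoid gate, the `midpoint` and `steepness` parameters, and the function names `visual_weight` and `fuse` are all assumptions made for illustration.

```python
import math


def visual_weight(brightness, midpoint=0.3, steepness=10.0):
    """Map a normalized brightness in [0, 1] to a visual-modality weight.

    Assumed gate: a sigmoid that trusts the camera more as the scene
    gets brighter. midpoint and steepness are illustrative values.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (brightness - midpoint)))


def fuse(visual_probs, tactile_probs, brightness):
    """Blend per-class probabilities from the two modalities."""
    if len(visual_probs) != len(tactile_probs):
        raise ValueError("both modalities must score the same classes")
    w = visual_weight(brightness)
    fused = [w * v + (1.0 - w) * t for v, t in zip(visual_probs, tactile_probs)]
    total = sum(fused)
    return [p / total for p in fused]  # renormalize to guard against drift


# Two terrain classes; the camera and the tire sensor disagree.
visual = [0.6, 0.4]
tactile = [0.1, 0.9]
dark = fuse(visual, tactile, brightness=0.0)    # leans on tactile data
bright = fuse(visual, tactile, brightness=1.0)  # leans on visual data
```

In darkness the gate pushes nearly all weight onto the tactile prediction, while in bright conditions the visual prediction dominates. A learned module would replace the fixed sigmoid with weights predicted from illumination features.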
{"title":"A visual–tactile fusion system for terrain perception under varying illumination conditions","authors":"Rui Wang, Shichun Yang, Yuyi Chen, Zhuoyang Li, Jiayi Lu, Zexiang Tong, Jianyi Xu, Bin Sun, Xinjie Feng, Yaoguang Cao","doi":"10.1016/j.sysarc.2026.103698","journal":"Journal of Systems Architecture","volume":"174","pages":"Article 103698","publicationDate":"2026-05-01"}