
Latest Publications: Future Generation Computer Systems-The International Journal of Escience

On-device explainable artificial intelligence for the semantic web of everything
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-31 | DOI: 10.1016/j.future.2025.108310
Davide Loconte, Saverio Ieva, Grazia Mascellaro, Agnese Pinto, Giuseppe Loseto, Floriano Scioscia, Michele Ruta
As the Internet of Things (IoT) evolves into an Internet of Everything (IoE), adapting Artificial Intelligence (AI) and Machine Learning (ML) approaches to pervasive computing devices is not enough. Collaborative intelligence is required, calling for on-device AI frameworks combining adequate accuracy and computational efficiency levels with incremental learning on continuous data streams, federated learning in distributed architectures and symbolic explainability formalisms to foster trustworthiness with interpretable trained models and comprehensible prediction outcomes. To fill this gap, the paper introduces a five-star rating for on-device AI based on the Semantic Web of Everything (SWoE) paradigm, and presents the five-star Mafalda 2.0 framework. It combines statistical data processing with Knowledge Graph technologies for information representation and automated reasoning to support: semi-automatic or fully data-driven ontology definition; on-device training to generate highly interpretable semantics-based models; prediction framed as a semantic matchmaking problem, exploiting non-standard reasoning services endowed with logic-based justifications to provide comprehensible results as well as counterfactual and contrastive explanations. An experimental campaign on four publicly available datasets has been carried out to validate the efficiency and accuracy of the proposal, along with federated learning and explainability examples.
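The prediction step described above (matchmaking an instance against semantics-based class models, with a logic-style justification) can be illustrated with a toy sketch. Everything here, the feature names, the overlap score, and the "unmatched features" justification, is a hypothetical simplification, not the Mafalda 2.0 reasoning services.

```python
# Toy sketch of prediction as semantic matchmaking over feature sets.
# Class models, feature names, and the scoring rule are illustrative only.

def matchmake(instance_features: set, class_models: dict) -> tuple:
    """Rank class models by overlap with the instance; the uncovered
    features of the best model serve as a crude logic-style justification."""
    best_label, best_score, best_missing = None, -1.0, set()
    for label, model_features in class_models.items():
        overlap = instance_features & model_features
        score = len(overlap) / max(len(model_features), 1)
        if score > best_score:
            best_label, best_score = label, score
            best_missing = model_features - instance_features  # why not a perfect match
    return best_label, best_score, best_missing

class_models = {
    "walking": {"low_accel_variance", "periodic_gait", "upright_posture"},
    "running": {"high_accel_variance", "periodic_gait", "upright_posture"},
}
label, score, missing = matchmake(
    {"periodic_gait", "upright_posture", "low_accel_variance"}, class_models)
print(label, round(score, 2), "unmatched:", missing)
```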
Citations: 0
Knowledge distillation-based Multi-Optimization intrusion detection system
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-31 | DOI: 10.1016/j.future.2025.108296
Haofan Wang, Farah Kandah
Network attacks have expanded in scope, increased in frequency, and evolved in many ways in recent years. Internet of Things (IoT) devices, due to their limited computational resources, massive deployment, direct exposure to the public Internet, and lack of maintenance, face even more severe threat landscapes. Nowadays, numerous lightweight methods have been proposed, but they all rely on single-perspective optimizations, making it difficult to achieve an optimal balance between performance and computational resource consumption. In this work, we propose a Knowledge Distillation-based Multi-Optimization Intrusion Detection System (KDMO-IDS) that reduces resource consumption at the feature, sample, and model levels. At the feature level, we compute the Analysis of Variance (ANOVA) F-value for each feature to rank them and determine the optimal subset. At the sample level, we use MiniBatchKMeans with Medoid clustering to compress data under preset ratios. At the model level, we combine knowledge distillation with attention transfer so that a compact student model retains the performance of its teacher, further optimized by block operator fusion, pruning, and early stopping. We conduct extensive ablation studies to validate the contribution of each component. Experiments on WUSTL-IIoT and X-IIoTID datasets show that our proposed KDMO-IDS demonstrates superior performance and exhibits strong lightweight characteristics and generalizability compared to existing baseline models, making it well-suited for seamless integration into edge-cloud and distributed computing environments and providing a scalable security solution for next-generation high-performance IoT systems.
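The feature- and sample-level steps map directly onto standard scikit-learn primitives. A hedged sketch follows, with synthetic data and placeholder values for k and the compression ratio; the paper's exact settings are not reproduced here.

```python
# Sketch of two of KDMO-IDS's optimization levels with scikit-learn:
# ANOVA F-value feature ranking and MiniBatchKMeans sample compression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.cluster import MiniBatchKMeans

X, y = make_classification(n_samples=2000, n_features=40, n_informative=12,
                           random_state=0)

# Feature level: rank features by ANOVA F-value and keep the top subset.
selector = SelectKBest(score_func=f_classif, k=12).fit(X, y)
X_sel = selector.transform(X)

# Sample level: compress each class to a preset ratio by clustering and
# keeping the sample closest to each centroid (a medoid-style representative).
ratio = 0.1
kept = []
for cls in np.unique(y):
    Xc = X_sel[y == cls]
    k = max(1, int(len(Xc) * ratio))
    km = MiniBatchKMeans(n_clusters=k, n_init=3, random_state=0).fit(Xc)
    for center in km.cluster_centers_:
        kept.append(Xc[np.argmin(np.linalg.norm(Xc - center, axis=1))])
print("compressed:", len(kept), "of", len(X_sel), "samples")
```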
Citations: 0
Semi-asynchronous energy-efficient federated prototype learning for end-edge-cloud architectures
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-31 | DOI: 10.1016/j.future.2025.108351
Wendian Luo, Tong Yu, Shengxin Dai, Bing Guo, Xuesen Lin, Yanglin Pu
As Industry 5.0 advances rapidly, the Industrial Internet of Things (IIoT) integrates Artificial Intelligence (AI) technology to significantly enhance the intelligence of production processes. However, this advancement results in faster data generation and a higher demand for data processing in industrial scenarios. This leads to the sustained high-load operation of edge devices and cloud servers, which increases carbon emissions and raises concerns about data security. Implementing Federated Learning (FL) in IIoT frameworks effectively distributes the computational burden between the client and server, resolving privacy issues and enhancing energy efficiency. However, achieving energy efficiency while improving model performance is challenging within an IIoT system marked by heterogeneous AI models and imbalanced data. We present a semi-asynchronous, energy-efficient, federated prototype learning approach tailored to tackle these challenges with end-edge-cloud architectures. This method uploads data distribution instead of the raw data for privacy protection, employing Dynamic Voltage and Frequency Scaling (DVFS) technology to manage power consumption during training, thus achieving optimal energy efficiency. To boost model performance, we confront data imbalance by collecting feature distribution data from clients, generating virtual samples on the cloud server, and training a global classifier to promote local client learning. Our experiments across various datasets, including industrial datasets and large-scale heterogeneous scenarios, demonstrate that the proposed method enhances model accuracy and significantly reduces energy consumption compared to competitive methods, thereby validating its applicability in real-world, diverse environments.
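A minimal numpy sketch of the prototype idea: clients upload per-class feature means and counts rather than raw data, and the server forms count-weighted global prototypes. The shapes, aggregation rule, and client data below are illustrative assumptions, not the paper's protocol.

```python
# Clients share class prototypes (feature means + counts) instead of raw data;
# the server aggregates them into count-weighted global prototypes.
import numpy as np

def client_prototypes(features: np.ndarray, labels: np.ndarray) -> dict:
    """Return {class: (mean_feature_vector, sample_count)} for one client."""
    return {c: (features[labels == c].mean(axis=0), int((labels == c).sum()))
            for c in np.unique(labels)}

def aggregate(all_protos: list) -> dict:
    """Server side: count-weighted average of client prototypes per class."""
    sums = {}
    for protos in all_protos:
        for c, (mean, n) in protos.items():
            s, total = sums.get(c, (np.zeros_like(mean), 0))
            sums[c] = (s + mean * n, total + n)
    return {c: s / n for c, (s, n) in sums.items()}

rng = np.random.default_rng(0)
clients = [client_prototypes(rng.normal(size=(50, 8)),
                             rng.integers(0, 3, size=50)) for _ in range(4)]
print({c: p.shape for c, p in aggregate(clients).items()})
```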
Citations: 0
Cost-efficient and topology-aware scheduling algorithms in distributed stream computing systems
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-30 | DOI: 10.1016/j.future.2025.108340
Hongjian Li, Shuheng Wang, Gangfan Tan, Xiaolin Duan
With the rapid growth of data volume and increasing real-time processing requirements, stream processing systems face challenges of execution inefficiency and excessive resource consumption. Apache Storm employs a simplistic round-robin scheduling strategy by default, neglecting node heterogeneity, task topology, and varying traffic patterns, leading to performance degradation and resource wastage. To address these limitations, this paper proposes two novel scheduling strategies: a resource-cost and topology-aware distributed method (MMO-Stream) and a resource-aware cooperative strategy (D-Storm). MMO-Stream integrates a cost-effective Quality-of-Service (QoS) model with a meta-heuristic-based multi-criteria optimization algorithm to optimize resource consumption, latency, and throughput simultaneously. D-Storm utilizes historical performance data and resource-awareness mechanisms to dynamically optimize task reallocation, mitigating performance deterioration from frequent rescheduling. Experimental results show MMO-Stream achieves cost-effective QoS (C-QoS) improvements of 41.7% and 39.5%, and latency reductions of 23.9% and 15.8%, compared to Storm’s default scheduling and Ts-Stream, respectively. D-Storm reduces latency by 23.9% and 37.5% compared to default and Ts-Stream strategies, significantly outperforming MMO-Stream. The proposed methods effectively enhance Storm’s scheduling performance and resource efficiency.
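To make the topology-aware intuition concrete, here is a toy greedy placement sketch (not the MMO-Stream meta-heuristic): tasks connected by heavy streams are co-located when node capacity allows, trading compute load against network cost. All demands, capacities, and weights are invented inputs.

```python
# Greedy topology-aware placement: visit the heaviest communication edges
# first and prefer placing a task next to its peer to avoid network cost.
def place(tasks, edges, nodes, net_cost=1.0):
    """tasks: {task: cpu_demand}; edges: {(t1, t2): traffic};
    nodes: {node: capacity}. Returns a task -> node assignment."""
    assign, load = {}, {n: 0.0 for n in nodes}
    for (a, b), traffic in sorted(edges.items(), key=lambda kv: -kv[1]):
        for t in (a, b):
            if t in assign:
                continue
            peer = b if t == a else a
            # Prefer the peer's node (zero network cost), then least-loaded.
            candidates = sorted(nodes, key=lambda n: (
                (0.0 if assign.get(peer) == n else net_cost * traffic)
                + load[n]))
            for n in candidates:
                if load[n] + tasks[t] <= nodes[n]:
                    assign[t], load[n] = n, load[n] + tasks[t]
                    break
    return assign

print(place({"spout": 1, "bolt1": 2, "bolt2": 1},
            {("spout", "bolt1"): 10, ("bolt1", "bolt2"): 3},
            {"n1": 3, "n2": 3}))
```

With these inputs the heavy spout-to-bolt1 stream lands on one node and bolt2 spills to the other, which is the behavior Storm's default round-robin scheduler does not attempt.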
Citations: 0
Explainable AI-guided test-time adversarial defense for resilient YOLO detectors in Industrial Internet of Things
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-30 | DOI: 10.1016/j.future.2025.108356
Ruinan Ma, Zuobin Ying, Wenjuan Li, Dehua Zhu, Wanlei Zhou, Yu-An Tan, Hongyi Liu
With deep learning-based object detectors widely deployed as visual components in Industrial Internet of Things (IIoT) devices like cameras, their adversarial robustness has become paramount to the security and resilience of hyperconnected industrial systems. Existing adversarial defenses are often inadequate for the complexities of object detection, and securing already deployed detectors with a lightweight defense that avoids costly retraining remains a major challenge. In this paper, we propose XAIAD-YOLO: Explainable AI-Guided Adversarial Defense for YOLO detectors, a novel test-time defense to enable resilient YOLO detectors. XAIAD-YOLO introduces a synergistic two-stage purification framework grounded in distinct theoretical principles. Its initial stage, based on signal processing principles, filters high-frequency adversarial noise from genuine image structures. The second stage performs targeted feature destabilization; guided by our efficient XAI saliency map and grounded in the principle of differential feature stability, it precisely neutralizes fragile adversarial artifacts. Experiments show that our XAI method achieves 66.08 FPS (1.56x faster than Grad-CAM++), and our defense method significantly improves adversarial robustness, making anchor-based, anchor-free, lightweight, and non-lightweight YOLO detectors more resilient in both white-box and black-box scenarios. By uniquely integrating explainability into the defense mechanism, XAIAD-YOLO provides a practical and effective solution for enhancing the resilience and trustworthiness of AI in critical industrial applications. Our source code and datasets are available at https://anonymous.4open.science/r/XAIAD-YOLO-B0A3/.
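The first purification stage, filtering high-frequency adversarial noise, can be sketched with an FFT low-pass filter. The cutoff is an arbitrary assumption, and the XAI-guided second stage is not reproduced here.

```python
# Suppress high-frequency image components with an FFT low-pass filter,
# on the assumption that adversarial perturbations concentrate there.
import numpy as np

def lowpass_purify(img: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """Zero out spatial frequencies outside a centered square window."""
    f = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    h, w = img.shape[:2]
    mask = np.zeros((h, w), dtype=bool)
    ch, cw = int(h * keep_ratio / 2), int(w * keep_ratio / 2)
    mask[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw] = True
    f *= mask[..., None] if img.ndim == 3 else mask
    out = np.fft.ifft2(np.fft.ifftshift(f, axes=(0, 1)), axes=(0, 1)).real
    return np.clip(out, 0.0, 1.0)

noisy = np.clip(np.random.rand(64, 64, 3) * 0.1 + 0.5, 0, 1)  # stand-in image
print(lowpass_purify(noisy).shape)
```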
Citations: 0
Automatic tuning based on hardware performance counters and machine learning
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-30 | DOI: 10.1016/j.future.2025.108358
Suren Harutyunyan Gevorgyan, Eduardo César, Anna Sikora, Jiří Filipovič, Jordi Alcaraz
This paper presents a Machine Learning (ML) methodology for automatically tuning parallel applications in heterogeneous High Performance Computing (HPC) environments using Hardware Performance Counters (HwPCs). The methodology addresses three critical challenges: counter quantity versus accessibility tradeoff, data interpretation complexity, and dynamic optimization needs. The introduced ensemble-based methodology automatically identifies minimal yet informative HwPC sets for code region identification and tuning parameter optimization. Experimental validation demonstrates high accuracy in predicting optimal thread allocation (>0.90 K-fold accuracy) and thread affinity (>0.95 accuracy) while requiring only 4–6 HwPCs. Compared to search-based methods like OpenTuner, the methodology achieves competitive performance with dramatically reduced optimization time. The architecture-agnostic design enables consistent performance across CPU and GPU platforms. These results establish a foundation for efficient, portable, automatic, and scalable tuning of parallel applications.
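The general recipe here, learning a mapping from a few counter readings to a tuning decision, can be sketched with an off-the-shelf classifier. The counter features, synthetic labels, and model choice below are placeholders, not the paper's ensemble.

```python
# Predict a thread-count choice from a handful of hardware-counter features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Columns stand in for e.g. cache-miss rate, branch misses, IPC, mem stalls.
X = rng.random((600, 4))
# Toy labeling rule: memory-bound regions (high col 3) prefer fewer threads.
y = np.where(X[:, 3] > 0.6, 8, np.where(X[:, 0] > 0.5, 16, 32))

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("k-fold accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(3))
clf.fit(X, y)
print("predicted threads:", clf.predict([[0.2, 0.4, 0.7, 0.8]])[0])
```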
Citations: 0
Resource-Efficient joint clustering and storage optimization for blockchain-Based IoT systems
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-30 | DOI: 10.1016/j.future.2025.108354
Kai Peng, Xueyan Hu, Jiaxing Hu, Zhiheng Yao, Tianping Deng, Menglan Hu, Chao Cai, Zehui Xiong
Blockchain technology is leveraged in the Internet of Things (IoT) systems to enhance data reliability and management efficiency, ensuring integrity, security, and auditability through a decentralized ledger architecture. However, resource-constrained IoT devices are unable to store the complete blockchain due to prohibitive resource consumption and performance degradation. While collaborative storage strategies have been proposed to mitigate these constraints, existing approaches often prioritize storage scalability without sufficiently addressing the selection of cooperative nodes for distributed ledger maintenance. This can lead to significant communication delays during block retrieval, undermining the real-time performance and overall efficiency of the blockchain-enabled IoT system. To address this challenge, this paper introduces a clustering-based collaborative storage scheme and proposes a novel joint optimization algorithm that iteratively refines both node clustering and block allocation strategies within the blockchain network. By structuring IoT devices into clustered peers, the algorithm reduces block query latency and facilitates efficient blockchain synchronization and update processes. Experimental evaluations confirm that the proposed method effectively alleviates storage limitations and lowers access costs in static blockchain-based IoT environments.
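A toy sketch of the clustered-storage idea: each cluster collectively stores the full chain, with blocks striped across its members, so a block query resolves inside the querying node's cluster. Cluster membership and block IDs are fabricated; the paper's joint optimization of clustering and allocation is not reproduced.

```python
# Stripe blocks over cluster members so every cluster holds the whole chain,
# then resolve block lookups within the querying node's cluster.
def allocate(clusters: dict, num_blocks: int) -> dict:
    """clusters: {cluster_id: [node, ...]}. Returns node -> set(block ids)."""
    placement = {n: set() for nodes in clusters.values() for n in nodes}
    for nodes in clusters.values():
        for b in range(num_blocks):
            placement[nodes[b % len(nodes)]].add(b)
    return placement

def locate(clusters, placement, node, block):
    """Find a peer in `node`'s cluster holding `block` (local copy first)."""
    cluster = next(ns for ns in clusters.values() if node in ns)
    if block in placement[node]:
        return node
    return next(p for p in cluster if block in placement[p])

clusters = {"c1": ["a", "b", "c"], "c2": ["d", "e"]}
placement = allocate(clusters, num_blocks=10)
print(locate(clusters, placement, "a", 7))  # served within cluster c1
```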
Citations: 0
TPQA: Efficient attention architecture with task-aware pattern-guided quantization
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-29 | DOI: 10.1016/j.future.2025.108352
Sijia Wang, Shengbing Zhang, Lun Zhang, Yichao Yuan, Yawen Zhao, Xinyu Zhang, Meng Zhang
Attention mechanisms have become a cornerstone of modern deep learning models, yet their computational intensity poses significant deployment challenges for resource-limited devices. While quantization offers a potential solution, current approaches typically employ uniform precision assignment schemes across all attention heads, neglecting critical variations in head-specific contributions across different tasks. This oversight results in substantial computational redundancy for those attention heads with fewer contributions, impacting overall performance. Through systematic analysis of head pattern characteristics in transformer models, we reveal two key insights: different attention heads exhibit distinct task-aware patterns, and their varying contributions to model performance directly dictate differentiated quantization demands across heads. Building on these findings, we propose TPQA, a novel algorithm and accelerator co-design architecture for efficient deployment of transformer models. TPQA strategically assigns adaptive precision levels to each head based on pre-identified patterns, thereby reducing computational overhead while preserving model accuracy. Furthermore, TPQA employs a data reordering strategy to transform irregular workloads into structured formats and introduces a dedicated accelerator with an attention-weights-stationary dataflow to efficiently process these structured workloads. Comprehensive evaluations demonstrate TPQA's superior performance, achieving up to 2.1× speedup and 3.4× energy-efficiency improvement over state-of-the-art accelerators while maintaining <1% accuracy loss on various tasks.
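The core idea, giving more important heads more bits, can be sketched with symmetric uniform quantization in numpy. The importance scores, bit menu, and thresholds below are assumptions; TPQA derives them from task-aware head patterns.

```python
# Assign per-head bit-widths from importance scores, then quantize each
# head's weights with symmetric uniform quantization.
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization to `bits`, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def per_head_bits(importance: np.ndarray) -> list:
    """Map head importance to a small bit-width menu."""
    return [8 if s > 0.66 else 6 if s > 0.33 else 4 for s in importance]

rng = np.random.default_rng(0)
heads = rng.normal(size=(12, 64, 64))   # 12 heads of stand-in weights
importance = rng.random(12)             # stand-in contribution scores
for h, b in zip(range(len(heads)), per_head_bits(importance)):
    err = np.abs(heads[h] - quantize(heads[h], b)).mean()
    print(f"head {h}: {b}-bit, mean abs error {err:.4f}")
```

Running the sketch shows the expected trade: low-importance heads take the larger 4-bit error while high-importance heads stay near full precision.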
Citations: 0
Software aging issues and rejuvenation strategies for a container orchestration system
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-28 | DOI: 10.1016/j.future.2025.108274
Marcelo Santos, Rubens Matos, Marco Vieira, Jean Araujo
Software Aging and Rejuvenation (SAR) has been extensively studied due to its critical role in ensuring the reliable operation of systems. Although container orchestration is essential for efficiently managing and scaling cloud resources, the impact of SAR is not yet fully understood. This paper presents experiments conducted on two versions of Ubuntu Linux, simulating the operational scenarios of a private cloud. Each cluster includes one Main node and three Worker nodes, utilizing Containerd as the container runtime and Kubernetes as the orchestrator, across four distinct scenarios. The primary experimental conditions were maintained across all scenarios, including configurations, workloads, and test duration. Throughout each experiment, metrics such as CPU utilization, memory usage and disk utilization were monitored, considering system-wide values and observations for the Containerd and Kubelet services. The experiments also included measuring the response time of a web server for external HTTP requests submitted to the clusters. The initial scenario focused on investigating the effects of software aging, while subsequent scenarios explored the adoption of different rejuvenation strategies. Effects of software aging were observed across all scenarios, with resource leaks identified, particularly in memory usage, even when the cluster was under no load. The issues observed can lead to performance degradation and compromise reliability and availability if the system crashes due to memory exhaustion.
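A threshold-based rejuvenation trigger of the kind these results motivate can be sketched with psutil (assumed available). The monitored service name, memory limit, and restart action are placeholders, not the paper's setup.

```python
# Watch a service's resident memory and flag rejuvenation when it leaks past
# a limit; the restart action itself is left as a placeholder.
import time
import psutil

RSS_LIMIT_MB = 512      # assumed leak threshold
CHECK_INTERVAL_S = 60

def find_proc(name: str):
    """Return the first process whose name matches, or None."""
    for p in psutil.process_iter(["name"]):
        if p.info["name"] == name:
            return p
    return None

def monitor(name: str) -> None:
    while True:
        p = find_proc(name)
        if p is not None and p.memory_info().rss > RSS_LIMIT_MB * 2**20:
            print(f"{name}: RSS above limit, triggering rejuvenation")
            # Placeholder: e.g. subprocess.run(["systemctl", "restart", name])
        time.sleep(CHECK_INTERVAL_S)

# monitor("containerd")  # uncomment to run against a real node
```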
Citations: 0
HERCULES: A scalable and elastic ad-hoc file system for large-scale computing systems
IF 6.2 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-28 | DOI: 10.1016/j.future.2025.108350
Genaro Sánchez-Gallegos, Cosmin Petre, Javier Garcia-Blas, Jesus Carretero
The increasing demand for data processing by new, data-intensive applications is placing significant strain on the performance and capacity of HPC storage systems. Advancements in storage technologies, such as NVMe and persistent memory, have been introduced to address these demands. However, relying exclusively on ultra-fast storage devices is not cost-effective, necessitating multi-tier storage hierarchies to manage data based on its usage. In response, ad-hoc file systems have been proposed as a solution. These systems use the storage resources available in compute nodes, including memory and persistent storage, to create temporary file systems that adapt to application behavior in the HPC environment. This work presents the design, implementation, and evaluation of HERCULES, a distributed ad-hoc in-memory storage system, with a focus on its new metadata and elasticity model. HERCULES takes advantage of the Unified Communication X (UCX) framework, leveraging RDMA protocols such as Infiniband, Omnipath, shared-memory, and zero-copy transfers for data transfer. It includes elasticity features at runtime and fault-tolerant facilities. The elasticity features, together with flexible policies for data allocation, allow HERCULES to migrate data so that the available resources can be efficiently used. Our exhaustive evaluation results demonstrate a better performance than Lustre and BeeGFS, two parallel file systems heavily used in High-Performance Computing systems, and GekkoFS, an ad-hoc state-of-the-art solution.
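One way to picture an elasticity-friendly placement policy is consistent hashing, where adding a storage node relocates only a small fraction of data. This is a generic illustration of the idea, not HERCULES' actual allocation policies.

```python
# Consistent-hash ring with virtual nodes: scaling out moves few keys.
import bisect
import hashlib

class Ring:
    def __init__(self, nodes, vnodes=64):
        self._ring = sorted((self._h(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _h(s: str) -> int:
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        i = bisect.bisect(self._keys, self._h(key)) % len(self._keys)
        return self._ring[i][1]

before = Ring(["srv0", "srv1", "srv2"])
after = Ring(["srv0", "srv1", "srv2", "srv3"])  # elastic scale-out
moved = sum(before.node_for(f"blk{i}") != after.node_for(f"blk{i}")
            for i in range(1000))
print(f"{moved / 10:.1f}% of blocks relocated after adding one node")
```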
Citations: 0