Title: Underwater Mediterranean image analysis based on the compute continuum paradigm
Pub Date: 2024-08-12  DOI: 10.1016/j.future.2024.107481
Human activity depends on the oceans for food, transportation, leisure, and many other purposes. Oceans cover 70% of the Earth's surface, yet most of that area remains unexplored, which is why underwater imaging is a valuable resource for Marine Science. Images are acquired by observing systems, e.g. autonomous underwater vehicles or underwater observatories, that presently transmit all the raw data to land stations. However, transferring such volumes of data can be challenging given the limited power supply and transmission bandwidth of these systems. In this paper, we discuss these aspects, and in particular how Edge and Cloud computing can be coupled to manage the full processing pipeline effectively according to the Compute Continuum paradigm.
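A minimal sketch of how such an edge-side gate might look (all names, thresholds, and cost figures below are illustrative assumptions, not the paper's pipeline): frames are scored on the device and only promising ones are uploaded, subject to the remaining energy and bandwidth budget.

```python
# Hypothetical edge-side policy: pre-filter frames on the device and upload only
# those likely to contain objects of interest, subject to energy/bandwidth budgets.
from dataclasses import dataclass

@dataclass
class EdgeBudget:
    battery_wh: float        # remaining energy on the observing system
    uplink_mbps: float       # available acoustic/satellite bandwidth

def relevance_score(frame) -> float:
    # Stand-in for a lightweight on-device model; here just normalised mean intensity.
    return sum(frame) / (255.0 * len(frame))

def route_frame(frame, frame_mb: float, budget: EdgeBudget,
                tx_wh_per_mb: float = 0.05, threshold: float = 0.6) -> str:
    score = relevance_score(frame)
    tx_cost = frame_mb * tx_wh_per_mb
    if score >= threshold and budget.battery_wh > tx_cost and budget.uplink_mbps > 0:
        budget.battery_wh -= tx_cost
        return "upload_to_cloud"        # full-resolution analysis on shore
    if score >= threshold:
        return "store_locally"          # keep for later bulk transfer
    return "discard_or_summarise"       # only metadata leaves the device

frame = [180] * 1_000                   # fake 1000-pixel grayscale frame
print(route_frame(frame, frame_mb=2.0, budget=EdgeBudget(battery_wh=50.0, uplink_mbps=0.1)))
```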
{"title":"Underwater Mediterranean image analysis based on the compute continuum paradigm","authors":"","doi":"10.1016/j.future.2024.107481","DOIUrl":"10.1016/j.future.2024.107481","url":null,"abstract":"<div><p>Human activity depends on the oceans for food, transportation, leisure, and many more purposes. Oceans cover 70% of the Earth’s surface, but most of them are unknown to humankind. This is the reason why underwater imaging is a valuable resource asset to Marine Science. Images are acquired with observing systems, e.g. autonomous underwater vehicles or underwater observatories, that presently transmit all the raw data to land stations. However, the transfer of such an amount of data could be challenging, considering the limited power supply and transmission bandwidth of these systems. In this paper, we discuss these aspects, and in particular how it is possible to couple Edge and Cloud computing for effective management of the full processing pipeline according to the Compute Continuum paradigm.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X2400431X/pdfft?md5=604d28ee54ea8468beac4eeba5484fd0&pid=1-s2.0-S0167739X2400431X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142020484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Certificateless Proxy Re-encryption with Cryptographic Reverse Firewalls for Secure Cloud Data Sharing
Pub Date: 2024-08-10  DOI: 10.1016/j.future.2024.08.002
Cloud computing has made data sharing more convenient than ever before. However, data security is a major concern that prevents cloud computing from being widely adopted. A potential solution for secure data sharing in cloud computing is proxy re-encryption (PRE), which allows a proxy to transform encrypted data from one key to another without accessing the plaintext. When using PRE, various challenges arise, including information leakage by a trusted third party, collusion attacks, and issues associated with revocation. To overcome these challenges, this paper proposes a novel Certificateless Proxy Re-encryption with Cryptographic Reverse Firewalls scheme for Secure Cloud Data Sharing (CLPRE-CRF). The new scheme enables secure distribution of encrypted data from a data owner to users through public clouds. Meanwhile, the CLPRE-CRF scheme can resist exfiltration of secret information and forgery of ciphertexts in case the scheme is compromised. In addition, the scheme provides a flexible revocation mechanism to prevent unauthorized access to private data. The security analysis demonstrates that CLPRE-CRF resists chosen-plaintext and collusion attacks. Moreover, the performance evaluation indicates that our scheme achieves 14% and 22% reductions in the computation costs of the encryption and decryption algorithms, respectively. Therefore, the proposed CLPRE-CRF scheme is well-suited for cloud computing environments.
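For readers unfamiliar with proxy re-encryption, the toy sketch below illustrates only the basic primitive, using the classic BBS98-style ElGamal construction; it is not the certificateless CLPRE-CRF scheme and has none of its reverse-firewall or revocation features, and the group parameters are deliberately tiny.

```python
# Toy BBS98-style ElGamal proxy re-encryption, for illustration only. NOT the CLPRE-CRF
# scheme (no certificateless keys, reverse firewalls, or revocation); parameters are
# far too small to be secure.
import secrets
from math import gcd

p = 2**127 - 1          # a known prime (toy parameter)
g = 3

def keygen():
    while True:
        x = secrets.randbelow(p - 2) + 1
        if gcd(x, p - 1) == 1:                      # x must be invertible mod p-1
            return x, pow(g, x, p)                  # (secret key, public key)

def encrypt(pk, m):
    r = secrets.randbelow(p - 2) + 1
    return (m * pow(g, r, p)) % p, pow(pk, r, p)    # (c1, c2) = (m*g^r, pk^r)

def rekey(sk_a, sk_b):
    return (sk_b * pow(sk_a, -1, p - 1)) % (p - 1)  # rk = b * a^(-1) mod (p-1)

def reencrypt(rk, ct):                              # proxy never sees the plaintext
    c1, c2 = ct
    return c1, pow(c2, rk, p)                       # g^(a*r) -> g^(b*r)

def decrypt(sk, ct):
    c1, c2 = ct
    shared = pow(c2, pow(sk, -1, p - 1), p)         # recover g^r
    return (c1 * pow(shared, -1, p)) % p

sk_a, pk_a = keygen()                               # data owner A
sk_b, pk_b = keygen()                               # recipient B
ct_a = encrypt(pk_a, 42)
ct_b = reencrypt(rekey(sk_a, sk_b), ct_a)           # cloud proxy transforms A's ciphertext
assert decrypt(sk_b, ct_b) == 42
```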
{"title":"Certificateless Proxy Re-encryption with Cryptographic Reverse Firewalls for Secure Cloud Data Sharing","authors":"","doi":"10.1016/j.future.2024.08.002","DOIUrl":"10.1016/j.future.2024.08.002","url":null,"abstract":"<div><p>Cloud computing has enabled data-sharing to be more convenient than ever before. However, data security is a major concern that prevents cloud computing from being widely adopted. A potential solution to secure data-sharing in cloud computing is proxy re-encryption (PRE), which allows a proxy to transform encrypted data from one key to another without accessing the plaintext. When using PRE, various challenges arise, including the leak of information by a trusted third party, collusion attacks, and issues associated with revocation. To overcome these challenges, this paper proposes a novel Certificateless Proxy Reencryption with Cryptographic Reverse Firewall for Secure Cloud Data Sharing (CLPRE-CRF). The new scheme enables secure distribution of encrypted data from a data owner to users through public clouds. Meanwhile, the CLPRE-CRF scheme can resist exfiltration of secret information and forgery of ciphertext in case the scheme is compromised. In addition, the scheme provides a flexible revocation mechanism to prevent unauthorized access to private data. The security analysis demonstrates that the CLPRE-CRF resists chosen-plaintext attacks and collusion attacks. Moreover, performance evaluation indicates that our scheme achieves a 14% and 22% reduction in computation costs during the encryption and decryption algorithms, respectively. Therefore, the proposed CLPRE-CRF scheme is well-suited for cloud computing environments.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Network-aware federated neural architecture search
Pub Date: 2024-08-08  DOI: 10.1016/j.future.2024.07.053
The cooperation between Deep Learning (DL) and edge devices has further advanced technological developments, allowing smart devices to serve as both data sources and endpoints for DL-powered applications. However, the success of DL relies on optimal Deep Neural Network (DNN) architectures, and manually developing such systems requires extensive expertise and time. Neural Architecture Search (NAS) has emerged to automate the search for the best-performing neural architectures. Meanwhile, Federated Learning (FL) addresses data privacy concerns by enabling collaborative model development without exchanging the private data of clients.
In an FL system, network limitations can lead to biased model training, slower convergence, and increased communication overhead. On the other hand, traditional DNN architecture design, with its emphasis on validation accuracy, often overlooks the computational efficiency and size constraints of edge devices. This research aims to develop a comprehensive framework that balances the trade-off between model performance and communication efficiency while incorporating FL into an iterative NAS algorithm. The framework addresses the specific requirements of FL, optimizes DNNs through NAS, and ensures computational efficiency while accounting for the network constraints of edge devices.
To address these challenges, we introduce Network-Aware Federated Neural Architecture Search (NAFNAS), an open-source federated neural network pruning framework with network emulation support. Through comprehensive testing, we demonstrate the feasibility of our approach, efficiently reducing DNN size and mitigating communication challenges. Additionally, we propose Network and Distribution Aware Client Grouping (NetDAG), a novel client grouping algorithm tailored for FL with diverse DNN architectures, which considerably enhances the efficiency of communication rounds and the balance of updates.
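As a rough illustration of the kind of mechanism involved (not NAFNAS itself), the numpy sketch below combines magnitude-based pruning on each client with a mask-aware federated average on the server; the model, data, and hyperparameters are synthetic.

```python
# Minimal sketch (not NAFNAS): clients prune their local model by weight magnitude
# before uploading, and the server averages only the weights each client kept.
import numpy as np

def local_update(global_w, X, y, lr=0.1, sparsity=0.5):
    w = global_w.copy()
    grad = X.T @ (X @ w - y) / len(y)             # one gradient step of linear regression
    w -= lr * grad
    k = int(len(w) * sparsity)                    # zero out the k smallest-magnitude weights
    mask = np.ones_like(w)
    mask[np.argsort(np.abs(w))[:k]] = 0.0
    return w * mask, mask

def aggregate(updates):
    ws = np.stack([w for w, _ in updates])
    masks = np.stack([m for _, m in updates])
    counts = np.maximum(masks.sum(axis=0), 1.0)   # avoid division by zero
    return ws.sum(axis=0) / counts                # average over clients that kept each weight

rng = np.random.default_rng(0)
d, global_w = 8, np.zeros(8)
clients = [(rng.normal(size=(32, d)), rng.normal(size=32)) for _ in range(4)]
for _ in range(10):                               # a few federated rounds
    global_w = aggregate([local_update(global_w, X, y) for X, y in clients])
```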
{"title":"Network-aware federated neural architecture search","authors":"","doi":"10.1016/j.future.2024.07.053","DOIUrl":"10.1016/j.future.2024.07.053","url":null,"abstract":"<div><p>The cooperation between Deep Learning (DL) and edge devices has further advanced technological developments, allowing smart devices to serve as both data sources and endpoints for DL-powered applications. However, the success of DL relies on optimal Deep Neural Network (DNN) architectures, and manually developing such systems requires extensive expertise and time. Neural Architecture Search (NAS) has emerged to automate the search for the best-performing neural architectures. Meanwhile, Federated Learning (FL) addresses data privacy concerns by enabling collaborative model development without exchanging the private data of clients.</p><p>In a FL system, network limitations can lead to biased model training, slower convergence, and increased communication overhead. On the other hand, traditional DNN architecture design, emphasizing validation accuracy, often overlooks computational efficiency and size constraints of edge devices. This research aims to develop a comprehensive framework that effectively balances trade-offs between model performance, communication efficiency, and the incorporation of FL into an iterative NAS algorithm. This framework aims to overcome challenges by addressing the specific requirements of FL, optimizing DNNs through NAS, and ensuring computational efficiency while considering the network constraints of edge devices.</p><p>To address these challenges, we introduce Network-Aware Federated Neural Architecture Search (NAFNAS), an open-source federated neural network pruning framework with network emulation support. Through comprehensive testing, we demonstrate the feasibility of our approach, efficiently reducing DNN size and mitigating communication challenges. Additionally, we propose Network and Distribution Aware Client Grouping (NetDAG), a novel client grouping algorithm tailored for FL with diverse DNN architectures, considerably enhancing efficiency of communication rounds and update balance.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Context aware clustering and meta-heuristic resource allocation for NB-IoT D2D devices in smart healthcare applications
Pub Date: 2024-08-06  DOI: 10.1016/j.future.2024.08.001
The utilization of Device-to-Device (D2D) communication among Narrowband Internet of Things (NB-IoT) devices offers significant potential for advancing intelligent healthcare systems due to its superior data rates, low power consumption, and spectral efficiency. In D2D communication, strategies to mitigate interference and ensure coexistence with cellular networks are crucial. These strategies aim to enhance user data rates by optimally allocating spectrum and managing the transmission power of D2D devices, which presents a complex engineering challenge. Existing studies are limited either by the inadequate integration of NB-IoT D2D communication methods for healthcare, lacking intelligent, distributed, and autonomous decision-making for reliable data transmission, or by insufficient healthcare event management policies during resource allocation in smart healthcare systems. In this work, we introduce an Intelligent Resource Allocation for Smart Healthcare (iRASH) system, designed to optimize D2D communication within NB-IoT environments. iRASH integrates the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ant Colony Optimization (ACO) algorithms to address the unique requirements of healthcare applications. The proposed system utilizes Belief-Desire-Intention (BDI) agents for dynamic and intelligent clustering of D2D devices, facilitating autonomous decision-making and efficient resource allocation. This approach not only enhances data transmission rates but also reduces power consumption; the underlying allocation problem is formulated as a Multi-objective Integer Linear Programming (MILP) problem. Given the NP-hard nature of this problem, iRASH incorporates a polynomial-time meta-heuristic ACO algorithm that provides a suboptimal solution. This algorithm adheres to the principles of distributed D2D communication, promoting equitable resource distribution and substantial improvements in utility, energy efficiency, and scalability. Our system is validated through simulations on the Network Simulator version 3 (NS-3) platform, demonstrating significant advances over existing state-of-the-art solutions in terms of data rate, power efficiency, and system adaptability. Compared to the benchmark, iRASH demonstrates improvements of up to 35% in utility and 50% in energy cost, confirming its effectiveness. The outcomes highlight iRASH's potential to revolutionize D2D communications in smart healthcare settings, paving the way for more responsive and reliable IoT applications.
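The clustering stage could, for instance, look like the sketch below, which covers only the DBSCAN step (device coordinates, eps, and min_samples are made-up values); the resulting clusters would then feed the ACO-based spectrum/power allocator described above.

```python
# Illustrative clustering stage only (parameters are invented): group NB-IoT D2D
# devices by physical proximity with DBSCAN; each cluster could then be assigned
# spectrum and transmission power by a separate allocator.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
positions = np.vstack([
    rng.normal(loc=(0, 0), scale=5, size=(20, 2)),    # ward A devices (metres)
    rng.normal(loc=(60, 40), scale=5, size=(20, 2)),   # ward B devices
    rng.uniform(-100, 100, size=(5, 2)),                # scattered outliers
])
labels = DBSCAN(eps=10.0, min_samples=4).fit_predict(positions)
clusters = {c: np.where(labels == c)[0] for c in set(labels) if c != -1}
print({c: len(idx) for c, idx in clusters.items()}, "noise:", int((labels == -1).sum()))
```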
{"title":"Context aware clustering and meta-heuristic resource allocation for NB-IoT D2D devices in smart healthcare applications","authors":"","doi":"10.1016/j.future.2024.08.001","DOIUrl":"10.1016/j.future.2024.08.001","url":null,"abstract":"<div><p>The utilization of Device-to-Device (D2D) communication among Narrowband Internet of Things (NB-IoT) devices offers significant potential for advancing intelligent healthcare systems due to its superior data rates, low power consumption, and spectral efficiency. In D2D communication, strategies to mitigate interference and ensure coexistence with cellular networks are crucial. These strategies are aimed at enhancing user data rates by optimally allocating spectrum and managing the transmission power of D2D devices, presenting a complex engineering challenge. Existing studies are limited either by the inadequate integration of NB-IoT D2D communication methods for healthcare, lacking intelligent, distributed, and autonomous decision-making for reliable data transmission, or by insufficient healthcare event management policies during resource allocation in smart healthcare systems. In this work, we introduce an Intelligent Resource Allocation for Smart Healthcare (iRASH) system, designed to optimize D2D communication within NB-IoT environments. The iRASH innovatively integrates the Density-based Spatial Clustering of Applications with Noise (DBSCAN) and Ant Colony Optimization (ACO) algorithms to effectively address the unique requirements of healthcare applications. The proposed system utilizes Belief-Desire-Intention (BDI) agents for dynamic and intelligent clustering of D2D devices, facilitating autonomous decision-making and efficient resource allocation. This approach not only enhances data transmission rates but also reduces power consumption, and is formulated as a Multi-objective Integer Linear Programming (MILP) problem. Given the NP-hard nature of this problem, iRASH incorporates a polynomial-time meta-heuristic-based ACO algorithm, which provides a suboptimal solution. This algorithm adheres to the principles of distributed D2D communication, promoting equitable resource distribution and substantial improvements in utility, energy efficiency, and scalability. Our system is validated through simulations on the Network Simulator version 3 (NS-3) platform, demonstrating significant advancements over existing state-of-the-art solutions in terms of data rate, power efficiency, and system adaptability. As high as improvements of 35% in utility and 50% in energy cost are demonstrated by the iRASH system compared to the benchmark, proving its effectiveness. The outcomes highlight iRASH’s potential to revolutionize D2D communications in smart healthcare settings, paving the way for more responsive and reliable IoT applications.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Decentralised Identity Management solution for zero-trust multi-domain Computing Continuum frameworks
Pub Date: 2024-08-06  DOI: 10.1016/j.future.2024.08.003
The adoption of the Computing Continuum is characterised by the seamless integration of diverse computing environments and devices. In this dynamic landscape, sharing resources across the continuum is becoming a reality, and security must take a step forward, especially in terms of authentication and authorisation for such distributed and heterogeneous environments. The need for robust identity management is paramount and, in this regard, Decentralised Identity Management (DIM) emerges as a promising solution. It leverages decentralised technologies to secure and facilitate identity interactions across the Computing Continuum. In particular, to enhance security and privacy, it is desirable to apply the principles of Self-Sovereign Identity (SSI). In this paradigm, users have full ownership and control of their digital identities, which empowers individuals to manage and share their identity data on a need-to-know basis. These mechanisms can help improve security properties during continuum resource management operations. In this context, this paper presents the design, workflows and implementation of a solution that provides authentication/authorisation features to distributed zero-trust based infrastructures across the continuum, enhancing security in the resource sharing and resource acquisition stages. To this aim, the solution relies on key aspects such as decentralisation, interoperability, trust management and privacy-enhancing capabilities. Decentralisation leverages distributed ledger technologies, such as blockchain, to establish a decentralised identity ecosystem. The solution prioritises interoperability, enabling nodes to seamlessly access and share their identities across different domains and environments. Trustworthiness is at the core of DIM, and privacy is also considered, incorporating privacy-preserving techniques that allow individuals to selectively disclose identity attributes while safeguarding sensitive information. The implementation includes the operations needed to enhance continuum frameworks with decentralised authentication and authorisation features. Performance has been evaluated by measuring the impact of adopting the solution. The most expensive task, self-identity generation, takes only a few seconds in our deployment and is executed only once. Authorisation tasks operate in the millisecond range, a negligible overhead when incorporated into resource acquisition processes in frameworks such as Liqo, used in the scope of the FLUIDOS project.
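The sketch below illustrates only the selective-disclosure idea, using Python's standard library: the issuer signs salted attribute commitments, the holder reveals a chosen subset, and the verifier checks the revealed pairs against the signed digest. The HMAC stands in for a real asymmetric signature anchored in a ledger, and the attribute names are invented; this is not the paper's DIM implementation.

```python
# Toy selective-disclosure sketch (stdlib only, not a full SSI/DIM stack). In a real
# deployment the HMAC stand-in would be an asymmetric signature (e.g. Ed25519).
import hashlib, hmac, json, secrets

ISSUER_KEY = secrets.token_bytes(32)              # stand-in for the issuer's signing key

def commit(attrs):
    salted = {k: (secrets.token_hex(16), v) for k, v in attrs.items()}
    digests = {k: hashlib.sha256(f"{s}|{v}".encode()).hexdigest() for k, (s, v) in salted.items()}
    payload = json.dumps(digests, sort_keys=True).encode()
    signature = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return salted, {"digests": digests, "signature": signature}

def present(salted, disclose):
    return {k: salted[k] for k in disclose}       # holder reveals only selected (salt, value) pairs

def verify(credential, disclosed):
    payload = json.dumps(credential["digests"], sort_keys=True).encode()
    if not hmac.compare_digest(
            hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest(),
            credential["signature"]):
        return False
    return all(hashlib.sha256(f"{s}|{v}".encode()).hexdigest() == credential["digests"][k]
               for k, (s, v) in disclosed.items())

salted, credential = commit({"node_id": "edge-42", "domain": "fluidos-demo", "role": "provider"})
assert verify(credential, present(salted, ["role"]))      # disclose only the role attribute
```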
{"title":"Decentralised Identity Management solution for zero-trust multi-domain Computing Continuum frameworks","authors":"","doi":"10.1016/j.future.2024.08.003","DOIUrl":"10.1016/j.future.2024.08.003","url":null,"abstract":"<div><p>The adoption of the Computing Continuum is characterised by the seamless integration of diverse computing environments and devices. In this dynamic landscape, sharing resources across the continuum is becoming a reality and security must move an step forward, specially in terms of authentication and authorisation for such a distributed and heterogeneous environments. The need for robust identity management is paramount and, in this regard, Decentralised Identity Management (DIM) emerges as a promising solution. It leverages decentralised technologies to secure and facilitate identity interactions across the Computing Continuum. Particularly, to enhance security and privacy, it would be desirable to apply the principles of Self-Sovereign Identity (SSI). In this paradigm, users have full ownership and control of their digital identities that empowers individuals to manage and share their identity data on a need-to-know basis. These mechanisms could contribute to improve security properties during continuum resource management operations. In this context, this paper presents the design, workflows and implementation of a solution that provides authentication/authorisation features to distributed zero-trust based infrastructures across the continuum, enhancing security in resource sharing and resource acquisition stages. To this aim, the solution relies on key aspects like decentralisation, interoperability, trust management and privacy-enhancing capabilities. The decentralisation leverages distributed ledger technologies, such as blockchain, to establish a decentralised identity ecosystem. The solution prioritises interoperability, enabling nodes to seamlessly access and share their identities across different domains and environments. Trustworthiness is at the core of DIM, and privacy is also considered, incorporating privacy-preserving techniques that individuals to selectively disclose identity attributes while safeguarding sensitive information. The implementation includes different operations for allowing continuum frameworks to be enhanced with decentralised authentication and authorisation features. The performance has been evaluated measuring the impact for the adoption of the solution. The most expensive task, the self-identity generation, takes only a few seconds (in our deployment) and it is only executed once. 
Authorisation tasks operate in the millisecond range, which is a totally invaluable time if incorporated into resource acquisition processes in frameworks such as Liqo, used in the scope of FLUIDOS project.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24004291/pdfft?md5=b118fab0128173d8752d4ab90e0703c8&pid=1-s2.0-S0167739X24004291-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141915032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: 15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets
Pub Date: 2024-08-02  DOI: 10.1016/j.future.2024.07.050
The diverse landscape of distributed heterogeneous computer systems currently available, and being created to address computational challenges with the highest performance requirements, presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi-/many-core CPU processors with multiple attached accelerator devices, organised into compute nodes connected by interconnection networks of various topologies.
Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools: proprietary versions provided by integrators/vendors exclusively for their systems are complemented by portable, primarily open-source, equivalents developed and supported by the international research community over many years. The Scalasca and Paraver toolsets are two widely employed examples of the latter, installed on everything from personal notebook computers to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects, culminating in a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, together with the associated training and coaching of application developers (often in teams) in its productive use; both are reviewed in this article along with the lessons learnt.
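The abstract does not spell out the methodology's metrics, but the efficiency factors popularised by this tools community (load balance, communication efficiency, and their product, parallel efficiency) can be computed from per-rank timings as in the sketch below; the timing values are invented.

```python
# Hedged illustration: POP-style efficiency factors computed from per-rank timings.
# The methodology reviewed in the article is broader; the formulas here follow the
# commonly published definitions (load balance = avg/max useful compute).
def efficiency_factors(useful, total):
    """useful[i]: computation time of rank i; total[i]: wall-clock time of rank i."""
    avg_useful = sum(useful) / len(useful)
    max_useful = max(useful)
    runtime = max(total)
    load_balance = avg_useful / max_useful
    comm_efficiency = max_useful / runtime
    return {"load_balance": load_balance,
            "communication_efficiency": comm_efficiency,
            "parallel_efficiency": load_balance * comm_efficiency}

# Example: 4 ranks, one straggler dominated by communication/wait time.
print(efficiency_factors(useful=[9.0, 9.2, 8.8, 6.5], total=[10.0, 10.1, 9.9, 10.2]))
```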
{"title":"15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets","authors":"","doi":"10.1016/j.future.2024.07.050","DOIUrl":"10.1016/j.future.2024.07.050","url":null,"abstract":"<div><p>The diverse landscape of distributed heterogeneous computer systems currently available and being created to address computational challenges with the highest performance requirements presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi/manycore CPU processors with multiple attached acceleration devices structured within compute nodes with interconnection networks of various topologies.</p><p>Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools, with proprietary versions by integrators/vendors provided exclusively for their systems complemented by portable (primarily) open-source equivalents developed and supported by the international research community over many years. The <em>Scalasca</em> and <em>Paraver</em> toolsets are two widely employed examples of the latter, installed on personal notebook computers through to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects culminating in the creation of a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, and the associated training and coaching of application developers (often in teams) in its productive use, reviewed in this article with lessons learnt therefrom.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24004187/pdfft?md5=6d95fa0157a348afe6a2f74eeb9b1f7c&pid=1-s2.0-S0167739X24004187-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141915232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A cross-modal high-resolution image generation approach based on cloud-terminal collaboration for low-altitude intelligent network
Pub Date: 2024-08-02  DOI: 10.1016/j.future.2024.07.054
The advancement of digitization and automation in Low Altitude Intelligent Networking (LAIN) is constrained by limited computational resources and the absence of a dedicated modal transformation mechanism, affecting the performance of latency-sensitive missions. This study addresses these challenges by proposing a Downscaling Reconstruction Multi-scale Locally Focused Generative Adversarial Network (DR-MFGAN) with Federated Learning (FL). This integration employs wavelet-transform downscaling and zero-shot residual learning techniques to create noise-suppressed image pairs, ultimately facilitating high-quality image reconstruction. The core network is composed of multidimensional residual blocks and a generative adversarial network, and feature extraction is further enhanced through a cross-channel attention mechanism. Finally, distributed training based on Federated Learning ensures effective training for nodes with small data volumes. Experimental results demonstrate significant improvements: an 18.18% reduction in Mean Squared Error (MSE), a 33.52% increase in Peak Signal to Noise Ratio (PSNR), and a 39.54% improvement in Learned Perceptual Image Patch Similarity (LPIPS). The edge terminal can provide high-resolution imagery with limited data, achieving precise cross-modal transformations. This approach enhances LAIN capabilities, addressing computational and transformation challenges to support critical latency-sensitive missions.
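The wavelet-downscaling step alone can be illustrated as follows (a single level of Haar-style averaging in numpy on synthetic data); DR-MFGAN's residual blocks, attention, and adversarial training are not shown.

```python
# Sketch of the wavelet-downscaling step only: the approximation band gives a
# half-resolution, noise-attenuated image that can be paired with the original
# to train a reconstruction network.
import numpy as np

def haar_downscale(img):
    """img: 2-D array with even height/width; returns the approximation (LL-style) band."""
    rows = (img[0::2, :] + img[1::2, :]) / 2.0      # average adjacent rows
    return (rows[:, 0::2] + rows[:, 1::2]) / 2.0     # then adjacent columns

rng = np.random.default_rng(0)
clean = rng.uniform(0, 1, size=(256, 256))
noisy = clean + rng.normal(scale=0.05, size=clean.shape)
pair = (haar_downscale(noisy), clean)                # (low-res input, high-res target)
print(pair[0].shape, pair[1].shape)                  # (128, 128) (256, 256)
```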
{"title":"A cross-modal high-resolution image generation approach based on cloud-terminal collaboration for low-altitude intelligent network","authors":"","doi":"10.1016/j.future.2024.07.054","DOIUrl":"10.1016/j.future.2024.07.054","url":null,"abstract":"<div><p>The advancement of digitization and automation in Low Altitude Intelligent Networking (LAIN) is constrained by limited computational resources and the absence of a dedicated modal transformation mechanism, affecting the performance of latency-sensitive missions. This study addresses these challenges by proposing a Downscaling Reconstruction Multi-scale Locally Focused Generative Adversarial Network (DR-MFGAN) with Federated Learning (FL). This integration employs wavelet transform downscaling and zero-shot residual learning techniques to create noise-suppressed image pairs, ultimately facilitating high-quality image reconstruction. The core network structure is composed of multidimensional residual blocks and generative confrontation network, and feature extraction is further enhanced through cross channel attention mechanism. Finally, distributed training based on Federated Learning ensures the training effectiveness of nodes with small data volumes.Experimental results demonstrate significant improvements: an 18.18% reduction in Mean Squared Error (MSE), a 33.52% increase in Peak Signal to Noise Ratio (PSNR), and a 39.54% improvement in Learned Perceptual Image Patch Similarity (LPIPS). The edge terminal can provide high-resolution imagery with limited data, achieving precise cross-modal transformations. This approach enhances LAIN capabilities, addressing computational and transformation challenges to support critical latency-sensitive missions.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141915070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Self-adaptive asynchronous federated optimizer with adversarial sharpness-aware minimization
Pub Date: 2024-07-31  DOI: 10.1016/j.future.2024.07.045
The past years have witnessed the success of a distributed learning system called Federated Learning (FL). Recently, asynchronous FL (AFL) has demonstrated its potential for concurrency compared to mainstream synchronous FL. However, the inherent systematic and statistical heterogeneity presents several impediments to AFL: on the client side, discrepancies in trips and local model drift impede global performance enhancement; on the server side, dynamic communication leads to significant fluctuations in gradient arrival time, while asynchronously arriving gradients of ambiguous value are not fully leveraged. In this paper, we propose an adaptive AFL framework, ARDAGH, which systematically addresses the aforementioned challenges. Firstly, to address the discrepancies in client trips, ARDAGH ensures their convergence by incorporating only 1-bit feedback information into the downlink. Secondly, to counter client drift, ARDAGH generalizes the local models by employing our novel adversarial sharpness-aware minimization, which does not rely on additional global variables. Thirdly, in the face of gradient latency issues, ARDAGH employs a communication-aware dropout strategy to adaptively compress gradients and ensure similar transmission times. Finally, to fully unleash the potential of each gradient, we establish a consistent optimal direction by conceptualizing the aggregation as an optimizer with successive momentum. Based on the comprehensive solution provided by ARDAGH, an algorithm named FedAMO is derived, and its superiority is confirmed by experimental results obtained under challenging prototype and simulation settings. In particular, on typical sentiment analysis tasks, FedAMO demonstrates an improvement of up to 5.351% with a 20.056-fold acceleration compared to conventional asynchronous methods.
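As a generic illustration of treating aggregation as a server-side optimizer with momentum over asynchronously arriving updates (this is not FedAMO; the staleness discount and hyperparameters are assumptions):

```python
# Generic sketch: a server-side momentum optimizer applied to each asynchronously
# arriving client update, down-weighted by its staleness.
import numpy as np

class MomentumAggregator:
    def __init__(self, dim, lr=1.0, beta=0.9):
        self.w = np.zeros(dim)       # global model
        self.v = np.zeros(dim)       # server momentum buffer
        self.version = 0
        self.lr, self.beta = lr, beta

    def apply(self, client_delta, client_version):
        staleness = self.version - client_version
        scale = 1.0 / (1.0 + staleness)              # simple staleness discount
        self.v = self.beta * self.v + scale * client_delta
        self.w += self.lr * self.v
        self.version += 1
        return self.w

agg = MomentumAggregator(dim=4)
# Updates arrive out of order: the second one was computed against an older model.
agg.apply(np.array([0.1, -0.2, 0.0, 0.3]), client_version=0)
agg.apply(np.array([0.05, 0.1, -0.1, 0.0]), client_version=0)   # staleness = 1
```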
{"title":"Self-adaptive asynchronous federated optimizer with adversarial sharpness-aware minimization","authors":"","doi":"10.1016/j.future.2024.07.045","DOIUrl":"10.1016/j.future.2024.07.045","url":null,"abstract":"<div><p>The past years have witnessed the success of a distributed learning system called Federated Learning (FL). Recently, asynchronous FL (AFL) has demonstrated its potential in concurrency compared to mainstream synchronous FL. However, the inherent systematic and statistical heterogeneity has presented several impediments to AFL: On the client side, the discrepancies in trips and local model drift impede global performance enhancement; On the server side, dynamic communication leads to significant fluctuations in gradient arrival time, while asynchronous arrival gradients with ambiguous value are not fully leveraged. In this paper, we propose an adaptive AFL framework, ARDAGH, which systematically addresses the aforementioned challenges: Firstly, to address the discrepancies in client trips, ARDAGH ensures their convergence by incorporating only 1-bit feedback information into the downlink. Secondly, to counter the drift of clients, ARDAGH generalizes the local models by employing our novel adversarial sharpness-aware minimization, which does not necessitate reliance on additional global variables. Thirdly, in the face of gradient latency issues, ARDAGH employs a communication-aware dropout strategy to adaptively compress gradients to ensure similar transmission times. Finally, to fully unleash the potential of each gradient, we establish a consistent optimal direction by conceptualizing the aggregation as an optimizer with successive momentum. In light of the comprehensive solution provided by ARDAGH, an algorithm named FedAMO is derived, and its superiority is confirmed by experimental results obtained under challenging prototype and simulation settings. Particularly in typical sentiment analysis tasks, FedAMO demonstrates an improvement of up to 5.351% with a 20.056-fold acceleration compared to conventional asynchronous methods.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141915134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A study on characterizing energy, latency and security for Intrusion Detection Systems on heterogeneous embedded platforms
Pub Date: 2024-07-31  DOI: 10.1016/j.future.2024.07.051
Drone swarms are increasingly being used for critical missions and need to be protected against malicious users. Intrusion Detection Systems (IDS) are used to analyze network traffic in order to detect possible threats. Modern IDSs rely on machine learning models for this purpose. Optimizing the execution of resource-hungry IDS algorithms on resource-constrained drone devices, in terms of energy consumption, response time, memory footprint and guaranteed level of security, makes it possible to extend the duration of missions. In addition, the embedded platforms used in drones often incorporate heterogeneous computing platforms on which IDSs could be executed. In this paper, we present a methodology and results for characterizing the execution of different IDS models on various processing elements, namely Central Processing Units (CPU), Graphical Processing Units (GPU), Deep Learning Accelerators (DLA) and Field-Programmable Gate Arrays (FPGA). In effect, drones operate in different mission contexts in terms of criticality level, energy and memory budgets, and traffic load, so it is important to identify which IDS model to run on which processing element in a given context. To this end, we evaluated several metrics on different platforms: energy and resource consumption, accuracy of malicious traffic detection, and response time. Different models, namely Random Forests (RF), Convolutional Neural Networks (CNN) and Dense Neural Networks (DNN), have been implemented and characterized on different processing elements/platforms. This study has shown that matching the chosen implementation to the resources available on the drone is a judicious strategy. It highlights the disparity between the characteristics of IDS implementations. For example, the inference time ranges from 1.27 μs to 30 ms, the energy consumption per inference is between 10.7 μJ and 70 mJ, and the accuracy of the IDS models is between 65.73% and 81.59%. In addition, we develop a set of guidelines for choosing the best IDS model given a mission context.
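In the spirit of such guidelines, a selection step might look like the following sketch; the candidate figures are placeholders, not the measured values reported in the paper.

```python
# Hypothetical selection helper: among measured (model, platform) candidates, pick the
# most accurate one that fits the mission's latency and per-inference energy budgets.
from dataclasses import dataclass

@dataclass
class Candidate:
    model: str
    platform: str
    latency_ms: float
    energy_mj: float      # energy per inference
    accuracy: float

def select_ids(candidates, max_latency_ms, max_energy_mj):
    feasible = [c for c in candidates
                if c.latency_ms <= max_latency_ms and c.energy_mj <= max_energy_mj]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None

candidates = [
    Candidate("RF",  "CPU",  0.5,  0.2, 0.72),    # placeholder numbers
    Candidate("CNN", "GPU", 12.0, 45.0, 0.81),
    Candidate("DNN", "FPGA", 2.0,  5.0, 0.78),
]
print(select_ids(candidates, max_latency_ms=5.0, max_energy_mj=10.0))   # -> DNN on FPGA
```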
{"title":"A study on characterizing energy, latency and security for Intrusion Detection Systems on heterogeneous embedded platforms","authors":"","doi":"10.1016/j.future.2024.07.051","DOIUrl":"10.1016/j.future.2024.07.051","url":null,"abstract":"<div><p>Drone swarms are increasingly being used for critical missions and need to be protected against malicious users. Intrusion Detection Systems (IDS) are used to analyze network traffic in order to detect possible threats. Modern IDSs rely on machine learning models for this purpose. Optimizing the execution of resource-hungry IDS algorithms on resource-constrained drone devices, in terms of energy consumption, response time, memory footprint and guaranteed level of security, allows to extend the duration of missions. In addition, the embedded platforms used in drones often incorporate heterogeneous computing platforms on which IDSs could be executed. In this paper, we present a methodology and results about characterizing the execution of different IDS models on various processing elements, namely, Central Processing Units (CPU), Graphical Processing Units (GPU), Deep Learning Accelerators (DLA) and Field-Programmable Gate Array (FPGA). In effect, drones operate in different mission contexts in terms of criticality level, energy and memory budgets, and traffic load, so it is important to identify which IDS model to run on which processing element in a given context. For this sake, we evaluated several metrics on different platforms: energy and resource consumption, accuracy for malicious traffic detection and response time. Different models, namely Random Forests (RF), Convolutional Neural Networks (CNN) and Dense Neural Networks (DNN), have been implemented and characterized on different processing elements/platforms. This study has shown that relating the chosen implementation to the resources available on the drone is a judicious strategy to work on. It highlights the disparity between IDS implementations characteristics. For example, the inference time ranges from <span><math><mrow><mn>1</mn><mo>.</mo><mn>27</mn><mspace></mspace><mi>μ</mi><mi>s</mi></mrow></math></span> to 30 ms, the energy consumption per inference is between <span><math><mrow><mn>10</mn><mo>.</mo><mn>7</mn><mspace></mspace><mi>μ</mi><mi>J</mi></mrow></math></span> and 70 mJ, and the accuracy of the IDS models is between 65.73% and 81.59%. In addition, we develop a set of guidelines for choosing the best IDS model given a mission context.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Harnessing federated learning for anomaly detection in supercomputer nodes
Pub Date: 2024-07-31  DOI: 10.1016/j.future.2024.07.052
High-performance computing (HPC) systems are a crucial component of modern society, with a significant impact in areas ranging from economics to scientific research, thanks to their unrivaled computational capabilities. For this reason, the worldwide installed base of HPC systems is trending steeply upwards, with no sign of slowing down. However, these machines are complex (comprising millions of heterogeneous components), hard to manage effectively, and very costly, both in economic investment and in energy consumption. Therefore, maximizing their productivity is of paramount importance. For instance, anomalies and faults can generate significant downtime because they are difficult to detect promptly, as there are potentially many sources of issues preventing the correct functioning of computing nodes.
In recent years, several data-driven methods have been proposed to automatically detect anomalies in HPC systems, exploiting the fact that modern supercomputers are typically endowed with fine-grained monitoring infrastructures that collect data which can be used to characterize system behavior. Thus, it is possible to teach Machine Learning (ML) models to distinguish normal and anomalous states automatically. In this paper, we contribute to this line of research with a novel intuition, namely exploiting Federated Learning (FL) to improve the accuracy of anomaly detection models for HPC nodes. Although FL is not typically exploited in the HPC context, we show that FL can boost several types of underlying ML models, from supervised to unsupervised ones. We demonstrate our approach on a production Tier-0 supercomputer hosted in Italy. Applying FL to anomaly detection improves the average f-score from 0.46 to 0.87. Our research also shows that FL can reduce the data collection time required to develop a representative data set, facilitating faster deployment of anomaly detection models. ML models need 5 months of training data for effective anomaly detection, whereas using FL reduces the required training set 15-fold, to 1.25 weeks.
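A minimal sketch of the overall FL pattern follows (assuming per-node labelled telemetry, a plain logistic-regression detector, and FedAvg aggregation; none of this reproduces the paper's models or data): each node trains locally on its own monitoring data and only model weights are exchanged.

```python
# Minimal FedAvg sketch for per-node anomaly classifiers (synthetic data throughout).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w, X, y, lr=0.5, epochs=20):
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)   # logistic-regression gradient
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
d, nodes = 6, []
w_true = rng.normal(size=d)
for _ in range(5):                                    # five compute nodes
    X = rng.normal(size=(200, d))
    y = (sigmoid(X @ w_true) > 0.5).astype(float)     # synthetic anomaly labels
    nodes.append((X, y))

w_global = np.zeros(d)
for _ in range(20):                                   # federated rounds
    w_global = np.mean([local_train(w_global, X, y) for X, y in nodes], axis=0)

X_test, y_test = nodes[0]
acc = np.mean((sigmoid(X_test @ w_global) > 0.5) == y_test)
print(f"node-0 accuracy after FL: {acc:.2f}")
```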
{"title":"Harnessing federated learning for anomaly detection in supercomputer nodes","authors":"","doi":"10.1016/j.future.2024.07.052","DOIUrl":"10.1016/j.future.2024.07.052","url":null,"abstract":"<div><p>High-performance computing (HPC) systems are a crucial component of modern society, with a significant impact in areas ranging from economics to scientific research, thanks to their unrivaled computational capabilities. For this reason, the worldwide HPC installation is steeply trending upwards, with no sign of slowing down. However, these machines are both complex, comprising millions of heterogeneous components, hard to effectively manage, and very costly (both in terms of economic investment and of energy consumption). Therefore, maximizing their productivity is of paramount importance. For instance, anomalies and faults can generate significant downtime due to the difficulty of promptly detecting them, as there are potentially many sources of issues preventing the correct functioning of computing nodes.</p><p>In recent years, several data-driven methods have been proposed to automatically detect anomalies in HPC systems, exploiting the fact that modern supercomputers are typically endowed with fine-grained monitoring infrastructures, collecting data that can be used to characterize the system behavior. Thus, it is possible to teach Machine Learning (ML) models to distinguish normal and anomalous states automatically. In this paper, we contribute to this line of research with a novel intuition, namely exploiting Federated Learning (FL) to improve the accuracy of anomaly detection models for HPC nodes. Although FL is not typically exploited in the HPC context, we show that FL can boost several types of underlying ML models, from supervised to unsupervised ones. We demonstrate our approach on a production Tier-0 supercomputer hosted in Italy. Applying FL to anomaly detection improves the average f-score from 0.46 to 0.87. Our research also shows FL can reduce the data collection time required to develop a representation data set, facilitating faster deployment of anomaly detection models. ML models need 5 months of training data for efficient anomaly detection performance while using FL reduces the training set by 15 times to 1.25 weeks.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141915220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}