Future Generation Computer Systems-The International Journal of Escience: Latest Articles

SWIM: Sliding-Window Model contrast for federated learning
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-11-03 DOI: 10.1016/j.future.2024.107590
In federated learning, data heterogeneity leads to significant differences among the local models learned by the clients, which degrades the performance of the global model. To address this issue, contrastive federated learning algorithms add a contrast between positive and negative samples on the clients, pulling the local models closer to the global model. However, existing methods take the global model as the positive sample and only the previous round's local model as the negative sample, so historical local models are underused. In this paper, we propose SWIM (Sliding-WIndow Model contrast), a method that draws on more rounds of local models. First, we design a sliding-window mechanism that collects client representations from historical local models. We then use the cosine distance function as a discriminator to separate them into positive and negative samples. In addition, we introduce a dynamic coefficient that balances the federated classification-learning and feature-learning tasks. By adjusting this coefficient across training rounds, the global model focuses more on feature learning in the early stages and on classification learning in the later stages. We compare SWIM with four state-of-the-art federated learning algorithms on three datasets; the results show that it outperforms all four in terms of accuracy. Source code is available at https://github.com/zhanghrswpu/SWIM.
Citations: 0
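The SWIM abstract above names three pieces: a sliding window over historical local representations, a cosine-distance discriminator, and a round-dependent coefficient. The following is a minimal Python sketch of how such pieces might fit together. The threshold rule in `split`, the linear decay in `dynamic_coefficient`, and every name used here are illustrative assumptions, not taken from the paper or its repository.

```python
from collections import deque

import numpy as np


def cosine_distance(a, b):
    """Cosine distance in [0, 2]; 0 means the vectors point the same way."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


class SlidingWindow:
    """Keeps the representations from the most recent `size` rounds."""

    def __init__(self, size):
        # deque with maxlen drops the oldest entry automatically.
        self.buffer = deque(maxlen=size)

    def push(self, rep):
        self.buffer.append(np.asarray(rep, dtype=float))

    def split(self, global_rep, threshold):
        """Assumed rule: close to the global representation -> positive."""
        positives, negatives = [], []
        for rep in self.buffer:
            if cosine_distance(rep, global_rep) <= threshold:
                positives.append(rep)
            else:
                negatives.append(rep)
        return positives, negatives


def dynamic_coefficient(round_idx, total_rounds, mu_max=1.0):
    """Assumed schedule: weight on the feature-learning loss decays linearly,
    so early rounds favor feature learning and late rounds classification."""
    return mu_max * (1.0 - round_idx / total_rounds)
```

A total loss in this style would be `classification_loss + dynamic_coefficient(r, R) * contrastive_loss`, with the positives and negatives supplied by `split`.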
Heterogeneous system list scheduling algorithm based on improved optimistic cost matrix
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-28 DOI: 10.1016/j.future.2024.107576
In heterogeneous computing systems, efficient task-scheduling methods are paramount for enhancing computational performance. However, the existing algorithm exhibits certain deficiencies, notably its neglect of load balancing and its inadequate emphasis on the out-degree property of tasks. To address these issues, a novel list scheduling algorithm, Average Earliest Finish Time (AEFT), is proposed, which allocates task flows onto heterogeneous processors. The AEFT algorithm consists of two key stages: (1) prioritizing tasks to determine the distribution of task priorities and (2) assigning optimal processors to tasks with given priorities. By leveraging its specific topology, the AEFT algorithm minimizes the scheduling length of task flows. In addition, a prediction mechanism is proposed for the task-prioritization and processor-selection stages to reduce the scheduling time of task flows. In the processor-selection stage, the AEFT algorithm also considers the out-degree characteristics of tasks, which mitigates processor load imbalance. Experiments on randomly generated and real-application graphs show that AEFT outperforms prior list scheduling algorithms in makespan, speedup, and the percentage of occurrences of better solutions. For t tasks and p processors, the AEFT algorithm has a time complexity of O(t^2 p).
Citations: 0
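The two-stage structure described in the AEFT abstract above (prioritize, then pick a processor per task) is common to list schedulers. The sketch below shows only the generic second stage: given a priority order, place each task on the processor with the earliest finish time. It is not AEFT itself; the improved optimistic cost matrix, the prediction mechanism, and the out-degree term are not reproduced, and all names and the input layout are assumptions.

```python
def list_schedule(order, preds, cost, comm):
    """Greedy earliest-finish-time placement.

    order: tasks in priority order (must respect precedence).
    preds: dict task -> list of predecessor tasks.
    cost:  dict task -> list of execution times, one per processor.
    comm:  dict (pred, task) -> transfer time when the two tasks
           run on different processors (0 if on the same one).
    """
    n_proc = len(next(iter(cost.values())))
    proc_ready = [0.0] * n_proc      # time each processor becomes free
    finish, placed = {}, {}
    for t in order:
        best = None
        for p in range(n_proc):
            # Inputs arrive when every predecessor has finished and shipped.
            data_ready = max(
                (finish[u] + (0.0 if placed[u] == p else comm.get((u, t), 0.0))
                 for u in preds.get(t, [])),
                default=0.0,
            )
            eft = max(proc_ready[p], data_ready) + cost[t][p]
            if best is None or eft < best[0]:
                best = (eft, p)
        finish[t], placed[t] = best
        proc_ready[best[1]] = best[0]
    return placed, max(finish.values())
```

The nested loops visit every task and processor pair and scan up to t predecessors each time, which gives this sketch the same O(t^2 p) order as the bound quoted in the abstract.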
The Fast Inertial ADMM optimization framework for distributed machine learning
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-28 DOI: 10.1016/j.future.2024.107575
The ADMM (Alternating Direction Method of Multipliers) optimization framework is known for its property of decomposition and assembly, which effectively bridges distributed computing and optimization algorithms, making it well-suited for distributed machine learning in the context of big data. However, it suffers from slow convergence speed and lacks the ability to coordinate worker computations, resulting in inconsistent speeds in solving subproblems in distributed systems and mutual waiting among workers. In this paper, we propose a novel optimization framework to address these challenges in support vector regression (SVR) and probit regression training through the FIADMM (Fast Inertial ADMM). The key concept of the FIADMM lies in the introduction of inertia acceleration and an adaptive subproblem iteration mechanism based on the ADMM, aimed at accelerating convergence speed and reducing the variance in solving speeds among workers. Further, we prove that FIADMM has a fast linear convergence rate O(1/k). Experimental results on six benchmark datasets demonstrate that the proposed FIADMM significantly enhances convergence speed and computational efficiency compared to multiple baseline algorithms and related efforts.
Citations: 0
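To make the idea of inertia in ADMM concrete, here is a small Python sketch of consensus ADMM with a fixed extrapolation step, applied to distributed least squares. This is a generic textbook-style variant chosen for brevity, not the FIADMM algorithm: the paper targets SVR and probit regression and adds an adaptive subproblem iteration mechanism, neither of which appears here. The function name, the fixed `alpha`, and the choice of `rho` are assumptions.

```python
import numpy as np


def inertial_consensus_admm(A_parts, b_parts, rho=10.0, alpha=0.2, iters=300):
    """Each worker i holds (A_i, b_i) and minimizes 0.5 * ||A_i x - b_i||^2;
    all workers must agree on a shared vector z."""
    n = A_parts[0].shape[1]
    m = len(A_parts)
    z, z_prev = np.zeros(n), np.zeros(n)
    u, u_prev = np.zeros((m, n)), np.zeros((m, n))
    # Pre-compute each worker's local solve (A_i^T A_i + rho I)^-1.
    inv = [np.linalg.inv(A.T @ A + rho * np.eye(n)) for A in A_parts]
    Atb = [A.T @ b for A, b in zip(A_parts, b_parts)]
    for _ in range(iters):
        # Inertial step: extrapolate along the previous direction of travel.
        z_hat = z + alpha * (z - z_prev)
        u_hat = u + alpha * (u - u_prev)
        z_prev, u_prev = z, u
        # Local subproblems (would run in parallel, one per worker).
        x = np.stack([inv[i] @ (Atb[i] + rho * (z_hat - u_hat[i]))
                      for i in range(m)])
        # Consensus (averaging) step, then scaled dual update.
        z = (x + u_hat).mean(axis=0)
        u = u_hat + x - z
    return z
```

Setting `alpha=0.0` recovers plain consensus ADMM, which makes it easy to compare iteration counts with and without the inertial term on the same data.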
Review of deep learning-based pathological image classification: From task-specific models to foundation models
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-28 DOI: 10.1016/j.future.2024.107578
Pathological diagnosis is considered the gold standard in cancer diagnosis, playing a crucial role in guiding treatment decisions and prognosis assessment for patients. However, achieving accurate diagnosis of pathology images poses several challenges, including the scarcity of pathologists and the inherent subjective variability in their interpretations. The advancements in whole-slide imaging technology and deep learning methods provide new opportunities for digital pathology, especially in low-resource settings, by enabling effective pathological image classification. In this article, we begin by introducing the datasets, which include both unimodal and multimodal types, as essential resources for advancing pathological image classification. We then provide a comprehensive overview of deep learning-based pathological image classification models, covering task-specific models such as supervised, unsupervised, weakly supervised, and semi-supervised learning methods, as well as unimodal and multimodal foundation models. Next, we review tumor-related indicators that can be predicted from pathological images, focusing on two main categories: indicators that can be recognized by pathologists, such as tumor classification, grading, and region recognition; and those that cannot be recognized by pathologists, including molecular subtype prediction, tumor origin prediction, biomarker prediction, and survival prediction. Finally, we summarize the key challenges in digital pathology and propose potential future directions.
Citations: 0
Learning protein language contrastive models with multi-knowledge representation
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-25 DOI: 10.1016/j.future.2024.107580
Protein representation learning plays a crucial role in obtaining a comprehensive understanding of biological regulatory mechanisms and in developing proteins and drugs for therapeutic purposes. However, labeled proteins, such as sequenced and functionally annotated data, are incomplete and few. Thus, contrastive learning has emerged as the preferred technique for learning meaningful representations from unlabeled data samples. In addition, at present, natural proteins cannot be fully described by extracting protein knowledge from a single domain. Therefore, Pro-CoRL, a protein contrastive models framework based on multi-knowledge representation learning, was proposed in this study. In particular, Pro-CoRL smooths the objective function using convex approximation, thereby improving the stability of training. Extensive experiments on predicting protein–protein interaction types and clustering protein families have confirmed the high accuracy and robustness of Pro-CoRL.
Citations: 0
Multi-round decentralized dataset distillation with federated learning for Low Earth Orbit satellite communication
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-24 DOI: 10.1016/j.future.2024.107570
Satellite communication and Low Earth Orbit (LEO) satellites are important components of the 6G network, widely used for Earth observation tasks due to their low cost and short return period, making them a key technology for 6G network connectivity. Due to limitations in satellite system technology and downlink bandwidth, it is not feasible to download all high-resolution image information to ground stations. Even in existing federated learning (FL) methods, sharing well-trained parts of the model can still bottleneck with increasing model size. To address these challenges, we propose a new federated learning framework (FL-M3D) for LEO satellite communication that employs multi-round decentralized dataset distillation techniques. It allows satellites to independently extract local datasets and transmit them to ground stations instead of exchanging model parameters. Communication costs depend only on the size of the synthesized dataset and do not increase with larger models. However, the heterogeneity of satellite datasets can lead to sample ambiguity and decreased model convergence speed. Therefore, we propose distilling the datasets to mitigate the negative effects of data heterogeneity. Through experiments using real-world image datasets, FL-M3D reduces communication volume in simulated satellite networks by approximately 49.84% and achieves improved model performance.
Citations: 0
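The FL-M3D abstract above states that communication cost depends only on the size of the synthesized dataset, not on the model. A back-of-the-envelope Python sketch of that arithmetic follows. The parameter counts, image shape, images per class, and 4-byte values are made-up illustrative numbers, not figures from the paper, and the helper names are hypothetical.

```python
import math


def model_payload_bytes(num_params, bytes_per_param=4):
    """Uplink size when a client ships its model parameters."""
    return num_params * bytes_per_param


def distilled_payload_bytes(images_per_class, num_classes, image_shape,
                            bytes_per_value=4):
    """Uplink size when a client ships a small synthesized dataset instead."""
    values_per_image = math.prod(image_shape)
    return images_per_class * num_classes * values_per_image * bytes_per_value


# Hypothetical comparison: two model sizes against one distilled set.
small_model = model_payload_bytes(11_000_000)       # 44,000,000 bytes
large_model = model_payload_bytes(110_000_000)      # 440,000,000 bytes
distilled = distilled_payload_bytes(10, 10, (3, 32, 32))  # 1,228,800 bytes
```

Growing the model tenfold multiplies the parameter payload by ten, while the distilled payload stays fixed because it is a function of the dataset shape alone.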
Cloud-based solution for urbanization monitoring using satellite images
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-24 DOI: 10.1016/j.future.2024.107579
Motivated by the large amount of available satellite data and the growing interest in the study of urbanization, this paper presents a way to better monitor urbanization, as more and more people seek a higher quality of life by migrating to urban areas. The project is particularly useful for environmental researchers and citizens who want to make informed decisions. It uses Sentinel Hub, a multi-spectral satellite imagery cloud service, to access Sentinel 2 data and automatically detect changes in Romania’s urban environment. Sentinel Hub’s spectral bands, which describe the reflectance properties of a surface, are used to compute spectral indices that highlight patterns in satellite images. The paper analyzes two urban indices that successfully map built-up regions and a vegetation index that assesses the degree of vegetation in an urbanized area. It employs different methods to enhance each index and evaluates its performance in a town that has seen rapid urban expansion.
Citations: 0
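Spectral indices of the kind mentioned in the abstract above are typically normalized band differences. The Python sketch below computes two widely used ones, NDVI (vegetation) and NDBI (built-up), from reflectance arrays. The abstract does not say which indices the paper uses, so the choice of NDVI and NDBI is an assumption; on Sentinel 2 these would normally be fed with bands B08 (near infrared), B04 (red), and B11 (short-wave infrared).

```python
import numpy as np


def normalized_difference(a, b):
    """(a - b) / (a + b), with 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def ndvi(nir, red):
    """Vegetation index: high where near-infrared reflectance dominates red."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Built-up index: high where short-wave infrared dominates near infrared."""
    return normalized_difference(swir, nir)
```

Differencing the same index between two acquisition dates gives a simple per-pixel change map, which is one way to flag newly built-up areas.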
CBWO: A Novel Multi-objective Load Balancing Technique for Cloud Computing
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-24 DOI: 10.1016/j.future.2024.107561
In cloud computing systems, the growing demand for diverse applications has led to challenges in resource allocation and workload distribution, resulting in increased energy consumption and computational costs. To address these challenges, we propose a novel load-balancing method, namely CBWO, that integrates Chaos theory with the Black Widow Optimization algorithm. Our approach is designed to optimize cloud computing environments by improving energy efficiency and resource utilization. We employ CloudSim for simulations, evaluating key performance metrics such as energy consumption, resource utilization, makespan, task completion time, and imbalance degree. The experimental results demonstrate the superiority of our method, achieving average improvements of 67.28% in makespan and 29.03% in energy consumption compared to existing solutions.
Citations: 0
SWQC: Efficient sequencing data quality control on the next-generation Sunway platform
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-24 DOI: 10.1016/j.future.2024.107577
Sequencing data quality control can significantly prevent low-quality data from impacting downstream applications in bioinformatics. The enormous growth of biological sequencing data in recent years introduces new challenges to the efficiency of quality control processes and motivates the need for fast implementations on modern compute systems. The powerful next-generation heterogeneous Sunway platform holds significant potential for addressing this challenge. However, there are currently no dedicated quality control applications that can fully utilize its computational power. To bridge this gap, we introduce SWQC, a novel quality control application specifically designed for the Sunway platform. We present an efficient distributed FASTQ I/O framework for Sunway-based workstations and supercomputers to take advantage of fast SSDs and the parallel file system. To support both process-level and thread-level (CPE-level) parallelism and thereby leverage the available computational power, we refactor and optimize all standard quality control modules for the heterogeneous Sunway architecture. When using a single node, SWQC achieves speedups between 2× and 40× over highly optimized quality control applications executed on a high-end 48-core AMD server. Additionally, when using 16 nodes, SWQC achieves parallel efficiencies of 70% (for reading and writing a single file) and 95% (for reading one file and writing split files) compared to a single node. Overall, SWQC is able to perform quality control operations for a 140 GB FASTQ file within only 70 s using a single Sunway node. It is publicly available at https://github.com/RabbitBio/SWQC.
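To make the reported figures concrete, the short sketch below applies the standard definition of parallel efficiency, E = T_1 / (N × T_N). It is only arithmetic on the numbers quoted in the abstract; the function names are hypothetical, and the abstract does not state exactly how the authors measured efficiency.

```python
def parallel_efficiency(t_single: float, t_parallel: float, nodes: int) -> float:
    """Standard parallel efficiency: single-node time over (nodes * multi-node time)."""
    return t_single / (nodes * t_parallel)


def implied_speedup(efficiency: float, nodes: int) -> float:
    """Speedup over one node implied by a given efficiency on `nodes` nodes."""
    return efficiency * nodes


def throughput_gb_per_s(size_gb: float, seconds: float) -> float:
    """Average processing throughput in GB/s."""
    return size_gb / seconds


if __name__ == "__main__":
    # 70% and 95% efficiency on 16 nodes, as quoted in the abstract.
    print(implied_speedup(0.70, 16))
    print(implied_speedup(0.95, 16))
    # 140 GB processed within 70 s on a single node.
    print(throughput_gb_per_s(140.0, 70.0))
```

Under this definition, 70% efficiency on 16 nodes corresponds to a speedup of about 11.2× over a single node and 95% to about 15.2×, while 140 GB in 70 s corresponds to an average single-node throughput of at least 2 GB/s.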
Citations: 0
Efficient security interface for high-performance Ceph storage systems
IF 6.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2024-10-23 DOI: 10.1016/j.future.2024.107571
Ceph is a resilient clustered storage solution that supports object, block, and file storage with no single point of failure. Despite these strengths, data confidentiality remains a concern, as authentication and access control are the only data protection security services in Ceph. CephArmor was proposed as a third-party security interface that protects data confidentiality by adding an extra protection layer to data at rest. Despite the added layer, the initial design of the API needed to be more efficient in addressing security and performance simultaneously. In this study, we propose a new architectural design to address the issues associated with the preliminary prototype. Comprehensive performance and security analyses verify the improvement of the proposed method over the initial approach. The benchmark results indicate an average improvement of 37% in IOPS, elapsed time, and bandwidth for the write benchmark compared to the initial model.
Citations: 0