Mapping and just-in-time traffic congestion mitigation for emergency vehicles in smart cities
Pub Date: 2024-09-15 | DOI: 10.1007/s00607-024-01345-3
Syed Ali Haider, Junaid A. Zubairi, Sahar Idwan
Traffic congestion in urban areas poses several challenges to municipal authorities, including pollution, productivity loss, reckless driving, and delays in dealing with emergencies. Smart cities can use modern IoT infrastructure to solve the congestion problem and reduce pollution and delays. In this article, we focus on congestion mapping and mitigation for emergency vehicles in smart cities. We use a novel traffic light control technique to change the flow of cars at the traffic lights of interest, thereby making way for emergency vehicles. We use a simulation model of a selected area of Manhattan to implement congestion mapping and to help find the fastest path for routing emergency vehicles based on the congestion metrics. The system controls traffic lights to block off the roads feeding into the congestion and allows flow away from the congested path. This helps clear the preferred route so that emergency vehicles reach their destination faster. We show that the proposed algorithm can map congestion on city roads accurately, thus helping to improve the response time of emergency services and save precious lives.
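The routing details are not spelled out in the abstract; as a minimal sketch of the general idea, the snippet below runs Dijkstra's algorithm over a small congestion-weighted road graph to pick the fastest route and then flags the intersections that feed into that route, so their lights could hold cross-traffic. The graph, weights, and blocking rule are all illustrative assumptions, not the paper's implementation.

```python
import heapq

def fastest_path(graph, source, target):
    """Dijkstra over a congestion-weighted road graph.
    graph: {node: [(neighbor, congestion_weight), ...]} (weights stand in for travel times)."""
    dist = {source: 0.0}
    prev = {}
    pq = [(0.0, source)]
    visited = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the preferred route for the emergency vehicle.
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return list(reversed(path))

def lights_to_block(graph, route):
    """Hypothetical mitigation step: every intersection that feeds traffic into the
    preferred route (but is not on it) is flagged so its light can hold cross-traffic."""
    on_route = set(route)
    feeders = set()
    for u, edges in graph.items():
        if u in on_route:
            continue
        if any(v in on_route for v, _ in edges):
            feeders.add(u)
    return feeders

if __name__ == "__main__":
    # Toy 4-intersection network; weights stand in for measured congestion.
    g = {"A": [("B", 2.0), ("C", 5.0)],
         "B": [("D", 2.0)],
         "C": [("D", 1.0)],
         "D": []}
    route = fastest_path(g, "A", "D")
    print("preferred route:", route)                              # ['A', 'B', 'D']
    print("feeder lights to hold:", lights_to_block(g, route))    # {'C'}
```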
Fog intelligence for energy efficient management in smart street lamps
Pub Date: 2024-09-13 | DOI: 10.1007/s00607-024-01348-0
J. Angela Jennifa Sujana, R. Venitta Raj, V. K. Raja Priya
Street lamps are a great asset to human society, providing narrow-beam spread lighting. The extensive proliferation of solar power in street lamps causes power outages due to their variable power-generation profiles. Thus, a Smart Street Lamp Fog Intelligence (SSLFI) framework based on hierarchical learning is proposed for efficient energy management in solar street lamps. A Smart Street Lamp (SSL) shifts its brightness between higher and lower light levels with a comforting, energy-efficient gleam of light. The fog intelligence framework forecasts the SSL output power through short-term probabilistic energy consumption forecasts using a Q-NARX-BiLSTM (Quantile Regression Nonlinear Auto-Regressive Neural Network with exogenous input, Bidirectional Long Short-Term Memory) model. NARX-BiLSTM consists of two modules: (1) a NARXNN (Nonlinear Auto-Regressive Neural Network with exogenous input) model that generates SSL power consumption estimates, and (2) a BiLSTM (Bidirectional Long Short-Term Memory) model that generates SSL power forecasts. Quantile regression with the NARX-BiLSTM model forecasts the seasonal patterns, achieving non-parametric interval predictions. The probabilistic predictions of power consumption are determined from the conditional quantiles using an improved kernel density estimation approach. A fuzzy inference system uses the forecasting results to diagnose fault conditions in street lamps. The experimental results show that the proposed SSLFI framework outperforms state-of-the-art forecasting models under different weather conditions.
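The full Q-NARX-BiLSTM model is beyond a short example, but its quantile-regression ingredient can be shown compactly. The sketch below fits a linear autoregressive model with one exogenous input by minimizing the pinball (quantile) loss with subgradient descent on synthetic lamp data; the linear forecaster, features, and data are assumptions standing in for the paper's neural setup.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: penalizes under- and over-prediction asymmetrically."""
    e = y_true - y_pred
    return np.mean(np.maximum(q * e, (q - 1) * e))

def fit_linear_quantile(X, y, q, lr=0.05, epochs=2000):
    """Fit w, b minimizing the pinball loss by subgradient descent (toy NARX stand-in)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        e = y - (X @ w + b)
        # Subgradient of the pinball loss w.r.t. the prediction: -q where e > 0, (1 - q) otherwise.
        g = np.where(e > 0, -q, 1 - q)
        w -= lr * (X.T @ g) / n
        b -= lr * g.mean()
    return w, b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(500)
    irradiance = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24)            # exogenous input (assumed)
    power = 0.8 * irradiance + rng.gamma(2.0, 0.1, size=t.size)     # lamp energy series (synthetic)
    # NARX-style features: previous power value plus the exogenous input.
    X = np.column_stack([power[:-1], irradiance[1:]])
    y = power[1:]
    for q in (0.1, 0.5, 0.9):
        w, b = fit_linear_quantile(X, y, q)
        preds = X @ w + b
        print(f"q={q}: pinball loss={pinball_loss(y, preds, q):.4f}, "
              f"coverage={np.mean(y <= preds):.2f}")   # coverage is expected to land near q
```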
Contextual authentication of users and devices using machine learning
Pub Date: 2024-09-13 | DOI: 10.1007/s00607-024-01333-7
Divyans Mahansaria, Uttam Kumar Roy
At the time of authentication, confidential data are exchanged between the user/device and the authentication server to determine the legitimacy of the source requesting authentication. Safeguarding the authentication process from security attacks is of utmost importance, and various authentication methods exist depending on the system’s requirements. However, no authentication process can guarantee foolproof security. This research aimed to use the context of users and devices during authentication to detect anomalies and security-related attacks. In particular, denial-of-service (DoS)/distributed denial-of-service (DDoS) attacks and brute-force attacks have been analyzed in detail using contextual information. Extensive simulations were conducted on the benchmark CIC-IDS2017 dataset using the Weka tool. The performance metrics of recall, precision, accuracy, F-score, and model-building time were computed for four machine-learning classifiers—J48, Random Forest, Multi-Layer Perceptron, and Bayes Net—for different combinations of data splits and groups of data features. For both DoS/DDoS and brute-force attacks, some of the experimental results show values above 99% for recall, precision, accuracy, and F-score. The results of the experiments, security analysis, and threat modeling show that the proposed authentication scheme effectively enhances the system’s security level.
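The experiments in the paper use Weka; the sketch below mirrors the same evaluation loop in Python with scikit-learn stand-ins (DecisionTreeClassifier approximating J48, GaussianNB approximating Bayes Net) on a synthetic dataset, since CIC-IDS2017 is not bundled here. The classifier settings, split ratio, and data are assumptions.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for contextual features (e.g., flow statistics) labeled attack/benign.
X, y = make_classification(n_samples=5000, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

classifiers = {
    "J48 (approximated by CART)": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Multi-Layer Perceptron": MLPClassifier(max_iter=500, random_state=42),
    "Bayes Net (approximated by GaussianNB)": GaussianNB(),
}

for name, clf in classifiers.items():
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    build_time = time.perf_counter() - start           # model-building time
    pred = clf.predict(X_test)
    print(f"{name}: recall={recall_score(y_test, pred):.3f} "
          f"precision={precision_score(y_test, pred):.3f} "
          f"accuracy={accuracy_score(y_test, pred):.3f} "
          f"f1={f1_score(y_test, pred):.3f} build_time={build_time:.2f}s")
```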
Multi-objective service composition optimization problem in IoT for agriculture 4.0
Pub Date: 2024-09-11 | DOI: 10.1007/s00607-024-01346-2
Shalini Sharma, Bhupendra Kumar Pathak, Rajiv Kumar
The Internet of Things (IoT) is one of the most well-known technologies to have recently attained new heights and set a standard. IoT aims to connect all physical devices in such a way that they can be controlled by humans over the Internet. The emergence of IoT has reshaped almost every industry, including smart agriculture. In today’s world, growth in the agriculture sector is more rapid, smarter, and more precise than ever. In IoT, objects are termed services, sometimes with similar functionalities but distinct quality-of-service parameters. As users’ requirements are complex, a single service cannot fulfil them efficiently, so service composition is the solution. These services, known as atomic services, are represented as a workflow, with each of them having distinct candidate composite services. Fulfilling the Quality of Service (QoS) constraints makes this an NP-hard problem that cannot be solved using traditional approaches; hence, evolutionary approaches are used. In this paper, one of the evolutionary approaches, NSGA-II, is used to optimize apple production by composing the various services, treating cost and time as a multi-objective problem. To the best of our knowledge, this is the first time the QoS-aware service composition problem has been optimized in smart agriculture. Results are further compared with a multi-objective genetic algorithm (MOGA), and it is found that NSGA-II outperforms MOGA by generating well-proportioned Pareto-optimal solutions.
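NSGA-II itself involves non-dominated sorting, crowding distance, and genetic operators; the sketch below only illustrates the problem it would operate on, assuming a sequential workflow whose atomic tasks each have candidate services with (cost, time) QoS values, and applies a plain Pareto filter to randomly sampled compositions. All QoS numbers are invented for illustration.

```python
import random

# Hypothetical workflow: each atomic task has candidate services with (cost, time) QoS values.
random.seed(1)
workflow = [
    [(random.uniform(1, 10), random.uniform(1, 10)) for _ in range(5)]   # 5 candidates per task
    for _ in range(4)                                                    # 4 atomic tasks
]

def qos(composition):
    """Aggregate QoS of one composite service: total cost and total time (sequential flow)."""
    cost = sum(workflow[task][idx][0] for task, idx in enumerate(composition))
    time_ = sum(workflow[task][idx][1] for task, idx in enumerate(composition))
    return cost, time_

def dominates(a, b):
    """a dominates b if it is no worse in both objectives and different in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

# Random candidate compositions; NSGA-II would evolve these with selection/crossover/mutation.
population = [tuple(random.randrange(len(task)) for task in workflow) for _ in range(50)]
evaluated = [(comp, qos(comp)) for comp in population]

pareto_front = [
    (comp, obj) for comp, obj in evaluated
    if not any(dominates(other, obj) for _, other in evaluated)
]
for comp, (cost, time_) in sorted(pareto_front, key=lambda x: x[1]):
    print(f"composition={comp} cost={cost:.2f} time={time_:.2f}")
```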
Optimization of mitigation deployment using deep reinforcement learning over an enhanced ATT&CK
Pub Date: 2024-09-06 | DOI: 10.1007/s00607-024-01344-4
Yingze Liu, Yuanbo Guo, Rajiv Ranjan, Dan Chen
This study introduces a Deep Reinforcement Learning approach (DRL-MD) aimed at optimizing the deployment of mitigations to minimize redundancy while ensuring effective defense against cyberattacks. DRL-MD initially enhances ATT&CK (Adversarial Tactics, Techniques, and Common Knowledge) to underscore the formal relationships between attacks and defenses. Over the enhanced ATT&CK, DRL-MD then operates in two phases: (1) Estimating Node Importance: DRL-MD proposes a model to estimate the importance of deployed nodes in the network, prioritizing mitigation deployment locations for better evaluation of mitigation effectiveness; and (2) Optimizing Mitigation Deployment: a Soft Actor-Critic algorithm finds the optimal mitigation deployment policy through multi-objective optimization of the importance of deployed nodes, the effectiveness of mitigations in preventing cyberattacks, vulnerability repair, and deployment cost. A case study comparing DRL-MD against state-of-the-art counterparts has been performed on the WannaCry threat, and the results indicate that: (1) DRL-MD performs best, with a 6.4–11% decrease in deployment cost; and (2) DRL-MD can significantly reduce redundancy in mitigation deployments, which partially benefits from the enhanced ATT&CK model. Overall, a comprehensive mitigation deployment solution has been developed that significantly lowers redundancy while sustaining more effective defenses against cyberattacks.
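Neither the SAC agent nor the enhanced ATT&CK graph can be reproduced from the abstract; the sketch below gives one plausible reading of the multi-objective signal being optimized — a weighted score over node importance, mitigation effectiveness, vulnerability repair, and deployment cost — with a greedy budgeted baseline in place of the learned policy. The weights, fields, and node names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    """One candidate placement of a mitigation on a network node (all values hypothetical)."""
    node: str
    node_importance: float   # estimated importance of the node (phase 1 of DRL-MD)
    effectiveness: float     # how well the mitigation blocks the mapped attack techniques
    repair: float            # fraction of exploitable vulnerabilities it repairs
    cost: float              # deployment cost

def reward(d: Deployment, w=(0.3, 0.3, 0.2, 0.2)) -> float:
    """Hypothetical scalarization of the multi-objective signal an RL agent would maximize.
    Cost enters negatively; the weights are illustrative, not the paper's."""
    return (w[0] * d.node_importance + w[1] * d.effectiveness
            + w[2] * d.repair - w[3] * d.cost)

candidates = [
    Deployment("file-server", 0.90, 0.8, 0.6, 0.7),
    Deployment("workstation-12", 0.40, 0.8, 0.6, 0.3),
    Deployment("domain-controller", 0.95, 0.6, 0.4, 0.9),
]

# Greedy baseline: pick deployments in decreasing reward until a budget is exhausted.
budget, spent, chosen = 1.2, 0.0, []
for d in sorted(candidates, key=reward, reverse=True):
    if spent + d.cost <= budget:
        chosen.append(d.node)
        spent += d.cost
print("greedy deployment under budget:", chosen)
```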
Robust evaluation of GPU compute instances for HPC and AI in the cloud: a TOPSIS approach with sensitivity, bootstrapping, and non-parametric analysis
Pub Date: 2024-09-06 | DOI: 10.1007/s00607-024-01342-6
Mandeep Kumar, Gagandeep Kaur, Prashant Singh Rana
Evaluating GPU compute instances for High Performance Computing (HPC) and Artificial Intelligence (AI) applications in the cloud involves complex decision-making processes. This research applies the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to rank various GPU compute instances for HPC and AI from leading cloud providers: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and Oracle Cloud Infrastructure (OCI). The analysis incorporates a sensitivity examination, bootstrapping, and non-parametric tests to ensure robust and reliable rankings. Sensitivity analysis reveals the stability of the TOPSIS framework despite variations in criteria weights, while bootstrap analysis provides confidence intervals for the rankings, highlighting their consistency. The Friedman test confirms that ranking stability persists across different scenarios, indicating minimal impact from weight adjustments. Despite these insights, limitations such as interdependencies among criteria, data accuracy, and generalizability constraints must be acknowledged. This comprehensive approach ensures informed decision-making for selecting optimal GPU instances for cloud-based HPC and AI tasks.
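TOPSIS is a standard procedure, so a compact NumPy version is easy to show; the decision matrix, criteria, and weights below are invented for illustration (the instance names are deliberately generic) rather than taken from the paper.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives with TOPSIS.
    matrix: alternatives x criteria; benefit[j] is True if higher is better for criterion j."""
    m = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float) / np.sum(weights)
    # 1. Vector-normalize each criterion column, then apply the weights.
    v = (m / np.linalg.norm(m, axis=0)) * w
    # 2. Ideal and anti-ideal points per criterion.
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    # 3. Closeness coefficient: distance to the anti-ideal over total distance.
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    return d_neg / (d_pos + d_neg)

# Hypothetical GPU instances scored on TFLOPS, GPU memory (GB), and hourly price (USD).
instances = ["aws-gpu-like", "azure-gpu-like", "gcp-gpu-like", "oci-gpu-like"]
matrix = [[312, 320, 32.8],
          [312, 640, 27.2],
          [156, 320, 29.4],
          [312, 320, 25.0]]
weights = [0.4, 0.3, 0.3]
benefit = [True, True, False]   # price is a cost criterion

scores = topsis(matrix, weights, benefit)
for name, s in sorted(zip(instances, scores), key=lambda x: -x[1]):
    print(f"{name}: closeness={s:.3f}")
```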
Using a random forest to predict quantized reuse distance in an SSD write buffer
Pub Date: 2024-09-05 | DOI: 10.1007/s00607-024-01343-5
Hyejin Cha, In Kee Kim, Taeseok Kim
Efficient management of the write buffer in solid-state drives (SSDs) can be achieved by predicting future I/O request patterns using machine learning techniques. However, the computational demands posed by sophisticated approaches like deep learning remain significant, despite the increasing computational power of SSDs. This paper presents a novel approach to write buffer management that addresses these challenges. Our method employs a lightweight yet accurate random forest classifier to predict the forward reuse distances (FRDs) of I/O requests, indicating the likelihood of recurring identical I/O requests. Our key insight is that, rather than aiming for exact FRD predictions for future individual requests, we focus on identifying whether the predicted FRD exceeds the buffer size. With this insight, our method implements efficient buffer management operations, including bypassing the buffer storage when necessary. To achieve this, we introduce a banding method that quantizes FRDs according to the buffer size. This enables predictions at the band level, forming the foundation for a lightweight machine learning model. Subsequently, we assign high caching priority to write requests that are anticipated to have a short FRD band. Through extensive evaluations utilizing a simulator, we demonstrate that our method achieves results comparable to those of the optimal algorithm in terms of hit rate in most scenarios. Moreover, our approach outperforms state-of-the-art algorithms, which depend on past I/O reference patterns, by up to 27%.
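The trace features and simulator used in the paper are not available here; the sketch below only demonstrates the banding idea: compute each write's forward reuse distance (FRD) on a synthetic block trace, quantize it relative to the buffer size, and train a scikit-learn RandomForestClassifier to predict the band. The features, trace model, and band count are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

BUFFER_SIZE = 256          # buffer capacity in requests (assumed unit)
N_BANDS = 4                # band 0 = reused soonest, last band = FRD beyond the buffer

rng = np.random.default_rng(7)
n = 20000
# Synthetic trace: a small hot set of logical block addresses plus a cold uniform tail.
lba = np.where(rng.random(n) < 0.7, rng.integers(0, 64, n), rng.integers(0, 4096, n))

# Forward reuse distance: number of requests until the same LBA is written again.
frd = np.full(n, np.inf)
next_pos = {}
for i in range(n - 1, -1, -1):
    if lba[i] in next_pos:
        frd[i] = next_pos[lba[i]] - i
    next_pos[lba[i]] = i

def to_band(d):
    """Quantize the FRD relative to the buffer size; the last band means 'will not fit'."""
    if d >= BUFFER_SIZE:
        return N_BANDS - 1
    return int(d * (N_BANDS - 1) // BUFFER_SIZE)

bands = np.array([to_band(d) for d in frd])

# Simple per-request features (assumed): the LBA and its recent write frequency.
window = 512
freq = np.array([np.sum(lba[max(0, i - window):i] == lba[i]) for i in range(n)])
X = np.column_stack([lba, freq])

X_tr, X_te, y_tr, y_te = train_test_split(X, bands, test_size=0.3, shuffle=False)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
print("band prediction accuracy:", clf.score(X_te, y_te))
# Writes predicted to land in the last band could bypass the write buffer entirely.
```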
A demand forecasting system of product categories defined by their time series using a hybrid approach of ensemble learning with feature engineering
Pub Date: 2024-09-02 | DOI: 10.1007/s00607-024-01320-y
Santiago Mejía, Jose Aguilar
Retail companies face major problems in the estimation of their products’ future demand due to the high diversity of sales behavior that each good presents. Different forecasting models are implemented to meet the demand requirements for efficient inventory management. However, in most of the proposed works, a single-model approach is applied to forecast all products, ignoring that some methods are better adapted to certain features of the demand time series of each product. The proposed forecasting system addresses this problem by implementing a two-phase methodology: the first phase clusters the products using an unsupervised learning approach applied to the demand features extracted for each good; the second phase, after a feature engineering process, evaluates a set of different forecasting methods to identify those that perform best for each cluster. Finally, ensemble machine learning models are implemented using the top-performing models of each cluster to carry out the demand estimation. The results indicate that the proposed forecasting system improves the demand estimation over the single forecasting approaches when evaluated on the R², MSE, and MASE quality measures.
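The abstract does not fix the feature set, clustering method, or candidate forecasters; the sketch below is a minimal two-phase illustration using assumed demand features (level, coefficient of variation, trend), KMeans clustering, and two toy forecasters evaluated per cluster on a holdout tail.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Synthetic weekly demand series for 60 products (stand-in for retail sales history).
products = [np.maximum(0, rng.normal(loc=rng.uniform(5, 50), scale=rng.uniform(1, 15), size=104))
            for _ in range(60)]

def demand_features(series):
    """Assumed phase-1 features: level, variability, and linear trend of each series."""
    t = np.arange(len(series))
    slope = np.polyfit(t, series, 1)[0]
    return [series.mean(), series.std() / (series.mean() + 1e-9), slope]

features = np.array([demand_features(s) for s in products])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)

# Phase 2 (simplified): per cluster, pick the candidate forecaster with the lowest MSE on a
# holdout tail; the paper evaluates a richer candidate set and then ensembles the best models.
def naive(train, horizon):
    return np.repeat(train[-1], horizon)

def moving_average(train, horizon):
    return np.repeat(train[-8:].mean(), horizon)

candidates = {"naive": naive, "moving_average": moving_average}
horizon = 13
for c in range(3):
    members = [s for s, l in zip(products, labels) if l == c]
    errors = {name: np.mean([np.mean((s[-horizon:] - f(s[:-horizon], horizon)) ** 2)
                             for s in members])
              for name, f in candidates.items()}
    best = min(errors, key=errors.get)
    print(f"cluster {c}: {len(members)} products, best candidate = {best}")
```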
Hybrid deep learning and evolutionary algorithms for accurate cloud workload prediction
Pub Date: 2024-08-25 | DOI: 10.1007/s00607-024-01340-8
Tassawar Ali, Hikmat Ullah Khan, Fawaz Khaled Alarfaj, Mohammed AlReshoodi
Cloud computing offers demand-based allocation of required resources to its clients, ensuring optimal use of resources in a cost-effective manner. However, due to the massive increase in demand for physical resources by datacenters, cloud management suffers from inefficient resource management. To enhance efficiency by reducing resource setup time, workload prediction has become an active research area. It helps to make management decisions proactively and enables the cloud management system to better respond to spikes in the workload. This study proposes a hybrid model combining state-of-the-art deep learning models and evolutionary algorithms for workload prediction. The proposed cluster-based differential evolution neural network model utilizes differential evolution to optimize the feature weights of the deep neural network and predict the future workloads of a cloud datacenter. The proposed model uses a novel mutation strategy that clusters the population with an agglomerative technique and chooses the best gene from randomly chosen clusters. Thus, the strategy creates a balance between exploration and exploitation of the population and enables the model to avoid local optima and converge rapidly. The datasets used for the experiments are created from Google’s real-world traces and the Alibaba platform. The model is compared with backpropagation, an Adam optimizer-based LSTM, and an evolutionary neural network-based three-mutation policy. We evaluated the performance of the proposed model in terms of root mean squared error in predicting upcoming CPU, RAM, and bandwidth (BW) usage. The proposed model achieved an error as low as 0.0002, outperforming the existing studies in the relevant literature. To further validate the results, we performed statistical analysis of the obtained results in terms of R-squared, mean bias deviation, 90th percentile score, and Theil’s U statistic. The high accuracy and automaticity of the proposed model pave the way for its application in diverse areas of cloud computing, including real-time applications.
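The deep network and the Google/Alibaba traces are out of scope for a short example; the sketch below shows one reading of the described mutation strategy — agglomeratively cluster the differential evolution population, pick a random cluster, and mutate around its best member — applied here to a toy objective rather than the network's feature weights. All parameter values are assumptions.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def sphere(x):
    """Toy objective standing in for the forecasting error of the weighted network."""
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
dim, pop_size, n_clusters, F, CR = 10, 30, 4, 0.6, 0.9
pop = rng.uniform(-5, 5, size=(pop_size, dim))
fitness = np.array([sphere(ind) for ind in pop])

for generation in range(200):
    # Cluster the current population (agglomerative, as in the described strategy).
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(pop)
    for i in range(pop_size):
        # Mutation base: the best individual of a randomly chosen cluster.
        cluster = rng.integers(n_clusters)
        members = np.where(labels == cluster)[0]
        base = pop[members[np.argmin(fitness[members])]]
        # Standard DE difference term from two distinct random individuals.
        r1, r2 = rng.choice(pop_size, size=2, replace=False)
        mutant = base + F * (pop[r1] - pop[r2])
        # Binomial crossover and greedy selection.
        cross = rng.random(dim) < CR
        cross[rng.integers(dim)] = True
        trial = np.where(cross, mutant, pop[i])
        f_trial = sphere(trial)
        if f_trial < fitness[i]:
            pop[i], fitness[i] = trial, f_trial

print("best objective value after 200 generations:", fitness.min())
```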
Large language models: a new approach for privacy policy analysis at scale
Pub Date: 2024-08-22 | DOI: 10.1007/s00607-024-01331-9
David Rodriguez, Ian Yang, Jose M. Del Alamo, Norman Sadeh
The number and dynamic nature of web sites and mobile applications present regulators and app store operators with significant challenges when it comes to enforcing compliance with applicable privacy and data protection laws. Over the past several years, people have turned to Natural Language Processing (NLP) techniques to automate privacy compliance analysis (e.g., comparing statements in privacy policies with analysis of the code and behavior of mobile apps) and to answer people’s privacy questions. Traditionally, these NLP techniques have relied on labor-intensive and potentially error-prone manual annotation processes to build the corpora necessary to train them. This article explores and evaluates the use of Large Language Models (LLMs) as an alternative for effectively and efficiently identifying and categorizing a variety of data practice disclosures found in the text of privacy policies. Specifically, we report on the performance of ChatGPT and Llama 2, two particularly popular LLM-based tools. This includes engineering prompts and evaluating different configurations of these LLM techniques. Evaluation of the resulting techniques on well-known corpora of privacy policy annotations yields an F1 score exceeding 93%. This score is higher than scores reported earlier in the literature on these benchmarks. This performance is obtained at minimal marginal cost (excluding the cost required to train the foundational models themselves). These results, which are consistent with those reported in other domains, suggest that LLMs offer a particularly promising approach to automated privacy policy analysis at scale.
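The exact prompts and model configurations are not given in the abstract; the sketch below shows one generic way to pose the classification task, assuming the v1-style openai Python client, a placeholder model name, and a label set loosely modeled on common privacy-policy annotation categories. All of these are assumptions rather than the authors' setup.

```python
import json

from openai import OpenAI  # assumes the v1-style openai package and an API key in the environment

# Hypothetical label set; real corpora such as OPP-115 define their own category scheme.
CATEGORIES = ["First Party Collection/Use", "Third Party Sharing/Collection",
              "Data Retention", "Data Security", "User Choice/Control", "Other"]

PROMPT_TEMPLATE = (
    "You are annotating privacy policies. Classify the following policy segment into one "
    "of these data practice categories: {categories}.\n"
    "Respond with JSON of the form {{\"category\": ..., \"explanation\": ...}}.\n\n"
    "Segment:\n{segment}"
)

def classify_segment(client: OpenAI, segment: str, model: str = "gpt-4o-mini") -> dict:
    """Send one policy segment to the LLM and parse its JSON answer (model name is an assumption)."""
    prompt = PROMPT_TEMPLATE.format(categories=", ".join(CATEGORIES), segment=segment)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # A production pipeline would validate the output against the label set before scoring F1.
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    client = OpenAI()
    segment = ("We may share your device identifiers with advertising partners "
               "to deliver personalized ads.")
    print(classify_segment(client, segment))
```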