2022 IEEE 7th International Conference on Smart Cloud (SmartCloud): Latest Papers

GAN-based Abnormal Transaction Detection in Bitcoin
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00031
Xiaoqi Zhang, Guangsong Li, Yongjuan Wang
Since its inception, blockchain technology has attracted great attention from industry and academia. With its development, blockchain-based cryptocurrencies such as Bitcoin have gradually emerged and entered the financial field. Meanwhile, malicious behavior targeting Bitcoin has become increasingly common, causing heavy losses to cryptocurrency users and harming the evolution of blockchain technology, which has prompted researchers to build various models to address the problem. In this paper, we collected a historical Bitcoin transaction dataset and extracted features from it. After standardizing the features, we applied an unsupervised learning model based on Generative Adversarial Networks (GAN) to a dataset containing more than 30 million normal samples and 108 malicious samples, reaching a precision of 23% and a recall close to 100%.
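To make the reported figures concrete, the short calculation below works out what a precision of 23% and a recall close to 100% would imply in absolute counts. It assumes, purely for illustration, that recall is exactly 100%, that there are exactly 30 million normal samples, and that both metrics were measured over the whole dataset; the abstract states none of these precisely.

```python
# Worked arithmetic for the figures quoted in the abstract.
# Assumptions (not stated in the abstract): recall is exactly 1.0,
# there are exactly 30,000,000 normal samples, and both metrics
# were computed over the whole dataset.

malicious = 108          # malicious samples, from the abstract
normal = 30_000_000      # "more than 30 million", rounded down here
precision = 0.23         # from the abstract
recall = 1.0             # "close to 100%", taken as exact here

true_positives = round(recall * malicious)
# precision = TP / (TP + FP), so the number of flagged items is TP / precision
flagged = round(true_positives / precision)
false_positives = flagged - true_positives
false_positive_rate = false_positives / normal

print(f"flagged transactions: {flagged}")
print(f"false positives:      {false_positives}")
print(f"false positive rate:  {false_positive_rate:.2e}")
```

Under these assumptions about 470 transactions would be flagged, roughly 362 of them false alarms, which is a very small share of the normal samples even though the precision looks low.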
Citations: 0
Accelerating Sample-based GNN Training by Feature Caching on GPUs
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00032
Yuqi He, Zhiquan Lai, Zhejiang Ran, Lizhi Zhang, Dongsheng Li
Existing graph neural network (GNN) systems adopt sample-based training on large-scale graphs over multiple GPUs. Although they support large-scale graph training, the heavy data-loading overhead remains a bottleneck. In this work, we propose SCGraph, a method that supports high-speed feature caching on GPUs. We classify graph vertices by out-degree. For high out-degree vertices, we set up graded caches across different GPUs, increasing the overall cache content through high-speed NVLink data transmission between them. For low out-degree vertices, we expand the training vertices’ neighborhoods in advance to regenerate the cache. We evaluate SCGraph against two state-of-the-art industrial GNN frameworks, DGL and PaGraph, on two datasets, Reddit and ogbn-products. Experimental results show that SCGraph achieves up to a 1.83× speedup over these baselines.
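The idea of ranking vertices by out-degree and spreading the hottest ones over several GPU caches can be sketched as follows. This is an illustrative toy, not SCGraph's actual policy: the function name `plan_feature_cache`, the `slots_per_gpu` parameter, and the round-robin assignment are all assumptions made here.

```python
from typing import Dict, List, Tuple


def plan_feature_cache(
    out_degree: Dict[int, int],
    num_gpus: int,
    slots_per_gpu: int,
) -> Tuple[List[List[int]], List[int]]:
    """Split vertices into per-GPU cache lists and an uncached remainder.

    Hypothetical policy: rank vertices by out-degree (highest first) and
    deal the top ones round-robin across GPUs, so that no vertex is cached
    twice and the combined cache covers num_gpus * slots_per_gpu vertices.
    """
    ranked = sorted(out_degree, key=lambda v: (-out_degree[v], v))
    capacity = num_gpus * slots_per_gpu
    hot, cold = ranked[:capacity], ranked[capacity:]

    caches: List[List[int]] = [[] for _ in range(num_gpus)]
    for rank, vertex in enumerate(hot):
        caches[rank % num_gpus].append(vertex)
    return caches, cold


# Vertex id -> out-degree for a tiny made-up graph.
degrees = {0: 50, 1: 3, 2: 41, 3: 7, 4: 30, 5: 1, 6: 12}
caches, cold = plan_feature_cache(degrees, num_gpus=2, slots_per_gpu=2)
print(caches)  # per-GPU lists of cached vertex ids
print(cold)    # vertices whose features stay in host memory
```

Because each GPU caches a different slice, the combined cache holds twice as many distinct vertices as a replicated one would; a GPU that misses locally would fetch from its peer, which is the step the abstract attributes to NVLink.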
Citations: 0
Power Load Curve Clustering based on ISODATA
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00022
Zhu Li, Xia Yu
Load clustering underpins power grid system planning, load modeling, demand-side management, load forecasting, and related work. The traditional load classification method based on user types cannot meet the needs of power grid services. The Iterative Self-Organizing Data Analysis Algorithm (ISODATA) is an unsupervised dynamic clustering algorithm based on statistical pattern recognition. Because existing algorithms struggle to choose the initial number of clusters and easily fall into local optima, this paper introduces the principle and implementation steps of ISODATA and applies the algorithm to power load curve clustering. The clustering analysis is carried out on specific power load curve samples, and ISODATA is compared with a traditional clustering method in terms of clustering quality and running time. The results indicate better clustering quality and a larger improvement in running time. ISODATA-based clustering of power load curves can finely distinguish users and provide decision support and a scientific basis for the reliable operation of the power system.
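The feature that lets ISODATA adjust the number of clusters on its own is its discard, split, and merge steps wrapped around a k-means style update. The sketch below is a deliberately simplified version for illustration, not the authors' implementation: the thresholds `min_size`, `split_std`, and `merge_dist` are hypothetical defaults, and merged centers are averaged without the size weighting that fuller descriptions of the algorithm use.

```python
import math
from typing import List

Point = List[float]


def _dist(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(points: List[Point]) -> Point:
    return [sum(col) / len(points) for col in zip(*points)]


def isodata(points, centers, min_size=2, split_std=1.0, merge_dist=1.0, iters=10):
    """Simplified ISODATA: k-means updates plus discard, split and merge steps."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        # 1. Assign every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: _dist(p, centers[i]))
            groups[nearest].append(p)
        # 2. Discard clusters that are too small.
        groups = [g for g in groups if len(g) >= min_size]
        if not groups:
            break
        new_centers = []
        for g in groups:
            c = _mean(g)
            # 3. Split a cluster along its widest dimension if it is too spread out.
            stds = [
                math.sqrt(sum((p[d] - c[d]) ** 2 for p in g) / len(g))
                for d in range(len(c))
            ]
            widest = max(range(len(c)), key=lambda d: stds[d])
            if stds[widest] > split_std and len(g) >= 2 * min_size:
                low, high = list(c), list(c)
                low[widest] -= stds[widest]
                high[widest] += stds[widest]
                new_centers += [low, high]
            else:
                new_centers.append(c)
        # 4. Merge centers that ended up closer than merge_dist.
        merged = []
        for c in new_centers:
            for i, m in enumerate(merged):
                if _dist(c, m) < merge_dist:
                    merged[i] = _mean([c, m])
                    break
            else:
                merged.append(c)
        if merged == centers:
            break  # centers stopped changing
        centers = merged
    return centers


# Two obvious groups of one-dimensional "loads", starting from a single center.
samples = [[0.0], [0.2], [0.4], [9.6], [9.8], [10.0]]
result = isodata(samples, centers=[[5.0]])
print(result)
```

Starting from one center, the split step separates the two groups without the caller having to specify the number of clusters in advance, which is the property the abstract contrasts with methods that need a fixed initial cluster count.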
Citations: 1
A Semantic Segmentation Algorithm for Distributed Energy Data Storage Optimization based on Neural Networks
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00024
Dong Mao, Zhongxu Li, Zuge Chen, Hanyu Rao, Jiuding Zhang, Zehan Liu
Energy data come in many kinds, and realizing unified storage, processing, and sharing of such data is a major challenge. As the national energy data center, State Grid aims to build a database that can store distributed, heterogeneous, asynchronous energy data. Storing image files in the large energy database takes up a lot of space, but not all parts of an image are needed. It is therefore necessary to accurately segment the effective area of each image and store only that area, thereby compressing the data. This paper proposes the Attention U-Net framework, which combines the traditional semantic segmentation network U-Net with an attention module to focus on the region of interest in the image, emphasize foreground information, and suppress background information. The results show that, compared with U-Net, accuracy improves by 1.77%, and after segmentation each image saves an average of 2 MB of storage space.
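How a segmentation mask could translate into storage savings can be sketched as follows. The abstract does not say how the segmented region is actually stored, so cropping to the mask's bounding box is an assumption made here, and `crop_to_mask` is a hypothetical helper; the mask itself would come from the segmentation network.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def crop_to_mask(image: Image, mask: Image) -> Optional[Tuple[Tuple[int, int], Image]]:
    """Return ((top, left), cropped image) covering every mask pixel set to 1."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None  # nothing was segmented as foreground
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    cropped = [row[left:right + 1] for row in image[top:bottom + 1]]
    return (top, left), cropped


# A 4x5 toy image whose useful content is the 2x2 block in the middle.
image = [
    [9, 9, 9, 9, 9],
    [9, 1, 2, 9, 9],
    [9, 3, 4, 9, 9],
    [9, 9, 9, 9, 9],
]
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
origin, cropped = crop_to_mask(image, mask)
kept = sum(len(row) for row in cropped)
total = sum(len(row) for row in image)
print(origin, cropped, f"{kept}/{total} pixels kept")
```

Keeping the crop origin alongside the cropped pixels allows the region to be placed back in its original position later if needed.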
Citations: 0
Performance Impacts of JavaScript-Based Encryption of HTML5 Web Storage for Enhanced Privacy
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00037
Michael S. MacFadden, Meikang Qiu
The HTML5 Web Storage API provides the ability for web applications to store data on client machines. This storage is commonly used for caching, local state tracking, and offline support that allows web applications to work when the web server cannot be contacted. HTML5 Web Storage is becoming increasingly popular, with the majority of new web applications using at least some features provided by this standard. Unfortunately, the local storage provided by HTML5 Web Storage is not entirely secure and does not sufficiently ensure the confidentiality of the user’s data. Encrypting data prior to storage is a common approach to protecting local user data. However, as browser-based applications become more complex and demanding, data encryption may adversely affect application performance. Furthermore, the average web developer is generally not proficient in cryptographic best practices for web applications. First, we provide a simple design approach for encryption of local storage that supports offline web applications. Second, we analyze the impact of various symmetric encryption algorithms and implementations on the performance of the HTML Web Storage API. We show that there are several viable options that will increase the confidentiality and privacy of user data within local storage without imposing significant performance penalties.
Citations: 0
Federated-Learning-based Hierarchical Diagnosis of Liver Fibrosis
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00023
Yueying Zhou, Xinping Ren, Xiaoying Zheng, Yongxin Zhu, Kang Xu, Shijin Song, Li Tian
Hepatic fibrosis is an important prognostic factor, as severe liver fibrosis may lead to liver cancer or even death. To grade liver fibrosis, ultrasound gray-scale images and ultrasound elastic images are commonly used in clinical diagnosis to judge its severity. However, these two diagnostic methods are often vulnerable to disturbances such as personal experience or instrument differences. Moreover, these individual differences usually lead to conflicting stand-alone machine learning diagnosis models at each hospital, whose medical data may not be shared publicly for privacy reasons. To handle the conflicts among diagnosis models, we propose a federated-learning-based hierarchical diagnosis method for liver fibrosis that uses shear wave elasticity pictures from multiple users across hospitals without sharing the original data. Our method is validated with authentic shear wave elasticity pictures of hepatic fibrosis patients in Shanghai, China. Experimental results show that our method is able to preprocess these shear wave elasticity pictures, train local diagnosis models at each hospital, and securely consolidate them into a shared global diagnosis model whose accuracy is over 70% with only a small dataset containing a few hundred labeled pictures. The accuracy of our method is expected to improve further with more training samples. Our method would be the first application of federated learning to liver fibrosis diagnosis.
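The step of consolidating local hospital models into a shared global model can be illustrated with plain federated averaging. The abstract does not name the aggregation rule or the security mechanism used, so sample-weighted averaging (FedAvg) serves here only as a common stand-in, with no secure aggregation shown; the parameter names and sample counts are made up.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average per-hospital parameters, weighted by local sample count."""
    total = sum(n for _, n in updates)
    reference = updates[0][0]
    return {
        name: [
            sum(w[name][i] * n for w, n in updates) / total
            for i in range(len(reference[name]))
        ]
        for name in reference
    }


# Each hospital sends only its trained parameters and its sample count,
# never the underlying images.
hospital_a = ({"layer1": [0.0, 2.0], "bias": [1.0]}, 100)
hospital_b = ({"layer1": [4.0, 2.0], "bias": [3.0]}, 300)
global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)
```

The hospital with three times as many samples pulls the global parameters three times as hard, which is the usual way to keep a small site from dominating the shared model.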
Citations: 0
Prediction of User Electricity Consumption based on Adaptive K-Means Algorithm
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00026
Li Zhu, Bin Liu
When predicting the total power load of many users, computing resources often cannot keep up with the growth of the total amount of data, and it is difficult to analyze the data effectively in a real environment. This paper first clusters users, then predicts each cluster separately, and finally sums the results of all clusters to obtain the overall forecast. Specifically, it performs PCA dimensionality reduction on user data, uses an adaptive K-Means clustering method to determine the number of clusters and the initial cluster centers, clusters the users with the determined parameters, builds a model for the users of each cluster, and sums the forecasts to obtain the total power load. To illustrate the effect of this method under different models, the paper builds XGBoost, CatBoost, and LightGBM models and predicts the total power load of all users. The experimental results show that the method is consistent with the actual data trend and predicts better than directly modeling all user data.
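The cluster-then-forecast structure described above can be sketched in a few lines. Everything specific here is an assumption for illustration: cluster labels are taken as given rather than produced by PCA and adaptive K-Means, each cluster is modeled on its aggregate load, and a moving average stands in for the gradient-boosted models named in the abstract.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Series = List[float]


def moving_average_forecast(history: Series, window: int = 3) -> float:
    """Placeholder model: predict the mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def forecast_total_load(
    user_loads: Dict[str, Series],
    cluster_of: Dict[str, int],
    model: Callable[[Series], float] = moving_average_forecast,
) -> float:
    """Forecast each cluster's aggregate load, then sum the cluster forecasts."""
    members = defaultdict(list)
    for user, series in user_loads.items():
        members[cluster_of[user]].append(series)
    total = 0.0
    for series_list in members.values():
        # Aggregate the cluster: add its members' loads time step by time step.
        cluster_series = [sum(step) for step in zip(*series_list)]
        total += model(cluster_series)
    return total


loads = {
    "u1": [1.0, 2.0, 3.0],
    "u2": [2.0, 2.0, 2.0],
    "u3": [10.0, 20.0, 30.0],
}
clusters = {"u1": 0, "u2": 0, "u3": 1}
total_forecast = forecast_total_load(loads, clusters)
print(total_forecast)
```

With this linear placeholder the clustered forecast and a direct forecast of the summed load coincide; any benefit would come from fitting a separate nonlinear model to each cluster of similar users.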
Citations: 0
TDH: An Efficient One-stop Enterprise-level Big Data Platform
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00042
Yuanhao Sun, Cheng Lv, Xi Liu, Tianyang Lei, Zhuoyi Guo, Ning Li, Hongshan Yang
Big data technology is rapidly changing the IT industry. Among these technologies, Hadoop is the best known and keeps growing in popularity. Transwarp Data Hub (TDH) is an enterprise-level big data platform developed by Transwarp Technology (Shanghai) Co., Ltd. It has developed rapidly over the last five years and has gained experience from deployment and implementation in fields such as postal service, transportation, and finance. Moreover, the company has been exploring emerging big data technologies. Transwarp Data Hub provides five major products: Analytical Database (Transwarp Inceptor and Transwarp ArgoDB), Real-time Streaming Engine (Transwarp Slipstream), Knowledge Database (Transwarp Search and Transwarp StellarDB), Operational Database (Transwarp Hyperbase), and Data Science Platform (Transwarp Discover). Enterprises can leverage data to build core business systems more effectively and accelerate business innovation by deploying, installing, and using TDH.
Citations: 0
Research on 3D Product Service System Based on Spherical Model
Pub Date : 2022-10-01 DOI: 10.1109/SmartCloud55982.2022.00027
Shufeng He, Dianqi Sun
This paper focuses on the need of various applications for intuitive, three-dimensional, and convenient marine geological data services. Building on previously accumulated 3D seabed visual modeling technology, it implements the processing of unevenly sampled columnar geological data, rapid optimization of gravity and magnetic data, extraction of key features of marine data fields, and optimization of visual display. Together these quickly and intuitively meet the service requirements for marine geological and geophysical data products and support related data analysis and simulation.
Citations: 0
Detecting and Classifying Incoming Traffic in a Secure Cloud Computing Environment Using Machine Learning and Deep Learning System
Pub Date : 2022-10-01 DOI: 10.1109/smartcloud55982.2022.00010
Geetika Tiwari, Ruchi Jain
Cloud computing has been promoted as one of the most effective ways to host and deliver services over the internet. Despite its broad range of applications, security remains a serious concern. Many security solutions have been developed to safeguard communication in such environments, the majority of which are based on attack signatures; these systems are often ineffective at detecting all forms of threats. Machine learning approaches have recently been proposed, but they depend on the training data: if the training set lacks sufficient instances of a specific class, the classification may be incorrect. In this research, we present a novel firewall mechanism for secure cloud computing environments, called the machine learning and deep learning system. The proposed method identifies and classifies incoming traffic packets using a novel combination methodology named most frequent decision, in which the nodes' previous decisions are coupled with the machine learning algorithm's current decision to estimate the final attack category. This method improves learning performance as well as system correctness. Our findings are derived from UNSW-NB-15, a publicly accessible dataset. Our data demonstrate that it enhances anomaly detection by 97.68 percent.
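The abstract gives no implementation details for the "most frequent decision" step, so the following is only a minimal sketch of one plausible reading: a majority vote over a node's recent classifier outputs and the current one. The class name `MostFrequentDecision`, the `window` parameter, the tie-breaking rule, and the label strings are all assumptions for illustration, not the authors' method.

```python
from collections import Counter, deque


class MostFrequentDecision:
    """Majority vote over a node's recent predictions and the current one.

    Hypothetical sketch: the window size and tie-breaking rule are assumptions.
    """

    def __init__(self, window=3):
        # Number of previous classifier decisions remembered per node (assumed).
        self.history = deque(maxlen=window)

    def decide(self, current_label):
        # Count the remembered decisions, then add the current prediction.
        votes = Counter(self.history)
        votes[current_label] += 1
        top = max(votes.values())
        if votes[current_label] == top:
            # Ties are broken in favour of the current prediction (assumed).
            final = current_label
        else:
            final = max(votes, key=votes.get)
        # Remember the raw classifier output, not the voted result, so a
        # sustained change in traffic can eventually overturn the vote.
        self.history.append(current_label)
        return final


node = MostFrequentDecision(window=3)
for predicted in ["normal", "normal", "dos", "dos"]:
    print(predicted, "->", node.decide(predicted))
```

In this sketch a single outlier prediction is outvoted by the node's history, while a repeated new label takes over once it ties or exceeds the remembered labels.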
Citations: 0