2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS): Latest Papers

Blockchain based Public Auditing Outsourcing for Cloud Storage
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00066
Yangfei Lin, Jie Li, Shigetomo Kimura, Yongbing Zhang, Yusheng Ji, Yang Yang
Cloud storage services offer flexible, convenient solutions for business and personal users to store data. Traditionally, Third Party Auditors (TPAs) are introduced to ensure data integrity for public auditing. However, TPAs themselves may be untrustworthy: they may forge auditing results or collude with cloud storage servers to deceive users. In this paper, we propose a novel Blockchain-based Public Auditing Outsourcing system without TPAs (BPAO), in which the computationally expensive operations in public auditing are outsourced through blockchain to the cloud servers without risking users' privacy. Our security analysis indicates that BPAO achieves soundness and robustness. The experimental results show that BPAO is computationally efficient for cloud storage users.
Citations: 0
Enabling Auction-based Cross-Blockchain Protocol for Online Anonymous Payment
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00095
Qian Zhang, Sheng Cao, Xiaosong Zhang
Online medical services have developed rapidly in recent years. Cryptocurrencies such as Bitcoin and Ethereum are well suited to online medical payment scenarios that require identity privacy protection, because of their good anonymity and financial payment attributes. However, cryptocurrencies vary widely, so patients urgently need to exchange cryptocurrencies in order to pay different doctors and platforms that accept different currencies. Exchanging cryptocurrencies through centralized exchanges has problems such as high fees and cumbersome operations. Decentralized exchanges mainly focus on cross-blockchain connectivity and ignore the high intermediate fees charged by connectors. To minimize exchange fees, we propose a cross-blockchain connector selection scheme that uses the reverse Vickrey auction along with Interledger. Our scheme abstracts connector node selection into a service provider bidding process, through which we can find the node with the lowest bid, that is, the lowest exchange fee, as the ideal cross-blockchain service provider. Our scheme implements cross-blockchain payment in different cryptocurrencies conveniently, quickly and cheaply, which can give patients better protection of their identity and personal private information. Security analysis and performance evaluations show that our scheme can effectively promote the application of cryptocurrencies in the field of medical care.
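The connector selection described in the abstract above can be illustrated with a minimal sketch of a reverse (procurement) second-price auction. The abstract only states that the lowest bidder is selected; the payment rule shown here is the textbook Vickrey rule (the winner is paid the second-lowest bid) and is an assumption, as are the function name `reverse_vickrey` and the connector names. Interledger itself is not modeled.

```python
from typing import Dict, Tuple


def reverse_vickrey(bids: Dict[str, float]) -> Tuple[str, float]:
    """Pick the connector with the lowest fee bid.

    Assumption: the standard second-price rule applies, so the winner
    is paid the second-lowest bid rather than its own bid.
    """
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    # Sort by bid, then by name so that ties are broken deterministically.
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner = ranked[0][0]
    payment = ranked[1][1]
    return winner, payment


# Hypothetical connectors and fee bids, for illustration only.
fees = {"connector_a": 0.030, "connector_b": 0.012, "connector_c": 0.020}
winner, payment = reverse_vickrey(fees)
print(winner, payment)  # connector_b 0.02
```

Under the second-price rule a connector gains nothing by bidding above its true cost, which is the usual reason for choosing a Vickrey format when the goal is to elicit the lowest real fee.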
Citations: 3
Efficient Asynchronous GCN Training on a GPU Cluster
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00086
Y. Zhang, D. Goswami
Research on Graph Convolutional Networks (GCNs) has increasingly gained popularity in recent years due to the powerful representational capacity of graphs. A common assumption in traditional synchronous parallel training of GCNs using multiple GPUs is that load is perfectly balanced. However, this assumption may not hold in a real-world scenario where there can be imbalances in workloads among GPUs for various reasons. In a synchronous parallel implementation, a straggler in the system can limit the overall speedup of parallel training. To address these performance issues, this research investigates approaches for asynchronous decentralized parallel training of GCNs on a GPU cluster. The techniques investigated are based on graph clustering and the Gossip protocol. The research specifically adapts the approach of Cluster GCN, which uses graph partitioning for SGD-based training, and combines it with a gossip algorithm specifically designed for a GPU cluster to periodically exchange gradients among randomly chosen partners (GPUs). In addition, it incorporates a work pool mechanism for load balancing among GPUs. The gossip algorithm is proven to be deadlock free. The implementation is performed on a deep learning cluster with 8 Tesla V100 GPUs per compute node, with PyTorch and DGL as the software platforms. Experiments are conducted on different benchmark datasets. The results demonstrate superior performance with similar accuracy scores, as compared to traditional synchronous training, which uses "all reduce" to synchronously accumulate parallel training results.
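As a rough illustration of the gossip-style gradient exchange mentioned in the abstract above, the following sketch pairs workers at random and averages their gradient vectors. It is a single-process, synchronous simulation on plain Python lists, not the asynchronous GPU algorithm from the paper; the function name `gossip_round` and the pairing scheme are assumptions made for illustration.

```python
import random
from typing import List


def gossip_round(grads: List[List[float]], rng: random.Random) -> None:
    """Pair workers at random and replace each pair's gradients by their average.

    With an odd number of workers, one worker sits out this round.
    The list is modified in place.
    """
    order = list(range(len(grads)))
    rng.shuffle(order)
    for i in range(0, len(order) - 1, 2):
        a, b = order[i], order[i + 1]
        avg = [(x + y) / 2.0 for x, y in zip(grads[a], grads[b])]
        grads[a] = list(avg)
        grads[b] = list(avg)


# Four simulated workers, each holding a one-element "gradient".
workers = [[0.0], [4.0], [8.0], [12.0]]
rng = random.Random(0)
for _ in range(10):
    gossip_round(workers, rng)
print(workers)
```

Pairwise averaging keeps the global mean unchanged while shrinking the spread between workers, which is why repeated rounds approximate the result of a full all-reduce without a global barrier.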
Citations: 0
Efficient Implementation of Kyber on Mobile Devices
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00069
Lirui Zhao, Jipeng Zhang, Junhao Huang, Zhe Liu, G. Hancke
Kyber, an IND-CCA-secure key encapsulation mechanism (KEM) based on the MLWE problem, has been shortlisted for the third round evaluation of the NIST Post-Quantum Cryptography Standardization. In this paper, we explored optimizations of Kyber on high-performance processors from the ARM Cortex-A series, which are widely used in mainstream mobile phones. To improve the performance of Kyber, we utilized the powerful SIMD instruction set NEON in ARMv8-A to parallelize the core modules of Kyber, i.e., modular reduction and NTT. Specifically, we designed optimized implementations of the Barrett and Montgomery reduction algorithms based on the characteristics of the NEON instruction set. To make full use of the computing power of NEON instructions, we proposed a novel strategy for computing the 16-bit Barrett reduction without handling the 32-bit intermediate result. Our Barrett and Montgomery reductions were 8.52 and 8.89 times faster than the reference implementation. As for NTT/INTT, we adopted the 2+5 layer merging strategy on ARMv8-A after carefully analyzing the register occupancy of various layer merging techniques. Thanks to the selected layer merging strategy, our NTT and INTT achieved 11.89 and 13.45 times speedups compared with the reference implementation. Our optimized software achieved 1.77×, 1.85×, and 2.16× speedups for key generation, encapsulation, and decapsulation compared with Kyber's reference implementation.
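For readers unfamiliar with Barrett reduction, the sketch below shows the scalar form of the technique for the Kyber modulus q = 3329, using a precomputed constant and a 26-bit shift in the style of the public reference C code. This is not the NEON strategy proposed in the paper (which avoids the 32-bit intermediate); it uses that intermediate explicitly and only illustrates what is being optimized. Python integers are unbounded, so the 16-bit wraparound of the C code is not modeled.

```python
KYBER_Q = 3329
# Precomputed rounding of 2^26 / q; evaluates to 20159.
BARRETT_V = ((1 << 26) + KYBER_Q // 2) // KYBER_Q


def barrett_reduce(a: int) -> int:
    """Return the centered representative of a modulo q.

    Intended for inputs in the signed 16-bit range. The quotient
    round(a / q) is estimated with one multiplication and one shift,
    so no division is needed at run time.
    """
    t = (BARRETT_V * a + (1 << 25)) >> 26  # approximately round(a / q)
    return a - t * KYBER_Q


print(barrett_reduce(3329))  # 0
print(barrett_reduce(5000))
```

The point of the technique is that the expensive division by q is replaced by a multiply and a shift, both of which map well onto SIMD lanes.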
Citations: 2
WiBWi: Encoding-based Bidirectional Physical-Layer Cross-Technology Communication between BLE and WiFi
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00050
Yuanhe Shu, Jingwei Wang, L. Kong, Jiadi Yu, Guisong Yang, Yueping Cai, Zhen Wang, M. K. Khan
The boom in mobile technologies and the Internet of Things (IoT) has driven an explosion of wireless devices and brought convenience to people's daily lives. With this explosive growth, the incompatibility of heterogeneous wireless technologies has hindered the growing demand for connecting everything. In addition, spectrum sharing among heterogeneous wireless technologies has led to severe Cross-Technology Interference (CTI), which is a major obstacle to network reliability and spectrum utilization. Research in recent years has shown that Cross-Technology Communication (CTC) is a promising solution for the coexistence of heterogeneous wireless technologies. However, due to the physical-layer incompatibility of WiFi and Bluetooth Low Energy (BLE), research on CTC between these two most widely used wireless technologies remains limited. In this paper, we propose WiBWi, a payload encoding-based bidirectional CTC scheme between BLE and WiFi, which can achieve near-optimal throughput and strong robustness. For uplink, i.e., BLE to WiFi communication, WiBWi leverages a novel extended WiFi preamble detection rule and probabilistic inference-based encoding mapping to achieve fast and reliable communication. For downlink, i.e., WiFi to BLE communication, WiBWi introduces an encoding mapping scheme from the perspective of the BLE receiver, requiring little modification, to achieve high throughput and robustness. Extensive evaluation shows that WiBWi can offer near-optimal throughput (near the maximum throughput of BLE) and an extremely low bit error rate (less than 1%).
Citations: 1
STNN: A Spatial-Temporal Graph Neural Network for Traffic Prediction
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00024
Xueyan Yin, Fei Li, Genze Wu, Pengfei Wang, Yanming Shen, Heng Qi, Baocai Yin
Accurate traffic prediction is of great importance in Intelligent Transportation Systems. This problem is very challenging due to the complex spatial and long-range temporal dependencies. Existing models generally suffer from two limitations: (1) GCN-based methods usually use a fixed Laplacian matrix to model spatial dependencies, without considering their dynamics; (2) RNN and its variants are only capable of modeling limited-range temporal dependencies, resulting in significant information loss. In this paper, we propose a novel spatial-temporal graph neural network (STNN), an end-to-end solution for traffic prediction that simultaneously captures dynamic spatial and long-range temporal dependencies. Specifically, STNN first uses a spatial attention network to model complex and dynamic spatial correlations, without any expensive matrix operations or reliance on predefined road network topologies. Second, a temporal transformer network is utilized to model long-range temporal dependencies across multiple time steps, which considers not only the recent segment, but also the periodic dependencies of historical data. Making full use of historical data can alleviate the difficulty of obtaining real-time data and improve the prediction accuracy. Experiments are conducted on two real-world traffic datasets, and the results verify the effectiveness of the proposed model, especially in long-term traffic prediction.
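To make the idea of attention-based spatial correlation in the abstract above concrete, here is a minimal sketch of scaled dot-product attention between road-sensor feature vectors. It is a generic attention computation, not the STNN architecture: the abstract does not give the exact form of the spatial attention network, so the scoring function, the absence of learned projections, and the name `spatial_attention` are all assumptions.

```python
import math
from typing import List


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention(features: List[List[float]]) -> List[List[float]]:
    """Return an n x n matrix of attention weights between n nodes.

    Entry [i][j] is how strongly node i attends to node j, derived
    only from current node features (no fixed adjacency matrix).
    """
    d = len(features[0])
    scores = [
        [sum(a * b for a, b in zip(fi, fj)) / math.sqrt(d) for fj in features]
        for fi in features
    ]
    return [softmax(row) for row in scores]


# Three hypothetical sensors with two-dimensional feature vectors.
weights = spatial_attention([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
for row in weights:
    print([round(w, 3) for w in row])
```

Because the weights are recomputed from the features at every time step, the implied "graph" changes with traffic conditions, in contrast to a fixed Laplacian.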
Citations: 3
IEdroid: Detecting Malicious Android Network Behavior Using Incremental Ensemble of Ensembles
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00104
Cong Liu, Anli Yan, Zhenxiang Chen, Haibo Zhang, Qiben Yan, Lizhi Peng, Chuan Zhao
Malware detection has attracted widespread attention due to the growing sophistication of malware. Machine learning based methods have been proposed to find traces of malware by analyzing network traffic. However, network traffic exhibits a series of growing and changing states, which makes it challenging to design a detection model that can detect malicious traffic over a long period without the need for costly retraining. In this paper, we present IEdroid, an Android malicious network behavior detection method that leverages incremental ensembles for model update. Specifically, we train multiple classifiers to form an interim ensemble in a distributed cluster environment, and update the interim ensemble by removing and adding classifiers. The generated model is composed of multiple interim ensembles that can adapt to the network traffic. We evaluated the performance of IEdroid using a dataset consisting of 98,565 benign and 41,267 malicious flows. Results show that IEdroid can effectively detect malicious traffic compared with state-of-the-art detection models. In the experiment, IEdroid was trained incrementally on datasets 10 times without a significant loss in accuracy, precision, recall, and F-Measure, compared with re-training from scratch with full data.
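The update step described in the abstract above (removing and adding classifiers in an interim ensemble) can be sketched as follows. The abstract does not say which classifiers are removed or how votes are combined, so pruning by accuracy on recent data and simple majority voting are assumptions, as are all names in the code. Classifiers are modeled as plain functions from a single feature to a label (1 = malicious).

```python
from typing import Callable, List, Sequence, Tuple

Classifier = Callable[[float], int]
Sample = Tuple[float, int]


def accuracy(clf: Classifier, data: Sequence[Sample]) -> float:
    """Fraction of samples in data that clf labels correctly."""
    return sum(1 for x, y in data if clf(x) == y) / len(data)


def update_ensemble(ensemble: List[Classifier], candidate: Classifier,
                    recent: Sequence[Sample], max_size: int) -> List[Classifier]:
    """Add a newly trained classifier, then drop the weakest members.

    Assumption: 'weakest' means lowest accuracy on the most recent data.
    """
    members = ensemble + [candidate]
    while len(members) > max_size:
        worst = min(members, key=lambda c: accuracy(c, recent))
        members.remove(worst)
    return members


def vote(ensemble: Sequence[Classifier], x: float) -> int:
    """Majority vote; ties are resolved as malicious (1)."""
    positives = sum(clf(x) for clf in ensemble)
    return 1 if 2 * positives >= len(ensemble) else 0


# Toy data: one feature per flow, label 1 means malicious.
recent = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
stale = lambda x: int(x < 0.5)      # outdated model, now always wrong
loose = lambda x: int(x > 0.15)
strict = lambda x: int(x > 0.6)
fresh = lambda x: int(x > 0.5)      # newly trained model

ensemble = update_ensemble([stale, loose, strict], fresh, recent, max_size=3)
print([vote(ensemble, x) for x, _ in recent])
```

Only the new classifier is trained on each update, which is what lets this style of model follow traffic drift without retraining from scratch on the full history.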
Citations: 1
AMF-CSR: Adaptive Multi-Row Folding of CSR for SpMV on GPU
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00058
Jianhua Gao, Weixing Ji, Jie Liu, Senhao Shao, Yizhuo Wang, Feng Shi
Sparse matrix-vector multiplication (SpMV) is a cost-dominant operation used in many iterative methods for solving large-scale sparse linear systems. However, the irregular memory access of SpMV to the multiplied vector leads to low data locality and thus hurts performance. This paper presents an adaptive multi-row folding of CSR (AMF-CSR) format for SpMV calculation on GPU. This new storage format supports folding a variable number of rows in order to achieve better load balancing in computation. AMF-CSR not only increases the density of non-zero elements in a folded row, thereby improving the access locality of the multiplied vector, but also merges an approximately equal number of non-zero elements into each folded row, hence achieving load balancing. The performance evaluation using 28 sparse matrices shows that the proposed SpMV algorithm based on AMF-CSR achieves maximum speedups of 4.11x and 3.62x on GTX 1080 Ti and Tesla V100, respectively, against a fixed multi-row folding-based SpMV algorithm. Evaluation results using 450 regular sparse matrices and 450 irregular sparse matrices also show that AMF-CSR is superior to other SpMV implementations.
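As background for the abstract above, the sketch below shows the baseline CSR SpMV loop together with a simple greedy grouping of consecutive rows so that each group holds roughly the same number of non-zeros. The grouping heuristic and the name `group_rows_by_nnz` are assumptions made to illustrate the load-balancing idea; the actual AMF-CSR storage layout and its GPU kernel are not described in the abstract and are not reproduced here.

```python
from typing import List, Tuple


def csr_spmv(row_ptr: List[int], col_idx: List[int],
             vals: List[float], x: List[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[col_idx[k]]  # irregular access into x
        y.append(acc)
    return y


def group_rows_by_nnz(row_ptr: List[int], target_nnz: int) -> List[Tuple[int, int]]:
    """Greedily group consecutive rows into half-open ranges [start, end).

    A group is closed as soon as it holds at least target_nnz non-zeros,
    so groups contain a variable number of rows.
    """
    groups = []
    start = 0
    for r in range(1, len(row_ptr)):
        if row_ptr[r] - row_ptr[start] >= target_nnz:
            groups.append((start, r))
            start = r
    if start < len(row_ptr) - 1:
        groups.append((start, len(row_ptr) - 1))
    return groups


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]

print(csr_spmv(row_ptr, col_idx, vals, [1.0, 2.0, 3.0]))  # [7.0, 6.0, 19.0]
print(group_rows_by_nnz(row_ptr, 2))  # [(0, 1), (1, 3)]
```

In the example, the single-element middle row is grouped with its neighbor so that both groups carry a comparable amount of work, which is the intuition behind folding a variable number of rows.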
Citations: 0
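For readers unfamiliar with the baseline that the AMF-CSR abstract builds on, the following is a minimal sketch of SpMV over plain CSR storage. It is not AMF-CSR and not the authors' GPU kernel: the abstract does not describe the folding procedure, so none is attempted here. The function names `csr_spmv` and `row_lengths` and the small example matrix are assumptions made for illustration only. The sketch shows the two issues the abstract names: the multiplied vector `x` is read at scattered column indices, and rows carry different numbers of non-zeros.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in plain CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is indexed by col_idx[k]: this is the irregular access to the
            # multiplied vector.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def row_lengths(row_ptr):
    """Non-zeros per row; uneven lengths mean uneven work per row."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Illustrative 3x4 matrix (made up for this example):
# [[1, 0, 0, 2],
#  [0, 0, 3, 0],
#  [4, 5, 0, 6]]
ROW_PTR = [0, 2, 3, 6]
COL_IDX = [0, 3, 2, 0, 1, 3]
VALUES = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X = [1.0, 2.0, 3.0, 4.0]

Y = csr_spmv(ROW_PTR, COL_IDX, VALUES, X)
print(Y)                     # [9.0, 9.0, 38.0]
print(row_lengths(ROW_PTR))  # [2, 1, 3]
```

In this toy matrix the rows hold 2, 1, and 3 non-zeros. If each row were handled by one worker, the third row would take three times as long as the second, which is the kind of imbalance that merging rows into groups of roughly equal non-zero count is meant to reduce.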
BSDP: A Novel Balanced Spark Data Partitioner
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00075
Aibo Song, Bowen Peng, Jingyi Qiu, Yingying Xue, Mingyang Du
As a memory-based distributed big data computing framework, Spark has been widely used in big data processing systems. However, during Spark execution, imbalanced input data distribution and the shortcomings of Spark's existing data partitioners easily cause partition skew, which reduces execution efficiency. To address this problem, this paper proposes a balanced Spark data partitioner called BSDP (Balanced Spark Data Partitioner). By analyzing the partitioning characteristics of Shuffle intermediate data in depth, the paper establishes a balanced partitioning model for Spark Shuffle intermediate data. The model aims to minimize partition skew and to find a balanced partitioning strategy for Shuffle intermediate data. Based on the model, the paper designs and implements BSDP's balanced data partitioning algorithm. The algorithm transforms the balanced partitioning problem for Shuffle intermediate data into a classic List-Scheduling task scheduling problem, and thereby achieves balanced partitioning of Shuffle intermediate data. Experiments verify that BSDP effectively balances the partitioning of Shuffle intermediate data and improves the execution efficiency of Spark.
Citations: 0
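The BSDP abstract says that balanced partitioning is reduced to a classic List-Scheduling problem but gives no further detail. The sketch below illustrates that general idea with one well-known list-scheduling rule (largest item first, always onto the least-loaded partition). It is not the authors' algorithm and does not use any Spark API. The function name `balanced_partition`, the `key_sizes` input, and the sample sizes are assumptions made for illustration.

```python
import heapq


def balanced_partition(key_sizes, num_partitions):
    """Greedy list scheduling: place each key group on the least-loaded partition.

    key_sizes maps a shuffle key to the amount of intermediate data it carries
    (hypothetical input; how BSDP estimates these sizes is not stated here).
    Returns (assignment, loads).
    """
    # Min-heap of (current load, partition id); the lightest partition is on top.
    heap = [(0, p) for p in range(num_partitions)]
    heapq.heapify(heap)
    assignment = {}
    # Visit keys from largest to smallest so big groups are placed first.
    for key, size in sorted(key_sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, partition = heapq.heappop(heap)
        assignment[key] = partition
        heapq.heappush(heap, (load + size, partition))
    loads = [0] * num_partitions
    for key, partition in assignment.items():
        loads[partition] += key_sizes[key]
    return assignment, loads


# Made-up key sizes for illustration.
SIZES = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}
ASSIGNMENT, LOADS = balanced_partition(SIZES, 2)
print(ASSIGNMENT)  # {'a': 0, 'b': 1, 'c': 1, 'd': 0, 'e': 1}
print(LOADS)       # [10, 10]
```

Processing the largest groups first is a common heuristic choice for this kind of problem because small groups can then fill the remaining gaps. Whether BSDP orders its items this way is not stated in the abstract.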
Minimizing Play Request Rejection through Workload Splitting in Edge-Cloud Gaming
Pub Date : 2021-12-01 DOI: 10.1109/ICPADS53394.2021.00108
Iryanto Jaya, Yusen Li, Wentong Cai
Cloud gaming abstracts the concept of traditional gaming and places the gaming activities on remote rendering servers (RSes). Although this allows heterogeneous devices to access multiple game titles, latency is unavoidable: each game input must complete a round trip between the player's device and the cloud gaming server. Hence, cloud games are not as responsive as traditional computer games, where the game logic runs locally. Moreover, for a game to be acceptably playable, latency must stay within a certain threshold, which prevents some players in remote regions from playing at all. Therefore, in this paper, we employ edge servers to reach those players by activating lower-capability RSes that are more geographically distributed. Furthermore, we allow the workload of foreground and background rendering to be split between edge and cloud RSes to ease the burden on each individual RS, with a trade-off between cost and latency constraints. In our experiments, our architecture and allocation scheme reduce play request rejections by up to 28% compared to the traditional cloud gaming approach.
Citations: 1