Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00154
Boxuan Li, Reynold Cheng, Jiafeng Hu, Yixiang Fang, Min Ou, Ruibang Luo, K. Chang, Xuemin Lin
Large networks with labeled nodes are prevalent in various applications, such as biological graphs, social networks, and e-commerce graphs. To extract insight from this rich information source, we propose MC-Explorer, an advanced analysis and visualization system. A highlight of MC-Explorer is its ability to discover motif-cliques in a graph with labeled nodes. A motif, such as a 3-node triangle, is a fundamental building block of a graph. A motif-clique is a "complete" subgraph of a network with respect to a desired higher-order connection pattern. For example, on a large biological graph, we found motif-cliques that disclose new side effects of a drug and potential drugs for treating diseases. MC-Explorer includes online, interactive facilities for exploring a large labeled network through motif-cliques. We will demonstrate how MC-Explorer can facilitate the analysis and visualization of a labeled biological network. An online demo video of MC-Explorer can be accessed from https://www.dropbox.com/s/vkalumc28wqp8yl/demo.mov
Title: MC-Explorer: Analyzing and Visualizing Motif-Cliques on Large Networks. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1722-1725.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00227
Tiantian Liu, Zijin Feng, Huan Li, Hua Lu, M. A. Cheema, Hong Cheng, Jianliang Xu
Indoor shortest path query (ISPQ) is of fundamental importance for indoor location-based services (LBS). However, existing ISPQs ignore indoor temporal variations, e.g., the open and close times associated with entities like doors and rooms. In this paper, we define a new type of query called Indoor Temporal-variation aware Shortest Path Query (ITSPQ). It returns the valid shortest path based on the up-to-date indoor topology at the query time. A set of techniques is designed to answer ITSPQ efficiently. We design a graph structure (IT-Graph) that captures indoor temporal variations. To process ITSPQ using IT-Graph, we design two algorithms that check a door’s accessibility synchronously and asynchronously, respectively. We experimentally evaluate the proposed techniques using synthetic data. The results show that our methods are efficient.
Title: Shortest Path Queries for Indoor Venues with Temporal Variations. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 2014-2017.
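The idea of a shortest path that respects door opening hours at query time can be illustrated with a minimal sketch. This is not the IT-Graph structure or either of the paper's synchronous/asynchronous algorithms; it is plain Dijkstra over a hypothetical door graph that simply skips doors closed at the query time. The names `time_aware_shortest_path`, `edges`, and `open_hours`, and the rule that a door without listed hours is always open, are assumptions made for illustration.

```python
import heapq


def time_aware_shortest_path(edges, open_hours, source, target, query_time):
    """Dijkstra over a door graph, ignoring doors closed at query_time.

    edges: dict mapping door -> list of (neighbor_door, distance).
    open_hours: dict mapping door -> list of (open, close) intervals.
        A door missing from the dict is treated as always open (assumption).
    Returns (distance, path) or None if no valid path exists.
    """
    def is_open(door):
        intervals = open_hours.get(door)
        if intervals is None:
            return True
        return any(start <= query_time < end for start, end in intervals)

    if not (is_open(source) and is_open(target)):
        return None

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            # Walk the predecessor links back to the source.
            path = [u]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in edges.get(u, []):
            if not is_open(v):
                continue  # door v is closed at query time
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


# Hypothetical venue: door B is only open from 9 to 17.
edges = {"A": [("B", 1), ("C", 5)], "B": [("D", 1)], "C": [("D", 1)], "D": []}
open_hours = {"B": [(9, 17)]}
print(time_aware_shortest_path(edges, open_hours, "A", "D", 10))
print(time_aware_shortest_path(edges, open_hours, "A", "D", 20))
```

In this toy venue the query at time 10 can use door B, while the query at time 20 has to detour through door C. The sketch checks accessibility only at the query instant; it does not model doors that close while the user is still walking.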
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00173
Shantanu Sharma, A. Burtsev, S. Mehrotra
Despite extensive research, secure outsourcing remains an open challenge. This tutorial focuses on recent advances in secure cloud-based data outsourcing based on cryptographic approaches (encryption, secret-sharing, and multi-party computation (MPC)) and hardware-based approaches. We highlight the strengths and weaknesses of state-of-the-art techniques and conclude that no single approach is likely to emerge as a silver bullet. Thus, the key is to merge different hardware and software techniques so that they work in conjunction through partitioned computing, wherein a computation is carefully split across different cryptographic techniques so as not to compromise security. We highlight some recent work in that direction.
Title: Advances in Cryptography and Secure Hardware for Data Outsourcing. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1798-1801.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00131
Philipp Fent, Alexander van Renen, Andreas Kipf, Viktor Leis, Thomas Neumann, A. Kemper
While hardware and software improvements have greatly accelerated modern database systems' internal operations, the decades-old stream-based Socket API for external communication is still unchanged. We show experimentally that, for modern high-performance systems, networking has become a performance bottleneck. Therefore, we argue that the communication stack needs to be redesigned to fully exploit modern hardware, as has already happened to most other database system components. We propose L5, a high-performance communication layer for database systems. L5 rethinks the flow of data in and out of the database system and is based on direct memory access techniques for intra-datacenter (RDMA) and intra-machine (Shared Memory) communication. With L5, we provide a building block to accelerate ODBC-like interfaces with a unified, message-based communication framework. Our results show that, using interconnects like RDMA (InfiniBand), RoCE (Ethernet), and Shared Memory (IPC), L5 can largely eliminate the network bottleneck for database systems.
Title: Low-Latency Communication for Fast DBMS Using RDMA and Shared Memory. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1477-1488.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00234
Petr Lukáš, Radim Bača, M. Krátký, T. Ling
Structural XQuery and XPath queries are often modeled by twig pattern queries (TPQs), which specify predicates on XML nodes and structural relationships to be satisfied between them. This paper considers a TPQ model extended by a specification of output and non-output query nodes, since it complies with the XQuery and XPath semantics. There are two types of TPQ processing approaches: binary joins and holistic joins. Binary joins use a query plan of interconnected binary operators, whereas holistic joins are based on one complex operator that processes the whole query. In recent years, holistic joins have been considered the state-of-the-art TPQ processing method. However, a thorough analytical and experimental comparison of binary and holistic joins has been missing despite an enormous research effort in this area. In this paper, we try to fill this gap. We introduce several improvements to the binary join operators that enable us to build a so-called fully-pipelined (FP) query plan for any TPQ with the specification of output and non-output query nodes. We analytically show that, for a class of queries, the proposed approach has the same time and space complexity as holistic joins, and we experimentally demonstrate that the proposed approach outperforms holistic joins in many cases.
Title: Demythization of Structural XML Query Processing: Comparison of Holistic and Binary Approaches (Extended Abstract). In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 2030-2031.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00185
Yan Zhu, Chin-Chia Michael Yeh, Zachary Zimmerman, Eamonn J. Keogh
Since its introduction several years ago, the Matrix Profile has received significant attention for two reasons. First, it is a very general representation, allowing for the discovery of time series motifs, discords, chains, joins, shapelets, segmentations, etc. Second, it can be computed very efficiently, allowing for fast exact computation and ultra-fast approximate computation. For analysts who use the Matrix Profile frequently, its incremental computability means that they can perform ad-hoc analytics at any time, with almost no delay. However, they can only issue global queries, that is, queries that consider all the data from time zero to the current time. This is a significant limitation, as they may be interested in localized questions about a contiguous subset of the data. For example, "do we have any unusual motifs that correspond with that unusually cool summer two years ago?" Such ad-hoc queries would require recomputing the Matrix Profile for the time period in question. This is not an untenable computation, but it could not be done in interactive time. In this work we introduce a novel indexing framework that allows queries about arbitrary ranges to be answered in quasilinear time, allowing such queries to be interactive for the first time.
Title: Matrix Profile XVII: Indexing the Matrix Profile to Allow Arbitrary Range Queries. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1846-1849.
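To make the cost of a range query concrete, the following minimal sketch recomputes a Matrix Profile from scratch for a contiguous range, which is the baseline the abstract above describes as too slow for interactive use. It is a brute-force illustration, not the paper's indexing framework. The function names, the half-open range `[lo, hi)`, and the exclusion zone of `m // 2` positions are assumptions made for illustration.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]


def matrix_profile_range(series, m, lo, hi):
    """Brute-force Matrix Profile restricted to series[lo:hi].

    For every length-m subsequence in the range, return the z-normalized
    Euclidean distance to its nearest non-trivial neighbor in the same range.
    Cost is quadratic in the range length.
    """
    window = series[lo:hi]
    subs = [znorm(window[i:i + m]) for i in range(len(window) - m + 1)]
    excl = max(1, m // 2)  # assumed exclusion zone against trivial matches
    profile = []
    for i, a in enumerate(subs):
        best = math.inf
        for j, b in enumerate(subs):
            if abs(i - j) < excl:
                continue
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, d)
        profile.append(best)
    return profile


series = [0, 1, 0, 5, 2, 0, 1, 0, 7, 3]
print(matrix_profile_range(series, 3, 0, 10))  # global query
print(matrix_profile_range(series, 3, 0, 5))   # range query on the first half
```

In the global query, the subsequence at position 0 has an exact repeat at position 5, so its profile value is zero. Restricting the range to the first half removes that repeat and the value changes, which is why a global profile cannot simply be sliced to answer a range query.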
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00085
Jong-Seop Lee, Seokwon Kang, Yongseung Yu, Yong-Yeon Jo, Sang-Wook Kim, Yongjun Park
Sparse matrix multiplication (spGEMM) is widely used to analyze sparse network data and to extract important information based on a matrix representation. As it contains a high degree of data parallelism, many efficient implementations using data-parallel programming platforms such as CUDA and OpenCL have been introduced for graphics processing units (GPUs). Several well-known spGEMM techniques, such as cuSPARSE and CUSP, often do not fully utilize GPU resources, owing to the load imbalance between threads in the expansion process and high memory contention in the merge process. Furthermore, even though several outer-product-based spGEMM techniques have been proposed to solve the load balancing problem in expansion, they still do not fully utilize GPU resources, because severe computation load variations exist among the multiple thread blocks. To solve these challenges, this paper proposes a new optimization pass called Block Reorganizer, which balances the total computations of each computing unit on target GPUs, based on the outer-product-based expansion process, and reduces the memory pressure during the merge process. For expansion, it first identifies the actual computation amount for each block, and then performs two thread block transformation processes based on their characteristics: 1) B-Splitting, which transforms a heavy-computation block into multiple small blocks, and 2) B-Gathering, which aggregates multiple small-computation blocks into a larger block. While merging, it improves the overall performance by performing B-Limiting to limit the number of blocks on each computing unit. Experimental results show that it improves the total performance of kernel execution by 1.43x on average, compared to the row-product-based spGEMM, for NVIDIA Titan Xp GPUs on real-world datasets.
Title: Optimization of GPU-based Sparse Matrix Multiplication for Large Sparse Networks. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 925-936.
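The expansion and merge phases, and the idea of splitting heavy blocks and gathering light ones, can be sketched on the CPU as follows. This is only an illustration of the outer-product formulation, in which block k multiplies column k of A by row k of B. It is not the paper's GPU implementation, and the `heavy` and `light` thresholds and the greedy grouping rule in `reorganize_blocks` are hypothetical choices, not the criteria used by Block Reorganizer.

```python
from collections import defaultdict


def outer_product_spgemm(a_cols, b_rows):
    """Multiply sparse A and B using the outer-product formulation.

    a_cols[k]: list of (row, value) for the nonzeros in column k of A.
    b_rows[k]: list of (col, value) for the nonzeros in row k of B.
    Returns (C as {(row, col): value}, per-block workloads).
    """
    partials = []
    workloads = {}
    # Expansion: block k emits len(column k of A) * len(row k of B) products.
    for k, col in a_cols.items():
        row = b_rows.get(k, [])
        workloads[k] = len(col) * len(row)
        for i, av in col:
            for j, bv in row:
                partials.append((i, j, av * bv))
    # Merge: sum partial products that land on the same output coordinate.
    c = defaultdict(float)
    for i, j, v in partials:
        c[(i, j)] += v
    return dict(c), workloads


def reorganize_blocks(workloads, heavy, light):
    """Split blocks above `heavy`, gather blocks below `light` (illustrative).

    Returns a list of work units; each unit is a list of (block, amount).
    """
    units = []
    pending, pending_total = [], 0
    for k, w in sorted(workloads.items()):
        if w == 0:
            continue
        if w > heavy:
            # Splitting: cut one heavy block into chunks of at most `heavy`.
            for start in range(0, w, heavy):
                units.append([(k, min(heavy, w - start))])
        elif w < light:
            # Gathering: pool light blocks until the group reaches `light`.
            pending.append((k, w))
            pending_total += w
            if pending_total >= light:
                units.append(pending)
                pending, pending_total = [], 0
        else:
            units.append([(k, w)])
    if pending:
        units.append(pending)
    return units


# A = [[1, 0], [2, 3]], B = [[4, 5], [0, 6]]
a_cols = {0: [(0, 1), (1, 2)], 1: [(1, 3)]}
b_rows = {0: [(0, 4), (1, 5)], 1: [(1, 6)]}
c, workloads = outer_product_spgemm(a_cols, b_rows)
print(c, workloads)
print(reorganize_blocks({0: 10, 1: 1, 2: 1, 3: 3}, heavy=4, light=2))
```

The per-block workload is the product of two nonzero counts, so it can vary by orders of magnitude on skewed network data. That variation is the imbalance between thread blocks that the abstract describes.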
The rapid development of the steel logistics industry has still not effectively addressed issues such as truck overload, overdue orders, and cargo overstock. One reason lies in the limited number of trucks available for transporting large-scale cargo. More importantly, traditional methods tend to distribute cargo to trucks with the aim of maximizing the load of each truck, but they ignore the priority level of orders and the expiration date of cargo stored in the warehouses, both of which critically influence the profits of the steel logistics industry. Hence, an appropriate cargo distribution mechanism is needed that, under limited transportation capacity, maximizes the delivery proportion of high-priority cargo. Recently, a tremendous amount of logistics data has been produced on steel logistics platforms, and it keeps growing hourly. However, there is no existing solution that transforms such data into an actionable scheme to improve cargo distribution effectiveness. This paper puts forward a system implementation of a smart cargo loading plan decision framework (SCLPD for short) for the steel logistics industry. Through analysis of a large amount of real data on cargo loading plans and warehouse inventory, some important rules related to the cargo distribution process are extracted. Additionally, considering that different numbers of trucks arrive in different time periods, a two-layer search mechanism consisting of a genetic algorithm and the A* algorithm is designed, based on an adaptive time window model, to ensure global optimization of the cargo loading plan for the trucks in all time periods. In our demonstration, we illustrate the procedure of matching cargo and trucks in various time windows, and showcase experimental results comparing the traditional method with SCLPD in terms of the delivery proportion of high-priority cargo. The effectiveness and practicality of SCLPD enable efficient cargo loading plan generation that meets the real-world requirements of steel logistics platforms.
Title: SCLPD: Smart Cargo Loading Plan Decision Framework. Authors: Jiaye Liu, Jiali Mao, Jiajun Liao, Huiqi Hu, Ye Guo, Aoying Zhou. Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00163. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1758-1761.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00057
Zijian Li, Wenhao Zheng, Xueling Lin, Ziyuan Zhao, Zhe Wang, Yue Wang, Xun Jian, Lei Chen, Qiang Yan, Tiezheng Mao
Learning network embeddings has attracted growing attention in recent years. However, most of the existing methods focus on homogeneous networks, which cannot capture the important type information in heterogeneous networks. To address this problem, in this paper, we propose TransN, a novel multi-view network embedding framework for heterogeneous networks. Compared with the existing methods, TransN is an unsupervised framework which does not require node labels or user-specified meta-paths as inputs. In addition, TransN is capable of handling more general types of heterogeneous networks than the previous works. Specifically, in our framework TransN, we propose a novel algorithm to capture the proximity information inside each single view. Moreover, to transfer the learned information across views, we propose an algorithm to translate the node embeddings between different views based on the dual-learning mechanism, which can both capture the complex relations between node embeddings in different views, and preserve the proximity information inside each view during the translation. We conduct extensive experiments on real-world heterogeneous networks, whose results demonstrate that the node embeddings generated by TransN outperform those of competitors in various network mining tasks.
Title: TransN: Heterogeneous Network Representation Learning by Translating Node Embeddings. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 589-600.
Pub Date: 2020-04-01; DOI: 10.1109/ICDE48307.2020.00229
Woohwan Jung, Suyong Kwon, Kyuseok Shim
Log data from mobile devices usually contain a series of events with time intervals. However, the problem of releasing differentially private time interval data has not been tackled yet. We propose the TIDY (publishing Time Intervals via Differential privacY) algorithm to release time interval data under differential privacy. We use the frequency vectors as a compact representation of the time interval data to reduce the aggregated noise. We also develop a new partitioning method adapted for the frequency vectors to balance the trade-off between the noise and structural errors. Our experiments confirm that TIDY outperforms the existing algorithms for releasing 2D histograms.
Title: TIDY: Publishing a Time Interval Dataset with Differential Privacy (Extended abstract). In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 2020-2021.
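For context on the 2D-histogram baselines mentioned in the TIDY abstract, the sketch below releases a noisy histogram of time intervals with the standard Laplace mechanism. It is not the TIDY algorithm and does not use its frequency vectors or partitioning method. It assumes that each individual contributes exactly one interval and that neighboring datasets differ by adding or removing one interval, so the L1 sensitivity of the histogram is 1. The function name and the bucketing by (start bucket, end bucket) are hypothetical.

```python
import random
from collections import Counter


def noisy_interval_histogram(intervals, bucket, n_buckets, epsilon, rng):
    """Release a 2D histogram of (start, end) intervals with Laplace noise.

    intervals: list of (start, end) with 0 <= start <= end < bucket * n_buckets.
    Noise is added to every cell with start bucket <= end bucket, including
    empty cells, because the released domain must not depend on the data.
    Assumes L1 sensitivity 1 (one interval per individual, add/remove model).
    """
    counts = Counter((s // bucket, e // bucket) for s, e in intervals)
    rate = epsilon  # Laplace scale is sensitivity / epsilon = 1 / epsilon
    noisy = {}
    for i in range(n_buckets):
        for j in range(i, n_buckets):
            # The difference of two i.i.d. exponentials with this rate is a
            # Laplace variable with scale 1 / rate.
            noise = rng.expovariate(rate) - rng.expovariate(rate)
            noisy[(i, j)] = counts.get((i, j), 0) + noise
    return noisy


rng = random.Random(0)
intervals = [(0, 15), (3, 12), (25, 29)]
print(noisy_interval_histogram(intervals, bucket=10, n_buckets=3, epsilon=1.0, rng=rng))
```

Because every cell receives independent noise, the total error grows with the number of cells. The abstract's stated motivation for a compact representation and a tailored partitioning is to reduce this aggregated noise.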