
Latest publications: 2018 IEEE International Conference on Data Mining (ICDM)

Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00077
Zhaodong Wang, Zhiwei Qin, Xiaocheng Tang, Jieping Ye, Hongtu Zhu
Ride dispatching is a central operation task on a ride-sharing platform to continuously match drivers to trip-requesting passengers. In this work, we model the ride dispatching problem as a Markov Decision Process and propose learning solutions based on deep Q-networks with action search to optimize the dispatching policy for drivers on ride-sharing platforms. We train and evaluate dispatching agents for this challenging decision task using real-world spatio-temporal trip data from the DiDi ride-sharing platform. A large-scale dispatching system typically supports many geographical locations with diverse demand-supply settings. To increase learning adaptability and efficiency, we propose a new transfer learning method, Correlated Feature Progressive Transfer, along with two existing methods, enabling knowledge transfer in both spatial and temporal spaces. Through an extensive set of experiments, we demonstrate the learning and optimization capabilities of our deep reinforcement learning algorithms. We further show that dispatching policies learned by transferring knowledge from a source city to target cities or across temporal space within the same city significantly outperform those without transfer learning.
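The core idea of "deep Q-networks with action search" can be illustrated with a minimal sketch: a driver's state is a feature vector, each candidate trip request is an action ending in a new state, the agent searches the discrete set of feasible actions for the highest Q-value, and performs a TD update. The linear Q-function, feature map, and all names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

DIM, GAMMA, LR = 4, 0.9, 0.1
w = np.zeros(DIM)                      # linear Q(s, a) = w . phi(s, a)

def phi(state, action):
    # toy joint state-action features: concatenated (location) coordinates
    return np.concatenate([state[:2], action[:2]])

def action_search(state, candidates):
    # evaluate Q over the discrete set of feasible trip requests near `state`
    scores = [w @ phi(state, a) for a in candidates]
    return candidates[int(np.argmax(scores))]

state = np.array([0.1, 0.2])
candidates = [np.array([0.5, 0.1]), np.array([0.2, 0.9])]
a = action_search(state, candidates)
reward, next_state = 1.0, a            # the trip ends at the action's destination
next_q = max(w @ phi(next_state, c) for c in candidates)
td_err = reward + GAMMA * next_q - w @ phi(state, a)
w += LR * td_err * phi(state, a)       # one TD(0) gradient step
```

In the paper the Q-function is a deep network and the candidate set comes from real trip requests; the sketch only shows how action search replaces a fixed action space.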
Citations: 85
Sequential Pattern Sampling with Norm Constraints
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00024
Lamine Diop, Cheikh Talibouya Diop, A. Giacometti, Dominique H. Li, Arnaud Soulet
In recent years, the field of pattern mining has shifted to user-centered methods. In such a context, a tight coupling between the system and the user is necessary, where mining techniques provide results at any time or within a short response time of only a few seconds. Pattern sampling is a non-exhaustive method for instantly discovering relevant patterns that ensures good interactivity while providing strong statistical guarantees due to its random nature. Curiously, such an approach, investigated for itemsets and subgraphs, has not yet been applied to sequential patterns, which are useful for a wide range of mining tasks and application fields. In this paper, we propose the first method for sequential pattern sampling. In addition to addressing sequential data, the originality of our approach is to introduce a constraint on the norm to control the length of the drawn patterns and to avoid the pitfall of the "long tail", where the rarest patterns flood the user. We propose a new constrained two-step random procedure, named CSSampling, that randomly draws sequential patterns according to frequency, with an interval constraint on the norm. We demonstrate that this method performs an exact sampling. Moreover, despite the use of rejection sampling, the experimental study shows that CSSampling remains efficient and that the constraint helps to draw general patterns of the "head". We also illustrate how to benefit from these sampled patterns to instantly build an associative classifier dedicated to sequences. This classification approach rivals state-of-the-art proposals, showing the interest of constrained sequential pattern sampling.
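The two-step scheme with a norm constraint can be sketched in simplified form: step 1 draws a sequence with probability proportional to its number of subsequence occurrences whose length (norm) lies in an interval, and step 2 draws one such occurrence uniformly within it. This toy version counts occurrences of item subsequences rather than distinct patterns, so it only illustrates the shape of the two-step procedure, not the authors' exact CSSampling algorithm.

```python
import random
from math import comb

MIN_NORM, MAX_NORM = 2, 3   # interval constraint on the pattern norm

def weight(seq):
    # number of subsequence occurrences of seq with length in the interval
    n = len(seq)
    return sum(comb(n, l) for l in range(MIN_NORM, min(MAX_NORM, n) + 1))

def draw_pattern(database, rng):
    seqs = list(database)
    seq = rng.choices(seqs, weights=[weight(s) for s in seqs])[0]   # step 1
    n = len(seq)
    lengths = list(range(MIN_NORM, min(MAX_NORM, n) + 1))
    l = rng.choices(lengths, weights=[comb(n, l) for l in lengths])[0]
    picked = sorted(rng.sample(range(n), l))                        # step 2
    return [seq[i] for i in picked]

rng = random.Random(42)
db = [list("abc"), list("abcd"), list("ab")]
pattern = draw_pattern(db, rng)
```

Every drawn pattern is guaranteed to satisfy the norm constraint by construction, which is the point of constraining the draw rather than filtering afterwards.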
Citations: 12
A Low Rank Weighted Graph Convolutional Approach to Weather Prediction
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00078
T. Wilson, P. Tan, L. Luo
Weather forecasting is an important but challenging problem, as one must contend with the inherent non-linearities and spatiotemporal autocorrelation present in the data. This paper presents a novel deep learning approach based on a coupled weighted graph convolutional LSTM (WGC-LSTM) to address these challenges. Specifically, our proposed approach uses an LSTM to capture the inherent temporal autocorrelation of the data and a graph convolution to model its spatial relationships. As weather conditions can be influenced by various spatial factors besides the distance between locations, e.g., topography, prevailing winds and jet streams, imposing a fixed graph structure based on the proximity between locations is insufficient to train a robust deep learning model. Instead, our proposed approach treats the adjacency matrix of the graph as a model parameter that can be learned from the training data. However, this introduces an additional O(|V|^2) parameters to be estimated, where |V| is the number of locations. With large graphs this may also lead to slower performance as well as susceptibility to overfitting. We propose a modified version of our approach that addresses this difficulty by assuming that the adjacency matrix is either sparse or low rank. Experimental results using two real-world weather datasets show that WGC-LSTM outperforms all other baseline methods for the majority of the evaluated locations.
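The low-rank option can be sketched directly: instead of learning a dense |V| x |V| adjacency matrix (O(|V|^2) parameters), learn two factors U, V with A ≈ U Vᵀ, cutting the parameter count to 2·|V|·r. Multiplying right-to-left means the full A is never materialized. The generic graph-convolution step H' = relu(A X W) and all dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_feat, n_hidden, rank = 100, 8, 16, 5

U = rng.normal(scale=0.1, size=(n_nodes, rank))   # learned adjacency factor
V = rng.normal(scale=0.1, size=(n_nodes, rank))   # learned adjacency factor
W = rng.normal(scale=0.1, size=(n_feat, n_hidden))
X = rng.normal(size=(n_nodes, n_feat))            # node (location) features

# H' = relu(A X W) with A = U @ V.T, evaluated without forming A:
H = np.maximum(0.0, U @ (V.T @ (X @ W)))
```

With rank 5 the learned adjacency needs 2·100·5 = 1,000 parameters instead of 100² = 10,000, and the memory and overfitting pressure shrink accordingly.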
Citations: 26
Deep Learning Based Scalable Inference of Uncertain Opinions
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00096
Xujiang Zhao, F. Chen, Jin-Hee Cho
Subjective Logic (SL) is one of the well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators for fusing multiple opinions. Due to its simplicity and applicability, SL has been popularly applied to a variety of decision-making problems in cybersecurity, opinion modeling, and trust / social network analysis. However, SL faces a scalability issue in dealing with large-scale network data. In addition, SL has shown bounded prediction accuracy due to its inherent parametric nature, treating heterogeneous data and network structure homogeneously based on the assumption of a Bayesian network. In this work, we take one step further to deal with uncertain opinions for unknown opinion inference. We propose a deep learning (DL)-based opinion inference model, while node-level opinions are still formalized based on SL. The proposed DL-based opinion inference model handles node-level opinions explicitly in a large-scale network using graph convolutional network (GCN) and variational autoencoder (VAE) techniques. We adopted the GCN and VAE due to their powerful learning capabilities in dealing with large-scale network data without parametric fusion operators or a Bayesian network assumption. This work is the first to leverage the merits of both DL (i.e., GCN and VAE) and a belief model (i.e., SL), where each node-level opinion is modeled by the formalism of SL while GCN and VAE are used to achieve non-parametric learning with low complexity. By mapping the node-level opinions modeled by the GCN to their equivalent Beta PDFs (probability density functions), we develop a network-driven VAE to maximize the prediction accuracy of unknown opinions while significantly reducing algorithmic complexity. We validate our proposed DL-based algorithm using real-world datasets via extensive simulation experiments for comparative performance analysis.
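The mapping from SL opinions to Beta PDFs mentioned above is the standard subjective-logic correspondence: a binomial opinion (belief b, disbelief d, uncertainty u, base rate a) maps to Beta(alpha, beta) via the evidence counts r = W·b/u and s = W·d/u with prior weight W = 2. The sketch shows only this standard mapping, not its use inside the authors' GCN/VAE pipeline.

```python
W = 2.0  # non-informative prior weight in subjective logic

def opinion_to_beta(b, d, u, a=0.5):
    # binomial opinion with b + d + u = 1 and u > 0
    assert abs(b + d + u - 1.0) < 1e-9 and u > 0
    r = W * b / u            # positive evidence
    s = W * d / u            # negative evidence
    return r + W * a, s + W * (1.0 - a)

alpha, beta = opinion_to_beta(b=0.6, d=0.2, u=0.2)
mean = alpha / (alpha + beta)
# the Beta mean recovers the opinion's projected probability b + a * u
assert abs(mean - (0.6 + 0.5 * 0.2)) < 1e-9
```

The final identity holds in general, since alpha + beta = W/u, so the mean simplifies to b + a·u regardless of the particular opinion.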
Citations: 5
Which Outlier Detector Should I use?
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00015
K. Ting, Sunil Aryal, T. Washio
This tutorial has four aims: (1) Providing the current comparative works on different outlier detectors, and analysing the strengths and weaknesses of these works and their recommendations. (2) Presenting non-obvious applications of outlier detectors, with examples of how outlier detectors are used in areas not normally considered to be domains of outlier detection. (3) Inviting the research community to explore future research directions, in terms of both comparative studies and outlier detection in general. (4) Giving advice on the factors to consider when choosing an outlier detector, and on the strengths and weaknesses of some "top" recommended algorithms, based on the current understanding in the literature.
Citations: 4
Record2Vec: Unsupervised Representation Learning for Structured Records
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00165
Adelene Y. L. Sim, Andrew Borthwick
Structured records - data with a fixed number of descriptive fields (or attributes) - are often represented by one-hot encoded or term frequency-inverse document frequency (TF-IDF) weighted vectors. These vectors are typically sparse and long, and are inefficient in representing structured records. Here, we introduce Record2Vec, a framework for generating dense embeddings of structured records by training associations between attributes within record instances. We build our embedding from a simple premise that structured records have attributes that are associated, and therefore we can train the embedding of an attribute based on other attributes (or context), much like how we train embeddings for words based on their surrounding context. Because this embedding technique is general and does not assume the availability of any labeled data, it is extendable across different domains and fields. We demonstrate its utility in the context of clustering, record matching, movie rating and movie genre prediction.
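The premise that co-occurring attribute values should get nearby embeddings can be illustrated with one CBOW-style SGD step: the target value's vector is pulled toward the mean of the other attributes' vectors in its record. The vocabulary, embedding dimension, and update rule are illustrative assumptions, not the paper's exact training objective.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["comedy", "1990s", "pg-13", "drama", "2000s"]  # toy attribute values
emb = {v: rng.normal(scale=0.1, size=4) for v in vocab}  # dense embeddings
LR = 0.5

record = ["comedy", "1990s", "pg-13"]          # one structured record
target, context = record[0], record[1:]
ctx_mean = np.mean([emb[v] for v in context], axis=0)

before = np.linalg.norm(emb[target] - ctx_mean)
emb[target] += LR * (ctx_mean - emb[target])   # pull target toward its context
after = np.linalg.norm(emb[target] - ctx_mean)
```

Repeating such updates over many records makes attribute values that share records cluster together, which is what enables the downstream matching and prediction tasks the abstract mentions.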
Citations: 9
Local Low-Rank Hawkes Processes for Temporal User-Item Interactions
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00058
Jin Shang, Mingxuan Sun
Hawkes processes have become very popular for modeling multiple recurrent user-item interaction events that exhibit mutual-excitation properties in various domains. Generally, modeling the interaction sequence of each user-item pair as an independent Hawkes process is ineffective, since the prediction accuracy of future event occurrences is low for users and items with few observed interactions. On the other hand, multivariate Hawkes processes (MHPs) can be used to handle multi-dimensional random processes in which different dimensions are correlated with each other. However, an MHP either fails to describe the correct mutual influence between dimensions or becomes computationally prohibitive in most real-world settings involving a large collection of users and items. To tackle this challenge, we propose local low-rank Hawkes processes to model large-scale user-item interactions, which efficiently capture the correlations of Hawkes processes in different dimensions. In addition, we design an efficient convex optimization algorithm to estimate model parameters and present a parallel algorithm to further increase the computation efficiency. Extensive experiments on real-world datasets demonstrate the performance improvements of our model in comparison with the state of the art.
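A multivariate Hawkes intensity with a low-rank excitation matrix can be sketched as follows: each dimension's intensity is a base rate plus exponentially decaying contributions from past events, weighted by an excitation matrix factored as A = P Qᵀ so that cross-dimension correlations are captured with few parameters. The exponential kernel, rank, and all names are illustrative assumptions, not the authors' exact model.

```python
import numpy as np

OMEGA = 1.0                       # decay rate of the exponential kernel
P = np.array([[0.5], [0.3]])      # rank-1 factor over 2 dimensions
Q = np.array([[0.4], [0.2]])      # rank-1 factor over 2 dimensions
A = P @ Q.T                       # excitation: A[i, j] = effect of dim j on dim i
mu = np.array([0.1, 0.2])         # base intensities

events = [(0.5, 0), (1.0, 1)]     # observed (timestamp, dimension) pairs

def intensity(t, dim):
    # lambda_dim(t) = mu_dim + sum over past events of A * omega * exp(-omega dt)
    lam = mu[dim]
    for t_j, d_j in events:
        if t_j < t:
            lam += A[dim, d_j] * OMEGA * np.exp(-OMEGA * (t - t_j))
    return lam

lam0 = intensity(2.0, 0)
```

Each past event raises the intensity of every dimension it excites, and the effect fades exponentially, which is the mutual-excitation property the abstract refers to.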
Citations: 9
CADEN: A Context-Aware Deep Embedding Network for Financial Opinions Mining
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00091
Liang Zhang, Keli Xiao, Hengshu Zhu, Chuanren Liu, Jingyuan Yang, Bo Jin
Following recent advances in artificial intelligence, financial text mining has gained new potential to benefit theoretical research with practical impact. An essential research question for financial text mining is how to accurately identify the actual financial opinions (e.g., bullish or bearish) behind words in plain text. Traditional methods mainly treat this task as a text classification problem, with solutions based on machine learning algorithms. However, most of them rely heavily on hand-crafted features extracted from the text. Indeed, a critical issue along this line is that the latent global and local contexts of the financial opinions usually cannot be fully captured. To this end, we propose a context-aware deep embedding network for financial text mining, named CADEN, which jointly encodes global and local contextual information. In particular, we capture and include an attitude-aware user embedding to enhance the performance of our model. We validate our method with extensive experiments based on a real-world dataset and several state-of-the-art baselines for investor sentiment recognition. Our results show a consistently superior performance of our approach for identifying financial opinions from texts of different formats.
Citations: 18
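As a rough illustration of what "jointly encoding global and local contextual information" can mean, the toy sketch below concatenates a text-level mean embedding with a window-level mean around a target word. This is an illustrative assumption only, not the CADEN architecture itself:

```python
import numpy as np

def encode_opinion(word_vecs, window=2, target_idx=0):
    """Toy global/local context encoding (hypothetical sketch).

    Global context: mean over all word vectors in the text.
    Local context: mean over a small window around the target word.
    The two are concatenated into one context-aware embedding.
    """
    word_vecs = np.asarray(word_vecs, dtype=float)
    global_ctx = word_vecs.mean(axis=0)
    lo = max(0, target_idx - window)
    hi = min(len(word_vecs), target_idx + window + 1)
    local_ctx = word_vecs[lo:hi].mean(axis=0)
    return np.concatenate([global_ctx, local_ctx])

vecs = np.eye(4)          # four 4-d one-hot "word vectors"
emb = encode_opinion(vecs, window=1, target_idx=1)
print(emb.shape)          # (8,)
```

A real model would learn both encoders end to end; the point here is only that the final representation carries both scopes of context.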
Clustered Lifelong Learning Via Representative Task Selection
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00167
Gan Sun, Yang Cong, Yu Kong, Xiaowei Xu
Consider the lifelong machine learning problem, where the objective is to learn new consecutive tasks by drawing on previously accumulated experience, i.e., a knowledge library. In contrast with most state-of-the-art methods, which adopt a knowledge library of prescribed size, this paper proposes a new incremental clustered lifelong learning model with two libraries, a feature library and a model library, called Clustered Lifelong Learning (CL3). The feature library maintains a set of learned features common across all encountered tasks, and the model library is learned by identifying and adding representative models (clusters). When a new task arrives, its model is first reconstructed from representative models measured by a capped l2-norm distance, i.e., the new task model is effectively assigned to multiple representative models under the feature library. Based on this assignment, CL3 transfers knowledge from both the feature library and the model library to learn the new task. A new task 1) with a higher outlier probability is judged to be a new representative and is used to refine both the feature library and the representative models over time; 2) with a lower outlier probability, it only updates the feature library. For model optimisation, this problem is cast as an alternating direction minimization problem. The performance of CL3 is evaluated by comparison with most lifelong learning models, and even with some batch clustered multi-task learning models.
{"title":"Clustered Lifelong Learning Via Representative Task Selection","authors":"Gan Sun, Yang Cong, Yu Kong, Xiaowei Xu","doi":"10.1109/ICDM.2018.00167","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00167","url":null,"abstract":"Consider the lifelong machine learning problem where the objective is to learn new consecutive tasks depending on previously accumulated experiences, i.e., knowledge library. In comparison with most state-of-the-arts which adopt knowledge library with prescribed size, in this paper, we propose a new incremental clustered lifelong learning model with two libraries: feature library and model library, called Clustered Lifelong Learning (CL3), in which the feature library maintains a set of learned features common across all the encountered tasks, and the model library is learned by identifying and adding representative models (clusters). When a new task arrives, the original task model can be firstly reconstructed by representative models measured by capped l2-norm distance, i.e., effectively assigning the new task model to multiple representative models under feature library. Based on this assignment knowledge of new task, the objective of our CL3 model is to transfer the knowledge from both feature library and model library to learn the new task. The new task 1) with a higher outlier probability will then be judged as a new representative, and used to refine both feature library and representative models over time; 2) with lower outlier probability will only update the feature library. For the model optimisation, we cast this problem as an alternating direction minimization problem. 
To this end, the performance of CL3 is evaluated through comparing with most lifelong learning models, even some batch clustered multi-task learning models.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127037764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
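The capped l2-norm assignment step described in the abstract can be sketched as follows. The `cap` threshold, the closeness scoring, and the outlier rule here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def assign_task(new_model, representatives, cap):
    """Sketch: assign a new task model to representative models via a
    capped l2-norm distance. Distances are truncated at `cap`; if every
    distance hits the cap, the task looks like an outlier, i.e., a
    candidate new representative. (Illustrative only.)
    """
    reps = np.asarray(representatives, dtype=float)
    d = np.linalg.norm(reps - new_model, axis=1)   # plain l2 distances
    capped = np.minimum(d, cap)                    # capped l2-norm
    weights = cap - capped                         # closeness within the cap
    is_outlier = bool(np.all(capped >= cap))       # all distances capped
    if weights.sum() > 0:
        weights = weights / weights.sum()          # normalize assignment
    return weights, is_outlier

reps = [[0.0, 0.0], [10.0, 10.0]]
w, out = assign_task(np.array([0.1, 0.0]), reps, cap=1.0)
print(w, out)   # weight concentrated on the first representative
```

The cap is what makes the assignment robust: a far-away representative contributes zero weight instead of a huge distance, and a task far from every representative is flagged as a new cluster candidate.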
Characteristic Subspace Learning for Time Series Classification
Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00128
Yuanduo He, Jialiang Pei, Xu Chu, Yasha Wang, Zhu Jin, Guangju Peng
This paper presents a novel time series classification algorithm. It exploits time-delay embedding to transform a time series into a set of points treated as a distribution, and classifies time series by classifying the corresponding distributions. It derives a novel geometrical feature, the characteristic subspace, from the embedded points, and learns it with a class-weighted support vector machine (SVM). An efficient boosting strategy is also developed to enable linear-time training. Experiments show the great potential of this algorithm in accuracy, efficiency, and interpretability.
{"title":"Characteristic Subspace Learning for Time Series Classification","authors":"Yuanduo He, Jialiang Pei, Xu Chu, Yasha Wang, Zhu Jin, Guangju Peng","doi":"10.1109/ICDM.2018.00128","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00128","url":null,"abstract":"This paper presents a novel time series classification algorithm. It exploits time-delay embedding to transform time series into a set of points as a distribution, and attempt to classify time series by classifying corresponding distributions. It proposes a novel geometrical feature, i.e. characteristic subspace, from embedding points for classification, and leverages class-weighted support vector machine (SVM) to learn for it. An efficient boosting strategy is also developed to enable a linear time training. The experiments show great potentials of this novel algorithm on accuracy, efficiency and interpretability.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130631833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
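The time-delay embedding step that turns a scalar series into a point cloud can be sketched as below; `dim` and `tau` are the generic embedding dimension and delay parameters, not values taken from the paper:

```python
import numpy as np

def delay_embed(series, dim=3, tau=1):
    """Time-delay embedding: map a scalar series x_1..x_T into the point
    cloud {(x_t, x_{t+tau}, ..., x_{t+(dim-1)*tau})}. The resulting set
    of points is what gets treated as a distribution to classify.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * tau   # number of embedded points
    return np.stack(
        [series[i:i + n] for i in range(0, dim * tau, tau)], axis=1
    )

pts = delay_embed([1, 2, 3, 4, 5], dim=3, tau=1)
print(pts)
# [[1. 2. 3.]
#  [2. 3. 4.]
#  [3. 4. 5.]]
```

Each row is one point in the embedding space; two series with different dynamics produce visibly different point clouds, which is what the distribution-level classifier exploits.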
Journal
2018 IEEE International Conference on Data Mining (ICDM)