Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning the lack of labelled information from the target domain is compensated by the availability at training time of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time. Although this definition is indeed in line with Vapnik’s original definition of “transduction”, current terminology in the field is confused. In this article, we discuss how the term “transduction” has been misused in the transfer learning literature, and propose a clarification consistent with the original characterization of this term given by Vapnik. We go on to observe that this terminological misuse has brought about misleading experimental comparisons, in which inductive transfer learning methods have been incorrectly compared with transductive transfer learning methods. We then give empirical evidence that the difference in performance between the inductive version and the transductive version of a transfer learning method can indeed be statistically significant (i.e., that knowing at training time the only data one needs to classify indeed gives an advantage). Our clarification allows a reassessment of the field, and of the relative merits of the major, state-of-the-art algorithms for transfer learning in text classification.
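To make the distinction concrete, here is a minimal sketch (ours, not the article's code) of the transductive versus inductive settings: in the transductive case, the exact unlabeled target documents to be classified are available at training time and can inform even a simple pipeline, here only through the vocabulary and IDF statistics. The toy documents and labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

source_docs = ["great plot and acting", "dull and predictable film",
               "superb direction", "boring storyline"]
source_labels = [1, 0, 1, 0]  # labeled source domain (e.g., movie reviews)
# In the transductive setting, these are the ONLY documents we must classify,
# and they are known at training time.
target_docs = ["sturdy and well built chair", "flimsy table, predictable wobble"]

# Inductive-style: the vectorizer never sees the target documents.
vec_ind = TfidfVectorizer().fit(source_docs)
clf_ind = LogisticRegression().fit(vec_ind.transform(source_docs), source_labels)
pred_ind = clf_ind.predict(vec_ind.transform(target_docs))

# Transductive: vocabulary and IDF statistics are fit on the source PLUS the
# known unlabeled target set; the labels still come from the source only.
vec_tra = TfidfVectorizer().fit(source_docs + target_docs)
clf_tra = LogisticRegression().fit(vec_tra.transform(source_docs), source_labels)
pred_tra = clf_tra.predict(vec_tra.transform(target_docs))
print(pred_ind, pred_tra)
```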
{"title":"Lost in Transduction: Transductive Transfer Learning in Text Classification","authors":"A. Moreo, Andrea Esuli, F. Sebastiani","doi":"10.1145/3453146","DOIUrl":"https://doi.org/10.1145/3453146","url":null,"abstract":"Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning the lack of labelled information from the target domain is compensated by the availability at training time of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time. Although this definition is indeed in line with Vapnik’s original definition of “transduction”, current terminology in the field is confused. In this article, we discuss how the term “transduction” has been misused in the transfer learning literature, and propose a clarification consistent with the original characterization of this term given by Vapnik. We go on to observe that the above terminology misuse has brought about misleading experimental comparisons, with inductive transfer learning methods that have been incorrectly compared with transductive transfer learning methods. We then, give empirical evidence that the difference in performance between the inductive version and the transductive version of a transfer learning method can indeed be statistically significant (i.e., that knowing at training time the only data one needs to classify indeed gives an advantage). Our clarification allows a reassessment of the field, and of the relative merits of the major, state-of-the-art algorithms for transfer learning in text classification.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123999470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Gao, Yong Li, Fuli Feng, Xiangning Chen, Kai Zhao, Xiangnan He, Depeng Jin
Web systems that provide the same functionality usually share a certain number of items. This makes it possible to combine data from different websites to improve recommendation quality, a problem known as the cross-domain recommendation task. Despite many research efforts on this task, the main drawback is that existing methods largely assume the data of different systems can be fully shared. Such an assumption is unrealistic: different systems are typically operated by different companies, and directly sharing user behavior data, which is highly sensitive, may violate business privacy policies. In this work, we consider a more practical scenario for performing cross-domain recommendation. To avoid leaking user privacy during the data-sharing process, we consider sharing only information from the item side, rather than user behavior data. Specifically, we transfer the item embeddings across domains, making it easier for two companies to reach a consensus (e.g., legal policy) on data sharing, since the data to be shared is user-irrelevant and has no explicit semantics. To distill useful signals from the transferred item embeddings, we rely on the strong representation power of neural networks and develop a new method named NATR (short for Neural Attentive Transfer Recommendation). We perform extensive experiments on two real-world datasets, demonstrating that NATR achieves similar or even better performance than traditional cross-domain recommendation methods that directly share user-relevant data. Further insights are provided on the efficacy of NATR in using the transferred item embeddings to alleviate the data-sparsity issue.
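As a hedged illustration of the general idea (not the released NATR implementation; all names and dimensions below are invented), the following sketch shows a target-domain scorer that blends its own item embeddings with frozen embeddings transferred from a source domain, via a learned attention gate. Only the user-irrelevant item vectors cross the domain boundary.

```python
import torch
import torch.nn as nn

class AttentiveTransferScorer(nn.Module):
    def __init__(self, n_users, n_items, dim, transferred):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)   # target-domain item embeddings
        # Frozen item embeddings received from the source domain (no user data).
        self.transferred = nn.Parameter(transferred, requires_grad=False)
        self.att = nn.Linear(2 * dim, 1)             # gates how much to trust the transfer

    def forward(self, users, items):
        u = self.user_emb(users)                     # (B, dim)
        own = self.item_emb(items)                   # (B, dim)
        src = self.transferred[items]                # (B, dim) transferred vectors
        # Per-item attention gate blending the source and target views of the item.
        a = torch.sigmoid(self.att(torch.cat([own, src], dim=-1)))
        item_repr = a * src + (1 - a) * own
        return (u * item_repr).sum(-1)               # dot-product preference score

scorer = AttentiveTransferScorer(100, 50, 16, torch.randn(50, 16))
print(scorer(torch.tensor([0, 1]), torch.tensor([3, 7])))  # two toy (user, item) scores
```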
{"title":"Cross-domain Recommendation with Bridge-Item Embeddings","authors":"Chen Gao, Yong Li, Fuli Feng, Xiangning Chen, Kai Zhao, Xiangnan He, Depeng Jin","doi":"10.1145/3447683","DOIUrl":"https://doi.org/10.1145/3447683","url":null,"abstract":"Web systems that provide the same functionality usually share a certain amount of items. This makes it possible to combine data from different websites to improve recommendation quality, known as the cross-domain recommendation task. Despite many research efforts on this task, the main drawback is that they largely assume the data of different systems can be fully shared. Such an assumption is unrealistic different systems are typically operated by different companies, and it may violate business privacy policy to directly share user behavior data since it is highly sensitive. In this work, we consider a more practical scenario to perform cross-domain recommendation. To avoid the leak of user privacy during the data sharing process, we consider sharing only the information of the item side, rather than user behavior data. Specifically, we transfer the item embeddings across domains, making it easier for two companies to reach a consensus (e.g., legal policy) on data sharing since the data to be shared is user-irrelevant and has no explicit semantics. To distill useful signals from transferred item embeddings, we rely on the strong representation power of neural networks and develop a new method named as NATR (short for Neural Attentive Transfer Recommendation). We perform extensive experiments on two real-world datasets, demonstrating that NATR achieves similar or even better performance than traditional cross-domain recommendation methods that directly share user-relevant data. Further insights are provided on the efficacy of NATR in using the transferred item embeddings to alleviate the data sparsity issue.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121432239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Luyue Lin, Xin Zheng, Bo Liu, Wei Chen, Yanshan Xiao
Over the past few years, great progress has been made in image categorization based on convolutional neural networks (CNNs). These CNNs are typically trained on large-scale image datasets; in real-world applications, however, people may only have limited training samples available. To solve this problem, one intuitive approach is to augment the training samples. In this article, we propose an algorithm called Lavagan (Latent Variables Augmentation Method based on Generative Adversarial Nets) to improve the performance of CNNs with insufficient training samples. The proposed Lavagan method is mainly composed of two tasks. In the first task, we augment a number of latent variables (LVs) drawn from a set of adaptive and constrained LV distributions. In the second task, we feed the augmented LVs into the training procedure of the image classifier. Taking these two tasks into account, we propose a unified objective function that incorporates both tasks into the learning. We then put forward an alternating two-player minimization game to minimize this unified loss function, from which we obtain the predictive classifier. Moreover, based on Hoeffding’s inequality and the Chernoff bounding method, we analyze the feasibility and efficiency of the proposed Lavagan method, showing that the LV augmentation method is able to improve the performance of Lavagan with insufficient training samples. Finally, experiments show that the proposed Lavagan method delivers more accurate performance than existing state-of-the-art methods.
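A loose sketch of the two-task structure follows, under strong simplifying assumptions: the adaptive, constrained LV distributions are reduced to class-conditional Gaussians, and the unified objective to a plain cross-entropy term, so this illustrates LV augmentation in general rather than the Lavagan algorithm itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, n_classes = 8, 3
real_lvs = torch.randn(60, dim)                  # LVs of the scarce real samples
real_y = torch.randint(0, n_classes, (60,))

# Task 1 (simplified): fit a per-class Gaussian to the real LVs and draw
# augmented LVs from it; the constraint keeps samples near their class.
aug_lvs, aug_y = [], []
for c in range(n_classes):
    lv_c = real_lvs[real_y == c]
    mu, sd = lv_c.mean(0), lv_c.std(0).clamp(min=1e-3)
    z = mu + sd * torch.randn(100, dim)
    aug_lvs.append(z); aug_y.append(torch.full((100,), c))
aug_lvs, aug_y = torch.cat(aug_lvs), torch.cat(aug_y)

# Task 2: train the classifier on real + augmented LVs jointly.
clf = nn.Linear(dim, n_classes)
opt = torch.optim.Adam(clf.parameters(), lr=1e-2)
x, y = torch.cat([real_lvs, aug_lvs]), torch.cat([real_y, aug_y])
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(clf(x), y)
    loss.backward(); opt.step()
print(loss.item())
```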
{"title":"A Latent Variable Augmentation Method for Image Categorization with Insufficient Training Samples","authors":"Luyue Lin, Xin Zheng, Bo Liu, Wei Chen, Yanshan Xiao","doi":"10.1145/3451165","DOIUrl":"https://doi.org/10.1145/3451165","url":null,"abstract":"Over the past few years, we have made great progress in image categorization based on convolutional neural networks (CNNs). These CNNs are always trained based on a large-scale image data set; however, people may only have limited training samples for training CNN in the real-world applications. To solve this problem, one intuition is augmenting training samples. In this article, we propose an algorithm called Lavagan (Latent Variables Augmentation Method based on Generative Adversarial Nets) to improve the performance of CNN with insufficient training samples. The proposed Lavagan method is mainly composed of two tasks. The first task is that we augment a number latent variables (LVs) from a set of adaptive and constrained LVs distributions. In the second task, we take the augmented LVs into the training procedure of the image classifier. By taking these two tasks into account, we propose a uniform objective function to incorporate the two tasks into the learning. We then put forward an alternative two-play minimization game to minimize this uniform loss function such that we can obtain the predictive classifier. Moreover, based on Hoeffding’s Inequality and Chernoff Bounding method, we analyze the feasibility and efficiency of the proposed Lavagan method, which manifests that the LV augmentation method is able to improve the performance of Lavagan with insufficient training samples. Finally, the experiment has shown that the proposed Lavagan method is able to deliver more accurate performance than the existing state-of-the-art methods.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133767866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vision-and-language (V-L) tasks require the system to understand both visual content and natural language; thus, learning fine-grained joint representations of vision and language (a.k.a. V-L representations) is of paramount importance. Recently, various pre-trained V-L models have been proposed to learn V-L representations and have achieved improved results on many tasks. However, the mainstream models process both vision and language inputs with the same set of attention matrices. As a result, the generated V-L representations are entangled in one common latent space. To tackle this problem, we propose DiMBERT (short for Disentangled Multimodal-Attention BERT), a novel framework that applies separate attention spaces for vision and language, so that the representations of the two modalities can be disentangled explicitly. To enhance the correlation between vision and language in the disentangled spaces, we introduce visual concepts to DiMBERT, which represent visual information in textual form. In this manner, visual concepts help to bridge the gap between the two modalities. We pre-train DiMBERT on a large number of image–sentence pairs with two tasks: bidirectional language modeling and sequence-to-sequence language modeling. After pre-training, DiMBERT is further fine-tuned for the downstream tasks. Experiments show that DiMBERT sets new state-of-the-art performance on three tasks (over four datasets), including both generation tasks (image captioning and visual storytelling) and classification tasks (referring expressions). The proposed DiM (short for Disentangled Multimodal-Attention) module can be easily incorporated into existing pre-trained V-L models to boost their performance, with up to a 5% increase on the representative task. Finally, we conduct a systematic analysis and demonstrate the effectiveness of our DiM module and the introduced visual concepts.
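The following is a rough, hypothetical sketch of what "separated attention spaces" can look like (the real DiMBERT builds on BERT and is considerably richer): language and vision tokens receive their own query/key/value projections, so cross-modal attention no longer forces both modalities through one shared set of attention matrices.

```python
import torch
import torch.nn as nn

class DisentangledAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.qkv_lang = nn.Linear(dim, 3 * dim)  # attention space for language tokens
        self.qkv_vis = nn.Linear(dim, 3 * dim)   # separate space for vision tokens
        self.out = nn.Linear(dim, dim)

    def forward(self, lang, vis):                # lang: (B, Tl, D), vis: (B, Tv, D)
        ql, kl, vl = self.qkv_lang(lang).chunk(3, dim=-1)
        qv, kv, vv = self.qkv_vis(vis).chunk(3, dim=-1)
        # Tokens still attend across modalities, but each modality is projected
        # through its own parameters before the shared softmax attention.
        q = torch.cat([ql, qv], dim=1)
        k = torch.cat([kl, kv], dim=1)
        v = torch.cat([vl, vv], dim=1)
        att = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.out(att @ v)                 # (B, Tl+Tv, D)

layer = DisentangledAttention(32)
print(layer(torch.randn(2, 5, 32), torch.randn(2, 7, 32)).shape)  # torch.Size([2, 12, 32])
```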
{"title":"DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention","authors":"Fenglin Liu, Xuancheng Ren","doi":"10.1145/3447685","DOIUrl":"https://doi.org/10.1145/3447685","url":null,"abstract":"Vision-and-language (V-L) tasks require the system to understand both vision content and natural language, thus learning fine-grained joint representations of vision and language (a.k.a. V-L representations) is of paramount importance. Recently, various pre-trained V-L models are proposed to learn V-L representations and achieve improved results in many tasks. However, the mainstream models process both vision and language inputs with the same set of attention matrices. As a result, the generated V-L representations are entangled in one common latent space. To tackle this problem, we propose DiMBERT (short for Disentangled Multimodal-Attention BERT), which is a novel framework that applies separated attention spaces for vision and language, and the representations of multi-modalities can thus be disentangled explicitly. To enhance the correlation between vision and language in disentangled spaces, we introduce the visual concepts to DiMBERT which represent visual information in textual format. In this manner, visual concepts help to bridge the gap between the two modalities. We pre-train DiMBERT on a large amount of image–sentence pairs on two tasks: bidirectional language modeling and sequence-to-sequence language modeling. After pre-train, DiMBERT is further fine-tuned for the downstream tasks. Experiments show that DiMBERT sets new state-of-the-art performance on three tasks (over four datasets), including both generation tasks (image captioning and visual storytelling) and classification tasks (referring expressions). The proposed DiM (short for Disentangled Multimodal-Attention) module can be easily incorporated into existing pre-trained V-L models to boost their performance, up to a 5% increase on the representative task. Finally, we conduct a systematic analysis and demonstrate the effectiveness of our DiM and the introduced visual concepts.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132378904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this article, we consider the problem of crawling a multiplex network to identify the community structure of a layer-of-interest. A multiplex network is one in which there are multiple types of relationships between the nodes. In many multiplex networks, some layers might be easier to explore (in terms of time, money, etc.) than others. We propose MCS+, an algorithm that can use information from the easier-to-explore layers to help in the exploration of a layer-of-interest that is expensive to explore. We consider the goal of exploration to be generating a sample that is representative of the communities in the complete layer-of-interest. This work has practical applications in areas such as the exploration of dark (e.g., criminal) networks, online social networks, biological networks, and so on. For example, in a terrorist network, relationships such as phone records, e-mail records, and so on are easier to collect; in contrast, data on face-to-face communications are much harder to collect, but also potentially more valuable. We perform extensive experimental evaluations on real-world networks, and we observe that MCS+ consistently outperforms the best baseline: in some networks, the similarity of the sample that MCS+ generates to the real network is up to three times that of the best baseline. We also perform theoretical and experimental evaluations of the scalability of MCS+ with respect to network properties, and find that it scales well with the budget, the number of layers in the multiplex network, and the average degree in the original network.
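The guiding idea, that structure observed in a cheap layer can steer where a limited query budget is spent in the expensive layer-of-interest, can be sketched as below. This toy crawler (not the MCS+ algorithm, which targets community structure specifically) simply prioritizes nodes by their degree in the cheap layer.

```python
import heapq

def guided_crawl(cheap_layer, query_expensive, seed, budget):
    """cheap_layer: dict node -> neighbors (fully known, cheap to explore).
    query_expensive: callable node -> neighbors in the layer-of-interest (costly)."""
    sampled_edges, visited = [], set()
    # Max-heap via negation: favor nodes that look central in the cheap layer.
    frontier = [(-len(cheap_layer.get(seed, ())), seed)]
    while frontier and budget > 0:
        _, node = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        budget -= 1                               # one expensive query spent
        for nbr in query_expensive(node):
            sampled_edges.append((node, nbr))
            if nbr not in visited:
                heapq.heappush(frontier, (-len(cheap_layer.get(nbr, ())), nbr))
    return sampled_edges

cheap = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
expensive = {0: [1], 1: [0, 2], 2: [1], 3: []}
print(guided_crawl(cheap, lambda n: expensive[n], seed=0, budget=3))
```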
{"title":"MCS+: An Efficient Algorithm for Crawling the Community Structure in Multiplex Networks","authors":"Ricky Laishram, Jeremy D. Wendt, S. Soundarajan","doi":"10.1145/3451527","DOIUrl":"https://doi.org/10.1145/3451527","url":null,"abstract":"In this article, we consider the problem of crawling a multiplex network to identify the community structure of a layer-of-interest. A multiplex network is one where there are multiple types of relationships between the nodes. In many multiplex networks, some layers might be easier to explore (in terms of time, money etc.). We propose MCS+, an algorithm that can use the information from the easier to explore layers to help in the exploration of a layer-of-interest that is expensive to explore. We consider the goal of exploration to be generating a sample that is representative of the communities in the complete layer-of-interest. This work has practical applications in areas such as exploration of dark (e.g., criminal) networks, online social networks, biological networks, and so on. For example, in a terrorist network, relationships such as phone records, e-mail records, and so on are easier to collect; in contrast, data on the face-to-face communications are much harder to collect, but also potentially more valuable. We perform extensive experimental evaluations on real-world networks, and we observe that MCS+ consistently outperforms the best baseline—the similarity of the sample that MCS+ generates to the real network is up to three times that of the best baseline in some networks. We also perform theoretical and experimental evaluations on the scalability of MCS+ to network properties, and find that it scales well with the budget, number of layers in the multiplex network, and the average degree in the original network.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124079447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Latent Dirichlet Allocation (LDA) has been widely used for topic modeling, with applications spanning various areas such as natural language processing and information retrieval. While LDA on small and static datasets has been extensively studied, practical scenarios pose several real-world challenges, as datasets are often huge and are gathered in a streaming fashion. As the state-of-the-art LDA algorithm on streams, Streaming Variational Bayes (SVB) introduced Bayesian updating to provide a streaming procedure. However, the utility of SVB is limited in applications, since it ignores three challenges of processing real-world streams: topic evolution, data turbulence, and real-time inference. In this article, we propose a novel distributed LDA algorithm, referred to as StreamFed-LDA, to deal with these challenges on streams. For topic modeling of streaming data, the ability to capture evolving topics is essential for practical online inference. To achieve this goal, StreamFed-LDA is based on a specialized framework that supports lifelong (continual) learning of evolving topics. On the other hand, data turbulence is commonly present in streams due to real-life events. In that case, the design of StreamFed-LDA allows the model to learn new characteristics from the most recent data while maintaining the historical information. On massive streaming data, it is difficult and crucial to provide real-time inference results. To increase the throughput and reduce the latency, StreamFed-LDA introduces additional techniques that substantially reduce both computation and communication costs in distributed systems. Experiments on four real-world datasets show that the proposed framework achieves significantly better online inference performance than the baselines. At the same time, StreamFed-LDA also reduces the latency by orders of magnitude on real-world datasets.
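For readers unfamiliar with the streaming pattern SVB introduced, the sketch below shows plain online variational LDA in scikit-learn, where each minibatch update effectively turns the current posterior into the prior for the next minibatch. It is a baseline illustration only, not StreamFed-LDA, and fitting the vocabulary up front is a simplification a real stream would not allow.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

stream = [
    ["cpu gpu cache memory", "gpu kernel thread memory"],     # minibatch 1
    ["election vote senate policy", "policy vote campaign"],  # minibatch 2: topics drift
]
vec = CountVectorizer()
vec.fit([doc for batch in stream for doc in batch])           # demo-only fixed vocabulary
lda = LatentDirichletAllocation(n_components=2, learning_method="online", random_state=0)
for batch in stream:
    lda.partial_fit(vec.transform(batch))                     # posterior becomes next prior
    print(lda.transform(vec.transform(batch)).round(2))       # per-doc topic mixtures so far
```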
{"title":"Distributed Latent Dirichlet Allocation on Streams","authors":"Yunyan Guo, Jianzhong Li","doi":"10.1145/3451528","DOIUrl":"https://doi.org/10.1145/3451528","url":null,"abstract":"Latent Dirichlet Allocation (LDA) has been widely used for topic modeling, with applications spanning various areas such as natural language processing and information retrieval. While LDA on small and static datasets has been extensively studied, several real-world challenges are posed in practical scenarios where datasets are often huge and are gathered in a streaming fashion. As the state-of-the-art LDA algorithm on streams, Streaming Variational Bayes (SVB) introduced Bayesian updating to provide a streaming procedure. However, the utility of SVB is limited in applications since it ignored three challenges of processing real-world streams: topic evolution, data turbulence, and real-time inference. In this article, we propose a novel distributed LDA algorithm—referred to as StreamFed-LDA—to deal with challenges on streams. For topic modeling of streaming data, the ability to capture evolving topics is essential for practical online inference. To achieve this goal, StreamFed-LDA is based on a specialized framework that supports lifelong (continual) learning of evolving topics. On the other hand, data turbulence is commonly present in streams due to real-life events. In that case, the design of StreamFed-LDA allows the model to learn new characteristics from the most recent data while maintaining the historical information. On massive streaming data, it is difficult and crucial to provide real-time inference results. To increase the throughput and reduce the latency, StreamFed-LDA introduces additional techniques that substantially reduce both computation and communication costs in distributed systems. Experiments on four real-world datasets show that the proposed framework achieves significantly better performance of online inference compared with the baselines. At the same time, StreamFed-LDA also reduces the latency by orders of magnitudes in real-world datasets.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"06 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128926622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Time-series forecasting is an important problem across a wide range of domains. Designing accurate and prompt forecasting algorithms is a non-trivial task, as temporal data that arise in real applications often involve both non-linear dynamics and linear dependencies, and usually contain mixtures of sequential and periodic patterns, such as daily and weekly repetitions. However, most recent deep models use Recurrent Neural Networks (RNNs) to capture these temporal patterns; RNNs are hard to parallelize and not fast enough for real-world applications, especially when a huge volume of user requests arrives. Recently, CNNs have demonstrated significant advantages over the de facto RNNs for sequence modeling tasks, while providing high computational efficiency due to their inherent parallelism. In this work, we propose HyDCNN, a novel hybrid framework based on fully dilated CNNs for time-series forecasting tasks. The core component of HyDCNN is a proposed hybrid module, in which our position-aware dilated CNNs are utilized to capture the sequential non-linear dynamics and an autoregressive model is leveraged to capture the sequential linear dependencies. To further capture the periodic temporal patterns, a novel hop scheme is introduced in the hybrid module. HyDCNN is then composed of multiple hybrid modules to capture the sequential and periodic patterns, each targeting either the sequential pattern or one kind of periodic pattern. Extensive experiments on five real-world datasets have shown that the proposed HyDCNN outperforms state-of-the-art baselines and is at least 200% better than RNN baselines. The datasets and source code will be published on GitHub to facilitate future work.
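A compact sketch of the hybrid CNN-plus-autoregressive idea is given below (illustrative only, not the HyDCNN release; the position-aware convolutions and hop scheme are omitted): stacked dilated causal convolutions model the non-linear dynamics in parallel, and a linear autoregressive head is added back for the linear dependencies.

```python
import torch
import torch.nn as nn

class DilatedCausalForecaster(nn.Module):
    def __init__(self, window=32, channels=16, kernel=3, dilations=(1, 2, 4)):
        super().__init__()
        layers, in_ch = [], 1
        for d in dilations:
            # Left-pad so the convolution stays causal (no peeking at the future).
            layers += [nn.ConstantPad1d(((kernel - 1) * d, 0), 0.0),
                       nn.Conv1d(in_ch, channels, kernel, dilation=d), nn.ReLU()]
            in_ch = channels
        self.cnn = nn.Sequential(*layers)
        self.cnn_head = nn.Linear(channels, 1)
        self.ar_head = nn.Linear(window, 1)       # linear autoregressive component

    def forward(self, x):                         # x: (B, window) past values
        h = self.cnn(x.unsqueeze(1))              # (B, channels, window)
        nonlinear = self.cnn_head(h[:, :, -1])    # features at the last causal step
        linear = self.ar_head(x)
        return (nonlinear + linear).squeeze(-1)   # one-step-ahead forecast

model = DilatedCausalForecaster()
print(model(torch.randn(4, 32)).shape)            # torch.Size([4])
```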
{"title":"Modeling Temporal Patterns with Dilated Convolutions for Time-Series Forecasting","authors":"Yangfan Li, Kenli Li, Cen Chen, Xu Zhou, Zeng Zeng, Kuan-Ching Li","doi":"10.1145/3453724","DOIUrl":"https://doi.org/10.1145/3453724","url":null,"abstract":"Time-series forecasting is an important problem across a wide range of domains. Designing accurate and prompt forecasting algorithms is a non-trivial task, as temporal data that arise in real applications often involve both non-linear dynamics and linear dependencies, and always have some mixtures of sequential and periodic patterns, such as daily, weekly repetitions, and so on. At this point, however, most recent deep models often use Recurrent Neural Networks (RNNs) to capture these temporal patterns, which is hard to parallelize and not fast enough for real-world applications especially when a huge amount of user requests are coming. Recently, CNNs have demonstrated significant advantages for sequence modeling tasks over the de-facto RNNs, while providing high computational efficiency due to the inherent parallelism. In this work, we propose HyDCNN, a novel hybrid framework based on fully Dilated CNN for time-series forecasting tasks. The core component in HyDCNN is a proposed hybrid module, in which our proposed position-aware dilated CNNs are utilized to capture the sequential non-linear dynamics and an autoregressive model is leveraged to capture the sequential linear dependencies. To further capture the periodic temporal patterns, a novel hop scheme is introduced in the hybrid module. HyDCNN is then composed of multiple hybrid modules to capture the sequential and periodic patterns. Each of these hybrid modules targets on either the sequential pattern or one kind of periodic patterns. Extensive experiments on five real-world datasets have shown that the proposed HyDCNN is better compared with state-of-the-art baselines and is at least 200% better than RNN baselines. The datasets and source code will be published in Github to facilitate more future work.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124996552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stock market investors aim to maximize their investment returns. The stock recommendation task is to recommend stocks with higher return ratios to investors. Most stock prediction methods study historical sequence patterns to predict the stock trend or price in the near future. In fact, the future price of a stock is correlated not only with its own historical prices, but also with those of other stocks. In this article, we take the relationships between stocks (corporations) into account by means of a stock relation graph. Furthermore, we propose a Time-aware Relational Attention Network (TRAN) for graph-based stock recommendation according to return-ratio ranking. In TRAN, the time-aware relational attention mechanism is designed to capture time-varying correlation strengths between stocks through the interaction of historical sequences and stock description documents. With these dynamic strengths, the nodes of the stock relation graph aggregate the features of neighboring stock nodes by a graph convolution operation. For a given group of stocks, the proposed TRAN model can output a ranking of the stocks according to their return ratios. Experimental results on several real-world datasets demonstrate the effectiveness of our TRAN for stock recommendation.
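Schematically, the aggregation step can be sketched as follows (a simplified stand-in: TRAN's attention is additionally time-aware and conditioned on description documents, which we omit): per-edge strengths are computed from pairs of sequence encodings and used to aggregate neighbor features over the relation graph.

```python
import torch
import torch.nn as nn

class RelationalAttentionLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)         # learned edge-strength function

    def forward(self, h, adj):                     # h: (N, dim) encodings; adj: (N, N) 0/1
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = self.score(pairs).squeeze(-1)          # (N, N) raw correlation strengths
        e = e.masked_fill(adj == 0, float("-inf")) # only related stocks interact
        a = torch.softmax(e, dim=-1)
        a = torch.nan_to_num(a)                    # isolated nodes yield all -inf rows
        return h + a @ h                           # aggregate neighbors, keep self signal

layer = RelationalAttentionLayer(8)
h = torch.randn(5, 8)                              # encodings of 5 stocks
adj = (torch.rand(5, 5) > 0.5).float()
print(layer(h, adj).shape)                         # torch.Size([5, 8])
```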
{"title":"Graph-Based Stock Recommendation by Time-Aware Relational Attention Network","authors":"Jianliang Gao, Xiaoting Ying, Cong Xu, Jianxin Wang, Shichao Zhang, Zhao Li","doi":"10.1145/3451397","DOIUrl":"https://doi.org/10.1145/3451397","url":null,"abstract":"The stock market investors aim at maximizing their investment returns. Stock recommendation task is to recommend stocks with higher return ratios for the investors. Most stock prediction methods study the historical sequence patterns to predict stock trend or price in the near future. In fact, the future price of a stock is correlated not only with its historical price, but also with other stocks. In this article, we take into account the relationships between stocks (corporations) by stock relation graph. Furthermore, we propose a Time-aware Relational Attention Network (TRAN) for graph-based stock recommendation according to return ratio ranking. In TRAN, the time-aware relational attention mechanism is designed to capture time-varying correlation strengths between stocks by the interaction of historical sequences and stock description documents. With the dynamic strengths, the nodes of the stock relation graph aggregate the features of neighbor stock nodes by graph convolution operation. For a given group of stocks, the proposed TRAN model can output the ranking results of stocks according to their return ratios. The experimental results on several real-world datasets demonstrate the effectiveness of our TRAN for stock recommendation.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125607440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The main content of a webpage is often surrounded by boilerplate elements related to the template, such as menus, advertisements, copyright notices, and comments. For crawlers and indexers, isolating the main content from the template and other noisy information is an essential task, because processing and storing noisy information wastes resources such as bandwidth, storage space, and computing time. Besides, the detection and extraction of the main content is useful in different areas, such as data mining, web summarization, and content adaptation to low resolutions. This work introduces a new technique for main content extraction. In contrast to most techniques, it extracts not only text, but also other types of content, such as images and animations. It is a Document Object Model (DOM)-based, page-level technique; thus, it only needs to load a single webpage to extract the main content. As a consequence, it is efficient enough to be used online (in real time). We have empirically evaluated the technique using a suite of real, heterogeneous benchmarks, producing very good results compared with other well-known content extraction techniques.
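In the same spirit, though not the authors' actual technique, the following hedged sketch shows a DOM-based, page-level heuristic: load a single page, score every DOM node by a text-density measure that penalizes link-heavy regions such as menus, and keep the best-scoring subtree. The scoring formula here is a generic assumption, not taken from the article.

```python
from bs4 import BeautifulSoup

def densest_node(html):
    soup = BeautifulSoup(html, "html.parser")
    best, best_score = None, float("-inf")
    for node in soup.find_all(True):                  # every element node
        if node.name in ("script", "style", "nav", "footer", "header"):
            continue
        text = node.get_text(" ", strip=True)
        links = " ".join(a.get_text(" ", strip=True) for a in node.find_all("a"))
        # Favor lots of non-anchor text, normalized by subtree size, so large
        # containers (body, html) do not win just by containing everything.
        score = (len(text) - 2 * len(links)) / (1 + len(node.find_all(True)))
        if score > best_score:
            best, best_score = node, score
    return best

html = """<html><body><nav><a href='/'>Home</a><a href='/about'>About</a></nav>
<div id='main'><p>Long article body with the actual content of the page,
images and animations would live here too.</p></div>
<footer>Copyright notice</footer></body></html>"""
print(densest_node(html).name)                        # p (the article body)
```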
{"title":"Page-Level Main Content Extraction From Heterogeneous Webpages","authors":"Julián Alarte, Josep Silva","doi":"10.1145/3451168","DOIUrl":"https://doi.org/10.1145/3451168","url":null,"abstract":"The main content of a webpage is often surrounded by other boilerplate elements related to the template, such as menus, advertisements, copyright notices, and comments. For crawlers and indexers, isolating the main content from the template and other noisy information is an essential task, because processing and storing noisy information produce a waste of resources such as bandwidth, storage space, and computing time. Besides, the detection and extraction of the main content is useful in different areas, such as data mining, web summarization, and content adaptation to low resolutions. This work introduces a new technique for main content extraction. In contrast to most techniques, this technique not only extracts text, but also other types of content, such as images, and animations. It is a Document Object Model-based page-level technique, thus it only needs to load one single webpage to extract the main content. As a consequence, it is efficient enough as to be used online (in real-time). We have empirically evaluated the technique using a suite of real heterogeneous benchmarks producing very good results compared with other well-known content extraction techniques.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114326070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huandong Wang, Yong Li, Mu Du, Zhenhui Li, Depeng Jin
Both app developers and service providers have strong motivations to understand when and where certain apps are used by users. However, this has been a challenging problem due to the highly skewed and noisy nature of app usage data. Moreover, existing studies regard apps as independent items, and thus fail to capture the hidden semantics in app usage traces. In this article, we propose App2Vec, a powerful representation learning model that learns semantic embeddings of apps while taking the spatio-temporal context into consideration. Based on the obtained semantic embeddings, we develop a probabilistic model, based on the Bayesian mixture model and the Dirichlet process, to capture when, where, and with which semantics apps are used, in order to predict future usage. We evaluate our model using two different app usage datasets, which involve over 1.7 million users and 2,000+ apps. Evaluation results show that our proposed App2Vec algorithm outperforms the state-of-the-art algorithms in app usage prediction, with a performance gap of over 17.0%.
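The representation-learning half can be approximated with off-the-shelf tools; the sketch below is an assumption-laden stand-in (plain skip-gram over usage traces, with no spatio-temporal context and no Bayesian mixture on top) that learns app embeddings such that apps co-used in the same traces land nearby.

```python
from gensim.models import Word2Vec

# Each trace is one user's apps in chronological order; co-used apps
# end up close in the embedding space.
traces = [
    ["maps", "taxi", "payments", "maps"],
    ["mail", "calendar", "mail", "docs"],
    ["maps", "payments", "taxi"],
    ["calendar", "docs", "mail"],
]
model = Word2Vec(sentences=traces, vector_size=16, window=2, min_count=1,
                 sg=1, epochs=200, seed=1)          # sg=1 selects skip-gram
print(model.wv.most_similar("taxi", topn=2))        # e.g., maps/payments nearby
```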
{"title":"App2Vec: Context-Aware Application Usage Prediction","authors":"Huandong Wang, Yong Li, Mu Du, Zhenhui Li, Depeng Jin","doi":"10.1145/3451396","DOIUrl":"https://doi.org/10.1145/3451396","url":null,"abstract":"Both app developers and service providers have strong motivations to understand when and where certain apps are used by users. However, it has been a challenging problem due to the highly skewed and noisy app usage data. Moreover, apps are regarded as independent items in existing studies, which fail to capture the hidden semantics in app usage traces. In this article, we propose App2Vec, a powerful representation learning model to learn the semantic embedding of apps with the consideration of spatio-temporal context. Based on the obtained semantic embeddings, we develop a probabilistic model based on the Bayesian mixture model and Dirichlet process to capture when, where, and what semantics of apps are used to predict the future usage. We evaluate our model using two different app usage datasets, which involve over 1.7 million users and 2,000+ apps. Evaluation results show that our proposed App2Vec algorithm outperforms the state-of-the-art algorithms in app usage prediction with a performance gap of over 17.0%.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121402201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}