
Proceedings of the 13th International Conference on Web Search and Data Mining: Latest Papers

Overlapping Community Detection in Static and Dynamic Networks
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3372185
Renny Márquez
Studying the behavior of systems through networks is important because it allows us to understand them and to make decisions based on that knowledge. Community detection is one of the tools used for this purpose: it detects groups in graphs. Detection can consider not only the connections between nodes but also their attributes. Objects can also belong to different groups to varying degrees, so overlapping fuzzy assignment is relevant in this context. Furthermore, most networks change over time, so including this aspect further enhances the benefits of community detection. Hence, in this doctoral thesis we propose to design models for overlapping community detection in static and dynamic networks with node attributes. First, we design an approach based on a nonnegative matrix factorization generative model that automatically detects the number of communities in the network. Second, we use tensor factorization to overcome some of the challenges faced by the first model.
Citations: 5
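The abstract above builds on nonnegative matrix factorization (NMF) for overlapping, fuzzy community assignment. As a rough illustration only, the sketch below factorizes a small adjacency matrix with the classic multiplicative updates and reads each node's normalized row of W as its degree of membership in each community. It is not the thesis's generative model: the number of communities `k` is fixed by hand, node attributes are ignored, and the toy graph `A` (two triangles sharing node 2, with self-loops) and all function names are assumptions made for this example.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(a, w, h):
    """Squared reconstruction error ||A - WH||_F^2."""
    wh = matmul(w, h)
    return sum((a[i][j] - wh[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))

def nmf(a, k, iters=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: A (n x m) ~ W (n x k) times H (k x m), all entries >= 0."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

def memberships(w, eps=1e-9):
    """Normalize each row of W so it reads as fuzzy community membership."""
    return [[x / (sum(row) + eps) for x in row] for row in w]

# Hypothetical toy graph: triangles {0, 1, 2} and {2, 3, 4} share node 2.
A = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
W, H = nmf(A, k=2)
membership = memberships(W)
for node, row in enumerate(membership):
    print(node, [round(x, 2) for x in row])
```

Because the result depends on the random initialization, the printed memberships should be read as one possible local solution rather than a canonical answer.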
ADMM SLIM: Sparse Recommendations for Many Users
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371774
H. Steck, Maria Dimakopoulou, Nickolai Riabov, T. Jebara
The Sparse Linear Method (SLIM) is a well-established approach for top-N recommendations. This article proposes several improvements that are enabled by the Alternating Directions Method of Multipliers (ADMM), a well-known optimization method with many application areas. First, we show that optimizing the original SLIM-objective by ADMM results in an approach where the training time is independent of the number of users in the training data, and hence trivially scales to large numbers of users. Second, the flexibility of ADMM allows us to switch on and off the various constraints and regularization terms in the original SLIM-objective, in order to empirically assess their contributions to ranking accuracy on given data. Third, we also propose two extensions to the original SLIM training-objective in order to improve recommendation accuracy further without increasing the computational cost. In our experiments on three well-known data-sets, we first compare to the original SLIM-implementation and find that ADMM not only reduces training time considerably but also achieves an improvement in recommendation accuracy due to better optimization. We then compare to various state-of-the-art approaches and observe up to 25% improvement in recommendation accuracy in our experiments. Finally, we evaluate the importance of sparsity and the non-negativity constraint in the original SLIM-objective with sub-sampling experiments that simulate scenarios of cold-starting and large catalog sizes compared to a relatively small user base, which often occur in practice.
Citations: 25
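To make the ADMM SLIM abstract above more concrete, here is a minimal sketch of how ADMM can be applied to a SLIM-style objective, assuming the form 1/2 ||X - XB||^2 + (l2/2) ||B||^2 + l1 ||C||_1 subject to B = C and diag(B) = 0. The updates are derived from that assumed objective, not copied from the paper, and the hyperparameter names (`l1`, `l2`, `rho`) and the toy matrix `X` are hypothetical. The point it illustrates is that the iterations only touch the item-by-item Gram matrix X^T X, so their cost does not grow with the number of users once that matrix is computed.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def soft_threshold(x, t):
    """Proximal operator of t * |x|, which produces exact zeros (sparsity)."""
    return max(x - t, 0.0) - max(-x - t, 0.0)

def admm_slim(x, l1=0.1, l2=1.0, rho=1.0, iters=50):
    n = len(x[0])                                    # number of items
    gram = matmul([list(c) for c in zip(*x)], x)     # X^T X: items x items
    p = inverse([[gram[i][j] + (l2 + rho) * (i == j) for j in range(n)]
                 for i in range(n)])
    c = [[0.0] * n for _ in range(n)]
    gamma = [[0.0] * n for _ in range(n)]            # dual variable for B = C
    for _ in range(iters):
        rhs = [[gram[i][j] - gamma[i][j] + rho * c[i][j] for j in range(n)]
               for i in range(n)]
        b_tilde = matmul(p, rhs)
        # Per-column multipliers that force the diagonal of B to zero.
        mult = [b_tilde[j][j] / p[j][j] for j in range(n)]
        b = [[b_tilde[i][j] - p[i][j] * mult[j] for j in range(n)] for i in range(n)]
        c = [[soft_threshold(b[i][j] + gamma[i][j] / rho, l1 / rho) for j in range(n)]
             for i in range(n)]
        gamma = [[gamma[i][j] + rho * (b[i][j] - c[i][j]) for j in range(n)]
                 for i in range(n)]
    return b, c

# Hypothetical interaction matrix: 6 users (rows) x 4 items (columns).
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
B, C = admm_slim(X)
scores_user0 = matmul([X[0]], C)[0]   # item scores for the first user
print([round(s, 3) for s in scores_user0])
```

Repeating the rows of `X` many times changes the Gram matrix values but not its 4 x 4 size, which is the sense in which the iteration cost is independent of the user count.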
Beyond Sessions: Exploiting Hybrid Contextual Information for Web Search
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3372179
Jia Chen
It is essential to fully understand user intents for the optimization of downstream tasks such as document ranking and query suggestion in web search. As users tend to submit ambiguous queries, numerous studies utilize contextual information such as query sequences and user clicks to assist user intent modeling. Most of these works adopt Recurrent Neural Network (RNN) based frameworks to encode sequential information within a session, which makes parallel computation hard to realize. To this end, we plan to adopt attention-based units to generate context-aware representations for elements in sessions. As intra-session contexts are insufficient for handling the data sparsity and cold-start problems in session search, we will also attempt to integrate cross-session dependencies by constructing session graphs on the whole corpus to enrich the representation of queries and documents.
Citations: 2
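The abstract above contrasts RNN encoders with attention-based units. A minimal sketch of scaled dot-product self-attention over the elements of one session is given here, assuming each query or click is already a small embedding vector. There are no learned projection matrices, and the example vectors are made up; the sketch only shows that every output is computed from all session elements independently of the others, which is what allows parallel computation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Return context-aware vectors and the attention weights that produced them."""
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:  # each position could be processed in parallel
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical session: two similar queries followed by an unrelated click.
session = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contextual, weights = self_attention(session)
for vec, w in zip(contextual, weights):
    print([round(x, 3) for x in vec], [round(x, 3) for x in w])
```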
Enhancing Re-finding Behavior with External Memories for Personalized Search
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371794
Yujia Zhou, Zhicheng Dou, Ji-rong Wen
The goal of personalized search is to tailor the document ranking list to meet each user's individual needs. Previous studies have shown that users often look for information they have searched for before. This is called re-finding behavior, which is widely explored in existing personalized search approaches. However, most existing methods for identifying re-finding behavior focus on simple lexical similarities between queries. In this paper, we propose to construct memory networks (MN) to support the identification of more complex re-finding behavior. Specifically, incorporating semantic information, we devise two external memories to expand re-finding based on the query and the document, respectively. We further design an intent memory to recognize session-based re-finding behavior. Endowed with these memory networks, we can build a fine-grained user model dynamically based on the current query and documents, and use the model to re-rank the results. Experimental results show that our model improves significantly over traditional methods.
Citations: 19
MRAEA
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371804
Xin Mao, Wenting Wang, Huimin Xu, Man Lan, Yuanbin Wu
Entity alignment, which finds equivalent entities in cross-lingual Knowledge Graphs (KGs), plays a vital role in automatically integrating multiple KGs. Existing translation-based entity alignment methods jointly model the cross-lingual knowledge and monolingual knowledge in one unified optimization problem. On the other hand, the Graph Neural Network (GNN) based methods either ignore the node differentiations or represent relations through entity or triple instances. None of them models the meta semantics embedded in relations, or complex relations such as n-to-n and multi-graphs. To tackle these challenges, we propose a novel Meta Relation Aware Entity Alignment (MRAEA) to directly model cross-lingual entity embeddings by attending over the node's incoming and outgoing neighbors and its connected relations' meta semantics. In addition, we propose a simple and effective bi-directional iterative strategy to add new aligned seeds during training. Our experiments on all three benchmark entity alignment datasets show that our approach consistently outperforms the state-of-the-art methods, exceeding them by 15%-58% on Hit@1. Through an extensive ablation study, we validate that the proposed meta relation aware representations, relation aware self-attention, and bi-directional iterative strategy of new seed selection all contribute to the significant performance improvement. The code is available at https://github.com/MaoXinn/MRAEA.
Citations: 2
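The MRAEA abstract above mentions a bi-directional strategy for adding aligned seeds and reports results on Hit@1, without spelling out either. The sketch below shows one plausible reading, stated here as an assumption: a source and a target entity become a new seed pair only when each is the other's nearest neighbor in embedding space. Hit@1 is computed as the fraction of gold pairs whose nearest target is the correct one. The embeddings, the distance (squared Euclidean), and the function names are all hypothetical.

```python
def nearest(src, tgt):
    """For each source vector, the index of the closest target vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(tgt)), key=lambda j: dist(s, tgt[j])) for s in src]

def mutual_pairs(src, tgt):
    """Candidate seed pairs (i, j) where i and j pick each other in both directions."""
    forward = nearest(src, tgt)
    backward = nearest(tgt, src)
    return [(i, j) for i, j in enumerate(forward) if backward[j] == i]

def hits_at_1(src, tgt, gold):
    """Share of gold pairs (i, j) for which j is the nearest target of i."""
    forward = nearest(src, tgt)
    return sum(forward[i] == j for i, j in gold) / len(gold)

# Hypothetical 2-D entity embeddings for two knowledge graphs.
source = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [1.2, 0.0]]
target = [[0.1, 0.0], [4.9, 5.2], [0.9, 0.1], [3.0, 0.0]]
gold = [(0, 0), (1, 2), (2, 1), (3, 3)]

new_seeds = mutual_pairs(source, target)
print(new_seeds)                        # [(0, 0), (1, 2), (2, 1)]
print(hits_at_1(source, target, gold))  # 0.75
```

Source entity 3 is left out of the seeds because its nearest target (index 2) prefers source entity 1, which is the filtering effect a two-way check is meant to provide.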
Estimation-Action-Reflection: Towards Deep Interaction Between Conversational and Recommender Systems
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371769
Wenqiang Lei, Xiangnan He, Yisong Miao, Qingyun Wu, Richang Hong, Min-Yen Kan, Tat-Seng Chua
Recommender systems are embracing conversational technologies to obtain user preferences dynamically, and to overcome inherent limitations of their static models. A successful Conversational Recommender System (CRS) requires proper handling of interactions between conversation and recommendation. We argue that three fundamental problems need to be solved: 1) what questions to ask regarding item attributes, 2) when to recommend items, and 3) how to adapt to the users' online feedback. To the best of our knowledge, a unified framework that addresses these problems is lacking. In this work, we fill this gap by proposing a new CRS framework named Estimation-Action-Reflection, or EAR, which consists of three stages to better converse with users: (1) Estimation, which builds predictive models to estimate user preference on both items and item attributes; (2) Action, which learns a dialogue policy to determine whether to ask about attributes or recommend items, based on the Estimation stage and conversation history; and (3) Reflection, which updates the recommender model when a user rejects the recommendations made by the Action stage. We present two conversation scenarios on binary and enumerated questions, and conduct extensive experiments on two datasets from Yelp and LastFM, for each scenario, respectively. Our experiments demonstrate significant improvements over the state-of-the-art method CRM [32], corresponding to fewer conversation turns and a higher level of recommendation hits.
Citations: 183
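The three EAR stages described in the abstract above can be sketched as a toy control loop. This is an illustration of the stage structure only: the paper's stages use learned predictive models and a learned dialogue policy, whereas here Estimation is a rule-based filter, Action is a fixed heuristic, and Reflection simply excludes rejected items. The item catalog, attribute names, and function names are hypothetical.

```python
# Hypothetical catalog: item name -> set of attributes.
ITEMS = {
    "cafe_a": {"coffee", "wifi"},
    "cafe_b": {"coffee"},
    "bar_c": {"cocktails", "live_music"},
    "bar_d": {"cocktails"},
    "bar_e": {"cocktails"},
}

def estimate(items, liked, disliked, rejected):
    """Estimation: items still consistent with all feedback so far."""
    return [name for name, attrs in items.items()
            if liked <= attrs and not (disliked & attrs) and name not in rejected]

def choose_action(candidates, items, asked, max_list=1):
    """Action: recommend once few candidates remain, otherwise ask about
    the not-yet-asked attribute that splits the candidates most evenly."""
    attrs = {a for c in candidates for a in items[c]} - asked
    if len(candidates) <= max_list or not attrs:
        return "recommend", candidates[:max_list]
    half = len(candidates) / 2
    best = min(sorted(attrs),
               key=lambda a: abs(sum(a in items[c] for c in candidates) - half))
    return "ask", best

def converse(items, target, max_turns=10):
    """Simulate a user who wants `target`; return the turn of the successful
    recommendation, or None if the turn budget runs out."""
    liked, disliked, rejected, asked = set(), set(), set(), set()
    for turn in range(1, max_turns + 1):
        candidates = estimate(items, liked, disliked, rejected)
        kind, payload = choose_action(candidates, items, asked)
        if kind == "ask":
            asked.add(payload)
            (liked if payload in items[target] else disliked).add(payload)
        elif target in payload:
            return turn
        else:
            rejected.update(payload)  # Reflection: adapt after a rejection
    return None

print(converse(ITEMS, "cafe_a"))  # 3
print(converse(ITEMS, "bar_e"))   # 4
```

In the second run the system recommends `bar_d` first, is rejected, and succeeds one turn later, which is the situation the Reflection stage exists to handle.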
Hierarchical User Profiling for E-commerce Recommender Systems
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371827
Yulong Gu, Zhuoye Ding, Shuaiqiang Wang, Dawei Yin
Hierarchical user profiling, which aims to model users' real-time interests at different granularities, is an essential issue for personalized recommendations in E-commerce. On one hand, items (i.e., products) are usually organized hierarchically in categories, and correspondingly users' interests are naturally hierarchical across different granularities of items and categories. On the other hand, recommendations oriented to multiple granularities have become very popular on E-commerce sites, which requires hierarchical user profiling at different granularities as well. In this paper, we propose HUP, a Hierarchical User Profiling framework, to solve the hierarchical user profiling problem in E-commerce recommender systems. In HUP, we provide a Pyramid Recurrent Neural Network, equipped with Behavior-LSTM, to formulate users' hierarchical real-time interests at multiple scales. Furthermore, instead of simply utilizing users' item-level behaviors (e.g., ratings or clicks) as in conventional methods, HUP harvests the sequential information of users' temporal fine-grained interactions (micro-behaviors, e.g., clicks on components of items like pictures or comments, browses with navigation of the search engines or recommendations) for modeling. Extensive experiments on two real-world E-commerce datasets demonstrate the significant performance gains of HUP against state-of-the-art methods for the hierarchical user profiling and recommendation problems. We release the code and datasets at https://github.com/guyulongcs/WSDM2020_HUP.
Citations: 65
SPread: Automated Financial Metric Extraction and Spreading Tool from Earnings Reports
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371869
Armineh Nourbakhsh, M. Ghassemi, Steven Pomerville
In this paper, we present SPread, an automated financial metric extraction and spreading tool from earnings reports. The tool is created in a document-agnostic fashion, and uses an interpolation of tagging methods to capture arbitrarily complicated expressions. SPread can handle single-line items as well as metrics broken down into sub-items. A validation layer further improves the performance of upstream modules and enables the tool to reach an F1 performance of more than 87% for metrics expressed in tabular format, and 76% for metrics in free-form text. The results are displayed to end-users in an interactive web interface, which allows them to locate, compare, validate, adjust, and export the values.
Citations: 4
Workshop on Privacy in NLP (PrivateNLP 2020)
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371881
Oluwaseyi Feyisetan, S. Ghanavati, Patricia Thaine
Privacy-preserving data analysis has become essential in Machine Learning (ML), where access to vast amounts of data can provide large gains in the accuracy of tuned models. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants. It is therefore important for curated natural language datasets to preserve the privacy of the users whose data is collected and for the models trained on sensitive data to only retain non-identifying (i.e., generalizable) information. The workshop aims to bring together researchers and practitioners from academia and industry to discuss the challenges and approaches to designing, building, verifying, and testing privacy-preserving systems in the context of Natural Language Processing (NLP).
Citations: 7
Predicting Human Mobility via Attentive Convolutional Network
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371846
Congcong Miao, Ziyan Luo, Fengzhu Zeng, Jilong Wang
Predicting human mobility is an important trajectory mining task for various applications, ranging from smart city planning to personalized recommendation systems. While most previous work adopts GPS tracking data to model human mobility, the recent fast-growing geo-tagged social media (GTSM) data brings new opportunities to this task. However, predicting human mobility on GTSM data is not trivial because of three challenges: 1) extreme data sparsity; 2) high-order sequential patterns of human mobility; and 3) evolving user preferences for tagging. In this paper, we propose ACN, an attentive convolutional network model for predicting human mobility from sparse and complex GTSM data. In ACN, we first design a multi-dimension embedding layer that jointly embeds the key features (i.e., spatial, temporal, and user features) that govern human mobility. Then, we regard the embedded trajectory as an "image" and learn short-term sequential patterns as local features of the image using convolution filters. Instead of directly using conventional filters, we design hybrid dilated and separable convolution filters to effectively capture high-order sequential patterns from lengthy trajectories. In addition, we propose an attention mechanism that learns the user's long-term preference to augment the convolutional network for mobility prediction. We conduct extensive experiments on three publicly available GTSM datasets to evaluate the effectiveness of our model. The results demonstrate that ACN consistently outperforms existing state-of-the-art mobility prediction approaches on a variety of common evaluation metrics.
Citations: 13
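The "dilated and separable convolution filters" mentioned in the ACN abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no kernel sizes, dilation rates, or layer counts, so the function names (`dilated_depthwise`, `pointwise`, `receptive_field`) and all numbers below are hypothetical. The sketch only shows the general technique: a depthwise filter that skips `dilation - 1` time steps between taps, followed by a 1x1 (pointwise) mix across channels, plus the standard receptive-field formula for stacked dilated layers.

```python
from typing import List

Matrix = List[List[float]]  # indexed as [time][channel]


def dilated_depthwise(x: Matrix, kernels: Matrix, dilation: int) -> Matrix:
    """Apply one 1-D filter per channel, with gaps of `dilation` between taps.

    kernels[c] is the filter for channel c. No padding is used, so the
    output is shorter than the input by (kernel_size - 1) * dilation steps.
    """
    k = len(kernels[0])
    span = (k - 1) * dilation + 1  # time steps covered by one output value
    out = []
    for t in range(len(x) - span + 1):
        out.append([
            sum(kernels[c][j] * x[t + j * dilation][c] for j in range(k))
            for c in range(len(kernels))
        ])
    return out


def pointwise(x: Matrix, weights: Matrix) -> Matrix:
    """1x1 convolution: weights[o][c] mixes input channels at each time step."""
    return [
        [sum(w[c] * row[c] for c in range(len(row))) for w in weights]
        for row in x
    ]


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Input steps seen by one output after stacking layers with these dilations."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy "embedded trajectory": 5 check-ins, 2 embedding channels (made-up values).
trajectory = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
depthwise_out = dilated_depthwise(trajectory, [[1, 1], [1, -1]], dilation=2)
mixed = pointwise(depthwise_out, [[1, 1]])
```

With kernel size 3 and dilations 1, 2, 4, `receptive_field(3, [1, 2, 4])` gives 15 input steps from only three layers, which is the usual motivation for dilation on long sequences.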