
Latest publications from The World Wide Web Conference

Sampled in Pairs and Driven by Text: A New Graph Embedding Framework
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313520
Liheng Chen, Yanru Qu, Zhenghui Wang, Lin Qiu, Weinan Zhang, Ken Chen, Shaodian Zhang, Yong Yu
In graphs with rich texts, incorporating textual information with structural information benefits the construction of expressive graph embeddings. Among various graph embedding models, random walk (RW)-based models form one of the most popular and successful groups. However, they are challenged by two issues when applied to graphs with rich texts: (i) sampling efficiency: starting from the training objective of RW-based models (e.g., DeepWalk and node2vec), we show that RW-based models are likely to generate large amounts of redundant training samples due to three main drawbacks; (ii) text utilization: these models have difficulty dealing with zero-shot scenarios, where graph embedding models have to infer graph structures directly from texts. To solve these problems, we propose a novel framework, namely Text-driven Graph Embedding with Pairs Sampling (TGE-PS). TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, reducing ~99% of training samples while preserving competitive performance. TGE-PS uses Text-driven Graph Embedding (TGE), an inductive graph embedding approach, to generate node embeddings from texts. Since each node contains rich texts, TGE is able to generate high-quality embeddings and provide reasonable predictions on the existence of links to unseen nodes. We evaluate TGE-PS on several real-world datasets, and experimental results demonstrate that TGE-PS produces state-of-the-art results on both traditional and zero-shot link prediction tasks.
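The sampling-efficiency issue described in this abstract can be illustrated with a toy sketch: DeepWalk-style enumeration of (center, context) pairs from a random walk produces many duplicate training samples whenever the walk revisits nodes, while sampling each distinct pair once avoids the redundancy. The walk, window size, and the dedup-by-set stand-in for PS below are illustrative assumptions, not the authors' implementation:

```python
def rw_skipgram_pairs(walks, window=2):
    """Enumerate (center, context) training pairs from random walks,
    DeepWalk-style; duplicate pairs are kept."""
    pairs = []
    for walk in walks:
        for i in range(len(walk)):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if i != j:
                    pairs.append((walk[i], walk[j]))
    return pairs

def pair_sampling(walks, window=2):
    """Toy stand-in for PS: keep each co-occurring (u, v) pair once."""
    return set(rw_skipgram_pairs(walks, window))

walk = [0, 1, 2, 1, 0, 1]   # a short walk that revisits nodes
dense = rw_skipgram_pairs([walk])
dedup = pair_sampling([walk])
```

On this six-step walk the windowed enumeration yields 18 training pairs but only 7 distinct ones; that gap is the kind of redundancy PS is designed to avoid.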
Citations: 1
The Illusion of Change: Correcting for Biases in Change Inference for Sparse, Societal-Scale Data
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313722
Gabriel Cadamuro, Ramya Korlakai Vinayak, J. Blumenstock, S. Kakade, Jacob N. Shapiro
Societal-scale data is playing an increasingly prominent role in social science research; examples from research on geopolitical events include questions on how emergency events impact the diffusion of information or how new policies change patterns of social interaction. Such research often draws critical inferences from observing how an exogenous event changes meaningful metrics like network degree or network entropy. However, as we show in this work, standard estimation methodologies make systematically incorrect inferences when the event also changes the sparsity of the data. To address this issue, we provide a general framework for inferring changes in social metrics when dealing with non-stationary sparsity. We propose a plug-in correction that can be applied to any estimator, including several recently proposed procedures. Using both simulated and real data, we demonstrate that the correction significantly improves the accuracy of the estimated change under a variety of plausible data generating processes. In particular, using a large dataset of calls from Afghanistan, we show that whereas traditional methods substantially overestimate the impact of a violent event on social diversity, the plug-in correction reveals the true response to be much more modest.
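The core bias described here can be seen in a small simulation: if the rate at which edges are observed drops after an event while the underlying behaviour is unchanged, a naive before/after comparison of mean degree reports a large spurious change, whereas dividing each estimate by its observation rate removes it. This is a toy illustration of the idea (with the rates assumed known), not the paper's plug-in estimator:

```python
import random

def observed_degree(true_degree, rate, rng):
    """Each of a node's edges is observed independently with prob `rate`."""
    return sum(1 for _ in range(true_degree) if rng.random() < rate)

rng = random.Random(0)
true_deg = 50                      # the true degree never changes
pre  = [observed_degree(true_deg, 0.8, rng) for _ in range(2000)]
post = [observed_degree(true_deg, 0.4, rng) for _ in range(2000)]   # only the data got sparser

naive_change = sum(post) / len(post) - sum(pre) / len(pre)              # conflates sparsity with change
corrected    = sum(post) / len(post) / 0.4 - sum(pre) / len(pre) / 0.8  # plug in the observation rates
```

The naive estimate reports a drop of roughly 20 edges per node where the corrected one reports essentially no change.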
Citations: 0
With a Little Help from My Friends (and Their Friends): Influence Neighborhoods for Social Recommendations
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313745
Avni Gulati, M. Eirinaki
Social recommendations have been a very intriguing domain for researchers in the past decade. The main premise is that the social network of a user can be leveraged to enhance the rating-based recommendation process. This has been achieved in various ways, under different assumptions about the network characteristics, structure, and availability of other information (such as trust, content, etc.). In this work, we create neighborhoods of influence leveraging only the social graph structure. These are in turn introduced into the recommendation process, both as a pre-processing step and as a social regularization factor of the matrix factorization algorithm. Our experimental evaluation using real-life datasets demonstrates the effectiveness of the proposed technique.
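A minimal sketch of matrix factorization with a social regularization term of the kind this abstract describes: each SGD update also pulls a user's latent vector toward the mean of their friends' vectors. The ratings, friend lists, and hyperparameters below are made up for illustration and are not the paper's model:

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgd_epoch(P, Q, R, friends, lr=0.05, lam=0.02, beta=0.1):
    """One SGD pass over observed ratings R[(u, i)], with a social term
    pulling each user's latent vector toward the mean of their friends'."""
    for (u, i), r in R.items():
        err = r - dot(P[u], Q[i])
        if friends[u]:
            mean = [sum(P[v][k] for v in friends[u]) / len(friends[u])
                    for k in range(len(P[u]))]
        else:
            mean = P[u]                       # no friends: social term vanishes
        new_pu = [p + lr * (err * q - lam * p - beta * (p - m))
                  for p, q, m in zip(P[u], Q[i], mean)]
        Q[i] = [q + lr * (err * p - lam * q) for p, q in zip(P[u], Q[i])]
        P[u] = new_pu

rng = random.Random(7)
P = {u: [rng.uniform(0.1, 0.5) for _ in range(4)] for u in range(3)}
Q = {i: [rng.uniform(0.1, 0.5) for _ in range(4)] for i in range(2)}
R = {(0, 0): 5.0, (1, 0): 4.0, (2, 1): 1.0}   # hypothetical ratings
friends = {0: [1], 1: [0], 2: []}

def sse():
    return sum((r - dot(P[u], Q[i])) ** 2 for (u, i), r in R.items())

before = sse()
for _ in range(300):
    sgd_epoch(P, Q, R, friends)
after = sse()
```

After training, the squared rating error drops from its initial value to near zero, while friends' latent vectors stay close to each other.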
Citations: 9
A Hierarchical Attention Retrieval Model for Healthcare Question Answering
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313699
Ming Zhu, Aman Ahuja, Wei Wei, C. Reddy
The growth of the Web in recent years has resulted in the development of various online platforms that provide healthcare information services. These platforms contain an enormous amount of information, which could be beneficial for a large number of people. However, navigating through such knowledge bases to answer specific queries of healthcare consumers is a challenging task. A majority of such queries might be non-factoid in nature, and hence traditional keyword-based retrieval models do not work well for such cases. Furthermore, in many scenarios, it might be desirable to get a short answer that sufficiently answers the query, instead of a long document with only a small amount of useful information. In this paper, we propose a neural network model for ranking documents for question answering in the healthcare domain. The proposed model uses a deep attention mechanism at word, sentence, and document levels, for efficient retrieval for both factoid and non-factoid queries, on documents of varied lengths. Specifically, the word-level cross-attention allows the model to identify words that might be most relevant for a query, and the hierarchical attention at sentence and document levels allows it to do effective retrieval on both long and short documents. We also construct a new large-scale healthcare question-answering dataset, which we use to evaluate our model. Experimental evaluation results against several state-of-the-art baselines show that our model outperforms the existing retrieval techniques.
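Word-level cross-attention of the kind described can be sketched as dot-product attention from each query word over the document words. This toy version (plain Python, no learned projections, two-dimensional vectors chosen by hand) just shows the mechanics:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(query_vecs, doc_vecs):
    """For each query word, compute dot-product attention weights over the
    document words and the attention-weighted document representation."""
    out = []
    for q in query_vecs:
        scores = [sum(a * b for a, b in zip(q, d)) for d in doc_vecs]
        w = softmax(scores)
        ctx = [sum(wi * d[k] for wi, d in zip(w, doc_vecs))
               for k in range(len(doc_vecs[0]))]
        out.append((w, ctx))
    return out

# one query word aligned with the first of two document words
w, ctx = cross_attention([[2.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])[0]
```

The document word that matches the query direction receives the larger attention weight, so the attended representation leans toward it.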
Citations: 39
From Small-scale to Large-scale Text Classification
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313563
Kang-Min Kim, Yeachan Kim, Jungho Lee, Ji-Min Lee, SangKeun Lee
Neural network models have achieved impressive results in the field of text classification. However, existing approaches often suffer from insufficient training data in large-scale text classification involving a large number of categories (e.g., several thousand categories). Several neural network models have utilized multi-task learning to overcome the limited amount of training data. However, these approaches are also limited to small-scale text classification. In this paper, we propose a novel neural network-based multi-task learning framework for large-scale text classification. To this end, we first treat the different scales of text classification (i.e., large and small numbers of categories) as multiple, related tasks. Then, we train the proposed neural network, which learns small- and large-scale text classification tasks simultaneously. In particular, we further enhance this multi-task learning architecture by using a gate mechanism, which controls the flow of features between the small- and large-scale text classification tasks. Experimental results clearly show that our proposed model improves the performance of the large-scale text classification task with the help of the small-scale text classification task. The proposed scheme exhibits significant improvements of as much as 14% and 5% in terms of micro-averaging and macro-averaging F1-score, respectively, over state-of-the-art techniques.
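A gate mechanism like the one described can be sketched element-wise: a sigmoid gate computed from both tasks' features decides how much of the small-scale-task signal flows into the large-scale task. The weights and biases below are fixed by hand purely to show the two extremes (gate fully open vs. fully closed); they are not the paper's learned parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_features(small_f, large_f, w_s, w_l, b):
    """Element-wise gate g = sigmoid(w_s*fs + w_l*fl + b); the output mixes
    the small-task and large-task features as g*fs + (1-g)*fl."""
    out = []
    for fs, fl, ws, wl, bk in zip(small_f, large_f, w_s, w_l, b):
        g = sigmoid(ws * fs + wl * fl + bk)
        out.append(g * fs + (1 - g) * fl)
    return out

# bias +10 opens the gate (small-task features pass), -10 closes it
open_gate   = gated_features([1.0, 1.0], [0.0, 0.0], [0, 0], [0, 0], [10, 10])
closed_gate = gated_features([1.0, 1.0], [0.0, 0.0], [0, 0], [0, 0], [-10, -10])
```

In training, the gate parameters would be learned jointly with both tasks, letting the model decide per feature how much small-scale signal to admit.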
Citations: 15
Bridging Screen Readers and Voice Assistants for Enhanced Eyes-Free Web Search
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3314136
Alexandra Vtyurina, Adam Fourney, M. Morris, Leah Findlater, Ryen W. White
People with visual impairments often rely on screen readers when interacting with computer systems. Increasingly, these individuals also make extensive use of voice-based virtual assistants (VAs). We conducted a survey of 53 people who are legally blind to identify the strengths and weaknesses of both technologies, as well as the unmet opportunities at their intersection. We learned that virtual assistants are convenient and accessible, but lack the ability to deeply engage with content (e.g., read beyond the first few sentences of Wikipedia), and the ability to get a quick overview of the landscape (list alternative search results & suggestions). In contrast, screen readers allow for deep engagement with content (when content is accessible), and provide fine-grained navigation & control, but at the cost of increased complexity, and reduced walk-up-and-use convenience. In this demonstration, we showcase VERSE, a system that combines the positive aspects of VAs and screen readers, and allows other devices (e.g., smart watches) to serve as optional input accelerators. Together, these features allow people with visual impairments to deeply engage with web content through voice interaction.
Citations: 40
Pcard: Personalized Restaurants Recommendation from Card Payment Transaction Records
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313494
Min Du, Robert Christensen, Wei Zhang, Feifei Li
Personalized Point of Interest (POI) recommendation that incorporates users' personal preferences is an important subject of research. However, challenges exist such as dealing with sparse rating data and spatial location factors. As one of the biggest card payment organizations in the United States, our company holds abundant card payment transaction records with numerous features. In this paper, using restaurant recommendation as a demonstrating example, we present a personalized POI recommendation system (Pcard) that learns user preferences based on user transaction history and restaurants' locations. With a novel embedding approach that captures user embeddings and restaurant embeddings, we model pairwise restaurant preferences with respect to each user based on their locations and dining histories. Finally, a ranking list of restaurants within a spatial region is presented to the user. The evaluation results show that the proposed approach is able to achieve high accuracy and present effective recommendations.
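Pairwise restaurant preference over embeddings can be sketched in BPR style: score each (user, restaurant) pair with a dot product and model the probability of preferring one restaurant over another as a logistic of the score difference. The embeddings and restaurant names below are hypothetical; this is a stand-in for the paper's model, not its implementation:

```python
import math

def score(user, rest):
    return sum(u * r for u, r in zip(user, rest))

def prefer_prob(user, rest_a, rest_b):
    """P(user prefers a over b): logistic of the score difference."""
    return 1.0 / (1.0 + math.exp(score(user, rest_b) - score(user, rest_a)))

def rank(user, rests):
    """Restaurant names in a region, ranked for one user."""
    return sorted(rests, key=lambda name: score(user, rests[name]), reverse=True)

user = [1.0, 0.2]   # hypothetical learned user embedding
rests = {"taqueria": [2.0, 0.0], "sushi_bar": [0.0, 2.0], "noodles": [1.0, 1.0]}
ranking = rank(user, rests)
```

The ranked list presented to the user then simply follows the per-restaurant scores within the spatial region of interest.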
Citations: 13
Evaluating Neural Text Simplification in the Medical Domain
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313630
Laurens Van den Bercken, Robert-Jan Sips, C. Lofi
Health literacy, i.e. the ability to read and understand medical text, is a relevant component of public health. Unfortunately, many medical texts are hard to grasp by the general population, as they are targeted at highly-skilled professionals and use complex language and domain-specific terms. Here, automatic text simplification that makes such text generally understandable would be very beneficial. However, research and development into medical text simplification is hindered by the lack of openly available training and test corpora that contain complex medical sentences and their aligned simplified versions. In this paper, we introduce such a dataset to aid medical text simplification research. The dataset is created by filtering aligned health sentences using expert knowledge from an existing aligned corpus and a novel, simple, language-independent monolingual text alignment method. Furthermore, we use the dataset to train a state-of-the-art neural machine translation model, and compare it to a model trained on a general simplification dataset using an automatic evaluation and an extensive human-expert evaluation.
Citations: 48
Learning Clusters through Information Diffusion
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313560
L. Ostroumova, Alexey Tikhonov, N. Litvak
When information or infectious diseases spread over a network, in many practical cases one can observe when nodes adopt information or become infected, but the underlying network is hidden. In this paper, we analyze the problem of finding communities of highly interconnected nodes, given only the infection times of nodes. We propose, analyze, and empirically compare several algorithms for this task. The most stable performance, which improves on the current state of the art, is obtained by our proposed heuristic approaches, which are agnostic to the particular graph structure and epidemic model.
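A simple heuristic in the spirit of this problem (not the authors' algorithms): treat two nodes as similar if they tend to be infected at nearby times within the same cascades, then group nodes greedily on that similarity. The cascades, window, and threshold below are made up for illustration:

```python
def similarity(cascades, u, v, window=1.0):
    """Fraction of cascades in which u and v are both infected within
    `window` time of each other; 0 if they never co-occur."""
    both = close = 0
    for times in cascades:
        if u in times and v in times:
            both += 1
            close += abs(times[u] - times[v]) <= window
    return close / both if both else 0.0

def greedy_clusters(nodes, cascades, threshold=0.5):
    """Single-link greedy grouping on the similarity above."""
    clusters = []
    for n in nodes:
        for c in clusters:
            if any(similarity(cascades, n, m) >= threshold for m in c):
                c.append(n)
                break
        else:
            clusters.append([n])
    return clusters

# two cascades; {a, b} always infected early and together, {c, d} late and together
cascades = [{"a": 0.0, "b": 0.3, "c": 5.0, "d": 5.2},
            {"a": 1.0, "b": 1.4, "c": 6.0, "d": 6.5}]
clusters = greedy_clusters(["a", "b", "c", "d"], cascades)
```

Only the infection times are used; the hidden edges of the network never appear in the computation.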
Citations: 13
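The task in this abstract — recovering communities from infection times alone — can be illustrated with a naive baseline (not one of the paper's algorithms; all names and the threshold are assumptions): nodes in the same community tend to get infected at similar times within a cascade, so link node pairs whose average infection-time gap across cascades is small, and report the connected components of that link graph as clusters.

```python
from itertools import combinations

def cluster_by_infection_times(cascades, nodes, eps=1.0):
    # cascades: list of {node: infection_time} dicts, one per observed spread.
    def gap(u, v):
        # Mean absolute infection-time gap over cascades infecting both nodes.
        diffs = [abs(c[u] - c[v]) for c in cascades if u in c and v in c]
        return sum(diffs) / len(diffs) if diffs else float("inf")

    # Union-find over nodes; link pairs with a small average gap.
    parent = {n: n for n in nodes}
    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n
    for u, v in combinations(nodes, 2):
        if gap(u, v) <= eps:
            parent[find(u)] = find(v)

    # Connected components of the link graph are the clusters.
    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())
```

This baseline is sensitive to the choice of `eps` and ignores cascade ordering; the point is only that infection timestamps alone carry enough signal to separate communities in easy cases.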
Learning Task-Specific City Region Partition
Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313704
Hongjian Wang, P. Jenkins, Hua Wei, Fei Wu, Z. Li
The proliferation of publicly accessible urban data provides new insights on various urban tasks. A frequently used approach is to treat each region as a data sample and build a model over all the regions to observe the correlations between urban features (e.g., demographics) and the target variable (e.g., crime count). To define regions, most existing studies use fixed grids or pre-defined administrative boundaries (e.g., census tracts or community areas). In reality, however, the definition of regions should differ depending on the task (e.g., regional crime count prediction vs. real estate price estimation). In this paper, we propose the new problem of task-specific city region partitioning, which aims to find the best partition of a city with respect to a given task. We prove this is an NP-hard search problem with no trivial solution. To learn the partition, we first study two variants of Markov Chain Monte Carlo (MCMC). We further propose a reinforcement learning scheme for effectively sampling the search space. We conduct experiments on two real datasets in Chicago (i.e., crime count and real estate price) to demonstrate the effectiveness of our proposed method.
{"title":"Learning Task-Specific City Region Partition","authors":"Hongjian Wang, P. Jenkins, Hua Wei, Fei Wu, Z. Li","doi":"10.1145/3308558.3313704","DOIUrl":"https://doi.org/10.1145/3308558.3313704","url":null,"abstract":"The proliferation of publicly accessible urban data provide new insights on various urban tasks. A frequently used approach is to treat each region as a data sample and build a model over all the regions to observe the correlations between urban features (e.g., demographics) and the target variable (e.g., crime count). To define regions, most existing studies use fixed grids or pre-defined administrative boundaries (e.g., census tracts or community areas). In reality, however, definitions of regions should be different depending on tasks (e.g., regional crime count prediction vs. real estate prices estimation). In this paper, we propose a new problem of task-specific city region partitioning, aiming to find the best partition in a city w.r.t. a given task. We prove this is an NP-hard search problem with no trivial solution. To learn the partition, we first study two variants of Markov Chain Monte Carlo (MCMC). We further propose a reinforcement learning scheme for effective sampling the search space. We conduct experiments on two real datasets in Chicago (i.e., crime count and real estate price) to demonstrate the effectiveness of our proposed method.","PeriodicalId":23013,"journal":{"name":"The World Wide Web Conference","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90349693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
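The abstract above frames partitioning as a search problem tackled with MCMC variants. A minimal sketch of that idea (not the paper's implementation; the loss function and all names are assumptions) is a Metropolis search over assignments of cells to k regions: propose moving one random cell to another region and accept the move with probability min(1, exp(-(new_loss - old_loss)/temp)), so improving moves are always kept and worsening moves are occasionally tolerated.

```python
import math
import random

def mcmc_partition(cells, k, loss_fn, steps=5000, temp=0.1, seed=0):
    # Metropolis search over assignments of cells to k regions.
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in cells]
    loss = loss_fn(cells, assign, k)
    for _ in range(steps):
        i = rng.randrange(len(cells))       # propose moving one cell
        old = assign[i]
        assign[i] = rng.randrange(k)
        new_loss = loss_fn(cells, assign, k)
        if new_loss <= loss or rng.random() < math.exp((loss - new_loss) / temp):
            loss = new_loss                 # accept the proposal
        else:
            assign[i] = old                 # reject: revert the move
    return assign, loss

def variance_loss(cells, assign, k):
    # Example task loss: sum of within-region variances of a cell statistic,
    # standing in for a real task objective such as crime-prediction error.
    total = 0.0
    for r in range(k):
        vals = [c for c, a in zip(cells, assign) if a == r]
        if vals:
            m = sum(vals) / len(vals)
            total += sum((v - m) ** 2 for v in vals)
    return total
```

In a real setting `loss_fn` would retrain or score the downstream task model on the candidate partition, and proposals would be restricted to moves that keep regions spatially contiguous.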