
Proceedings of the 13th International Conference on Web Search and Data Mining: Latest Papers

SUM'20: State-based User Modelling
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371883
Sahan Bulathwela, M. Pérez-Ortiz, Rishabh Mehrotra, D. Orlic, C. D. L. Higuera, J. Shawe-Taylor, Emine Yilmaz
Capturing and effectively utilising user states and goals is becoming a timely challenge for successfully leveraging intelligent and user-centric systems in different web search and data mining applications. Examples of such systems are conversational agents, intelligent assistants, educational and contextual information retrieval systems, recommender/match-making systems and advertising systems, all of which rely on identifying the user state in order to provide the most relevant information and assist users in achieving their goals. There has been, however, limited work towards building such state-aware intelligent learning mechanisms. Hence, devising information systems that can keep track of the user's state has been listed as one of the grand challenges to be tackled in the next few years [1]. It is thus timely to organize a workshop that revisits the problem of designing and evaluating state-aware and user-centric systems, ensuring that the community (spanning academic and industrial backgrounds) works together to tackle these challenges.
Citations: 5
Epidemic Graph Convolutional Network
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371807
Tyler Derr, Yao Ma, Wenqi Fan, Xiaorui Liu, C. Aggarwal, Jiliang Tang
A growing recent trend is to harness the structure of today's big data, much of which can be represented as graphs. Simultaneously, graph convolutional networks (GCNs) have been proposed and have since seen rapid development. More recently, due to the scalability issues that arise when attempting to utilize these powerful models on real-world data, methodologies have turned to sampling techniques. More specifically, minibatches of nodes are formed and then sets of nodes are sampled to aggregate from in one or more layers. Among these methods, the two prominent approaches sample nodes from either a local or a global perspective. In this work, we first observe the similarities between the two sampling strategies and epidemic and diffusion network models. Then we harness this understanding to fuse together the benefits of sampling from both a local and a global perspective while alleviating some of the inherent issues found in both through the use of a low-dimensional approximation of the path-based Katz similarity measure. Our proposed framework, Epidemic Graph Convolutional Network (EGCN), is thus able to achieve improved performance over sampling from just one of the two perspectives alone. Empirical experiments are performed on several public benchmark datasets to verify its effectiveness over existing methodologies for the node classification task, and we furthermore present some empirical parameter analysis of EGCN.
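The Katz similarity mentioned in this abstract scores a pair of nodes by summing over the walks between them, weighting a walk of length l by beta^l. A minimal sketch of the truncated measure follows. The toy graph, the damping factor `beta`, and the cut-off `max_len` are illustrative assumptions, and the code computes the plain truncated measure, not the low-dimensional approximation that EGCN relies on.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def katz_similarity(adj, beta=0.1, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * A**l."""
    n = len(adj)
    sim = [[0.0] * n for _ in range(n)]
    # Start from the identity so the first multiplication yields A itself.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = 1.0
    for _ in range(max_len):
        power = mat_mul(power, adj)   # A**l counts walks of length l
        weight *= beta                # beta**l down-weights longer walks
        for i in range(n):
            for j in range(n):
                sim[i][j] += weight * power[i][j]
    return sim


# Hypothetical toy graph: an undirected path 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
sim = katz_similarity(adj)
```

On this toy graph the score of node 0 decreases with distance: the direct neighbour 1 scores highest, then node 2, then node 3.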
Citations: 27
Comparative Web Search Questions
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371848
Alexander Bondarenko, Pavel Braslavski, Michael Völske, Rami Aly, Maik Fröbe, Alexander Panchenko, Christian Biemann, Benno Stein, Matthias Hagen
We analyze comparative questions, i.e., questions asking to compare different items, that were submitted to Yandex in 2012. Responses to such questions might be quite different from the simple "ten blue links" and could, for example, aggregate pros and cons of the different options as direct answers. However, changing the result presentation is an intricate decision such that the classification of comparative questions forms a highly precision-oriented task. From a year-long Yandex log, we annotate a random sample of 50,000 questions, 2.8% of which are comparative. For these annotated questions, we develop a precision-oriented classifier by combining carefully hand-crafted lexico-syntactic rules with feature-based and neural approaches, achieving a recall of 0.6 at a perfect precision of 1.0. After running the classifier on the full year log (on average, there is at least one comparative question per second), we analyze 6,250 comparative questions using more fine-grained subclasses (e.g., should the answer be a "simple" fact or rather a more verbose argument) for which individual classifiers are trained. An important insight is that more than 65% of the comparative questions demand argumentation and opinions, i.e., reliable direct answers to comparative questions require more than the facts from a search engine's knowledge graph. In addition, we present a qualitative analysis of the underlying comparative information needs (separated into 14 categories like consumer electronics or health), their seasonal dynamics, and possible answers from community question answering platforms.
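A precision-oriented operating point such as the one reported in this abstract can be read off a scored validation set by accepting only items that score above every negative example. The sketch below shows that calculation. The function name, scores, and labels are invented for illustration and are not taken from the paper.

```python
def recall_at_perfect_precision(scores, labels):
    """Recall when only items scoring above every negative are accepted."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # With no negatives at all, every positive can be accepted.
    max_negative = max(negatives) if negatives else float("-inf")
    positives = sum(labels)
    accepted = sum(1 for s, y in zip(scores, labels)
                   if y == 1 and s > max_negative)
    return accepted / positives if positives else 0.0


# Hypothetical classifier scores; label 1 marks a comparative question.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30]
labels = [1, 1, 1, 0, 1, 1, 0]
recall = recall_at_perfect_precision(scores, labels)
```

Here three of the five positives score above the best-scoring negative (0.70), so the recall at precision 1.0 is 0.6 on this toy data.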
Citations: 22
A Context-Aware Click Model for Web Search
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371819
Jia Chen, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma
To better exploit search logs, various click models have been proposed to extract implicit relevance feedback from user clicks. Most traditional click models are based on probabilistic graphical models (PGMs) with manually designed dependencies. Recently, some researchers have also adopted neural methods to improve the accuracy of click prediction. However, most existing click models only model user behavior at the query level. As previous iterations within a session may have an impact on the current search round, we can leverage these behavior signals to better model user behavior. In this paper, we propose a novel neural Context-Aware Click Model (CACM) for Web search. CACM consists of a context-aware relevance estimator and an examination predictor. The relevance estimator utilizes session context information, i.e., the query sequence and clickthrough data, as well as pre-trained embeddings learned from a session-flow graph, to estimate the context-aware relevance of each search result. The examination predictor estimates the examination probability of each result. We further investigate several combination functions to integrate the context-aware relevance and examination probability into click prediction. Experimental results on a public Web search dataset show that CACM outperforms existing click models in both relevance estimation and click prediction tasks.
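To make the idea of a combination function concrete, the sketch below combines a relevance score and an examination probability in two ways. The product form follows the classic examination hypothesis, under which a result is clicked when it is both examined and relevant. The weighted form, its exponents `alpha` and `beta`, and all numbers are assumptions for illustration; the abstract does not say which functions CACM evaluates.

```python
def combine_product(relevance, examination):
    """Examination hypothesis: P(click) = P(examined) * relevance."""
    return relevance * examination


def combine_weighted(relevance, examination, alpha=1.0, beta=1.0):
    """Hypothetical variant with an exponent on each factor."""
    return (relevance ** alpha) * (examination ** beta)


# Invented per-rank estimates for a three-result page.
relevance = [0.9, 0.4, 0.7]
examination = [1.0, 0.6, 0.3]
clicks = [combine_product(r, e) for r, e in zip(relevance, examination)]
```

The third result is more relevant than the second, but its low examination probability leaves it with the lower predicted click probability.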
Citations: 33
Sequential Modeling of Hierarchical User Intention and Preference for Next-item Recommendation
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371840
N. Zhu, Jian Cao, Yanchi Liu, Yang Yang, Haochao Ying, Hui Xiong
Next-item recommendation has attracted great research interest, with both static and dynamic user preferences considered. Existing approaches typically utilize user-item binary relations and assume a flat preference distribution over items for each user. However, this assumption neglects the hierarchical distinction between user intentions and user preferences, leaving these methods with limited capacity to depict intention-specific preferences. In fact, a consumer's purchasing behavior involves a natural sequential process: the consumer first forms an intention to buy one type of item and then chooses a specific item according to their preference under this intention. To this end, we propose a novel key-array memory network (KA-MemNN), which takes both user intentions and preferences into account for next-item recommendation. Specifically, the user's behavioral intention tendency is determined through key addressing. Further, each array outputs an intention-specific preference representation of a user. Then, the degree of the user's behavioral intention tendency and the intention-specific preference representations are combined to form a hierarchical representation of the user. This representation is further utilized to replace the static user profile in traditional matrix factorization for the purposes of reasoning. Experimental results on real-world data demonstrate the advantages of our approach over state-of-the-art methods.
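The two-level structure described in this abstract can be pictured as attention over intention keys followed by a weighted mix of per-intention preference vectors. The sketch below is a generic illustration of that idea under assumed shapes and invented numbers. The function and variable names are hypothetical, and this is not the KA-MemNN architecture itself.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hierarchical_user_vector(query, keys, preferences):
    """Key addressing gives intention weights; mix preference vectors by them."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(preferences[0])
    user = [sum(w * p[d] for w, p in zip(weights, preferences))
            for d in range(dim)]
    return weights, user


# Hypothetical setup: two intentions, one key and one preference vector each.
query = [1.0, 0.0]                       # summary of recent user behaviour
keys = [[2.0, 0.0], [0.0, 2.0]]          # one key per intention
preferences = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights, user = hierarchical_user_vector(query, keys, preferences)
```

Because the query aligns with the first key, the first intention receives most of the weight and dominates the resulting user vector.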
Citations: 30
Decision Boundary of Deep Neural Networks: Challenges and Opportunities
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3372186
Hamid Karimi, Jiliang Tang
One crucial aspect of deep neural networks that remains largely unexplored, yet can inform us about their behavior, is their decision boundaries. Trust can be improved once we understand how and why deep models carve out a particular form of decision boundary and thus make particular decisions. Robustness against adversarial examples is directly related to the decision boundary, as adversarial examples are basically 'missed out' by the decision boundary between two classes. Investigating the decision boundary of deep neural networks nevertheless faces tremendous challenges. First, how can we generate instances near the decision boundary that are similar to real samples? Second, how can we leverage near-decision-boundary instances to characterize the behavior of deep neural networks? Motivated to solve these challenges, we focus on investigating the decision boundary of deep neural network classifiers. In particular, we propose a novel approach to generate instances near the decision boundary of pre-trained DNNs and then leverage these instances to characterize the behavior of deep models.
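As a rough picture of what "an instance near the decision boundary" means, the sketch below bisects the segment between two inputs that receive different labels until the two endpoints almost coincide. This is a generic textbook technique shown on a hypothetical linear classifier. It is not the generation approach proposed by the authors, which the abstract does not describe, and it makes no attempt to keep the result similar to real samples.

```python
def classify(x):
    """Hypothetical stand-in for a trained model: a fixed linear rule."""
    return 1 if 2.0 * x[0] - x[1] - 1.0 > 0 else 0


def near_boundary_pair(classifier, a, b, steps=30):
    """Bisect the segment a-b, keeping endpoints with different labels."""
    label_a = classifier(a)
    if label_a == classifier(b):
        raise ValueError("endpoints must receive different labels")
    for _ in range(steps):
        mid = [(p + q) / 2.0 for p, q in zip(a, b)]
        if classifier(mid) == label_a:
            a = mid   # midpoint is on a's side: move a towards the boundary
        else:
            b = mid   # midpoint is on b's side: move b towards the boundary
    return a, b


a, b = near_boundary_pair(classify, [0.0, 0.0], [2.0, 0.0])
```

After 30 halvings the two returned points differ by less than one millionth in their first coordinate while still receiving different labels, so both sit next to the toy boundary at x[0] = 0.5.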
Citations: 17
Network Analysis with Negative Links
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3372188
Tyler Derr
As we rapidly continue into the information age, the rate at which data is produced has created an unprecedented demand for novel methods to effectively and efficiently extract insightful patterns. Then, once paired with domain knowledge, we can seek to understand the past, make predictions about the future, and ultimately take actionable steps towards improving our society. Thus, because much of today's big data can be represented as graphs, emphasis is being placed on harnessing the natural structure of data through network analysis. Furthermore, many real-world networks can be better represented as signed networks, e.g., in an online social network such as Facebook, friendships can be represented as positive links while negative links can represent blocked users. Hence, because signed networks are ubiquitous, in this work we seek to provide a fundamental background on the domain, a hierarchical categorization of existing work highlighting both seminal and state-of-the-art contributions, and a curated collection of signed network datasets, and to discuss important future directions.
Citations: 7
Illustrate Your Story: Enriching Text with Images
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371866
Sreyasi Nag Chowdhury, William Cheng, Gerard de Melo, Simon Razniewski, G. Weikum
Human perception is known to be predominantly visual. As modern web infrastructure promoted the storage of media, the web-data paradigm shifted from text-only documents to those containing text and images. A multitude of blog posts, news articles, and social media posts exist on the Internet today as examples of multimodal stories. The manual alignment of images and text in a story is time-consuming and labor-intensive. We present a web application for automatically selecting relevant images from an album and placing them in suitable contexts within a body of text. The application solves a global optimization problem that maximizes the coherence of text paragraphs and image descriptors, and allows for exploring the underlying image descriptors and similarity metrics. Experiments show that our method can align images with texts with high semantic fit, and to the satisfaction of users.
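The difference between a global optimization and a greedy placement can be seen on a tiny example. The sketch below assumes a simplified formulation in which each image goes to a distinct paragraph and the total image-paragraph similarity is maximized by exhaustive search. The formulation, the function name, and the similarity values are assumptions for illustration; the abstract does not spell out the application's actual objective or solver.

```python
from itertools import permutations


def best_placement(similarity):
    """similarity[i][p] scores image i against paragraph p.

    Returns the assignment of images to distinct paragraphs with the
    highest total similarity, found by trying every assignment.
    """
    n_images = len(similarity)
    n_paragraphs = len(similarity[0])
    best_score, best_slots = float("-inf"), None
    for slots in permutations(range(n_paragraphs), n_images):
        score = sum(similarity[i][p] for i, p in enumerate(slots))
        if score > best_score:
            best_score, best_slots = score, slots
    return best_slots, best_score


# Invented similarities: two images, three paragraphs.
similarity = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
placement, score = best_placement(similarity)
```

A greedy pass would put the first image at paragraph 0 (its best match) and reach a total of at most 1.2, whereas the exhaustive search places it at paragraph 1 and the second image at paragraph 0 for a total of 1.65. Exhaustive search is only practical for small albums.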
Citations: 6
Stepwise Reasoning for Multi-Relation Question Answering over Knowledge Graph with Weak Supervision
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371812
Yunqi Qiu, Yuanzhuo Wang, Xiaolong Jin, Kun Zhang
Knowledge Graph Question Answering aims to automatically answer natural language questions using the well-structured relation information between entities stored in knowledge graphs. When faced with a multi-relation question, existing embedding-based approaches take the whole topic-entity-centric subgraph into account, resulting in high time complexity. Meanwhile, because data annotation is costly, it is impractical to label how a complex question should be answered step by step; only the final answer is labeled, which provides weak supervision. To address these challenges, this paper proposes a neural method based on reinforcement learning, namely the Stepwise Reasoning Network, which formulates multi-relation question answering as a sequential decision problem. The proposed model performs effective path search over the knowledge graph to obtain the answer, and leverages beam search to reduce the number of candidates significantly. Meanwhile, based on the attention mechanism and neural networks, the policy network can enhance the unique impact of different parts of a given question on triple selection. Moreover, to alleviate the delayed and sparse reward problem caused by weak supervision, we propose a potential-based reward shaping strategy, which can accelerate the convergence of the training algorithm and help the model perform better. Extensive experiments conducted on three benchmark datasets demonstrate the effectiveness of the proposed model, which outperforms the state-of-the-art approaches.
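The abstract names potential-based reward shaping without giving the potential function. The sketch below shows only the general form of that technique, in which a shaping term gamma * phi(s') - phi(s) is added to the environment reward. The hop-count potential, the toy path, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, List

def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    potential: Callable[[int], float],
    gamma: float = 0.99,
) -> float:
    """Return reward + gamma * phi(next_state) - phi(state)."""
    return reward + gamma * potential(next_state) - potential(state)

def hops_to_answer_potential(answer_hop: int) -> Callable[[int], float]:
    """Hypothetical potential: negative number of hops left to the answer."""
    def potential(hop: int) -> float:
        return -float(abs(answer_hop - hop))
    return potential

# Toy reasoning path of three hops; only the last hop is rewarded,
# which mimics the sparse, delayed reward of weak supervision.
phi = hops_to_answer_potential(answer_hop=3)
path: List[int] = [0, 1, 2, 3]
sparse_rewards: List[float] = [0.0, 0.0, 1.0]

shaped: List[float] = [
    shaped_reward(r, s, s_next, phi, gamma=1.0)
    for r, s, s_next in zip(sparse_rewards, path, path[1:])
]
print(shaped)
```

With gamma set to 1, the shaping terms telescope along the path, so the total shaped return differs from the original return only by phi(end) - phi(start). Every intermediate hop still receives a non-zero learning signal, which is the motivation the abstract gives for shaping.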
Citations: 95
All You Need Is Low (Rank): Defending Against Adversarial Attacks on Graphs
Pub Date : 2020-01-20 DOI: 10.1145/3336191.3371789
Negin Entezari, Saba A. Al-Sayouri, Amirali Darvishzadeh, E. Papalexakis
Recent studies have demonstrated that machine learning approaches such as deep learning methods are easily fooled by adversarial attacks. Recently, a highly influential study examined the impact of adversarial attacks on graph data and demonstrated that graph embedding techniques are also vulnerable to adversarial attacks. Fake users on social media and fake product reviews are examples of perturbations in graph data that are realistic counterparts of the adversarial models proposed. Graphs are widely used in a variety of domains, and it is highly important to develop graph analysis techniques that are robust to adversarial attacks. One of the recent studies on generating adversarial attacks for graph data is Nettack. The Nettack model has been shown to be very successful in deceiving the Graph Convolutional Network (GCN) model. Nettack is also transferable to other node classification approaches, e.g., node embeddings. In this paper, we explore the properties of Nettack perturbations, in search of effective defenses against them. Our first finding is that Nettack demonstrates a very specific behavior in the spectrum of the graph: only high-rank (low-valued) singular components of the graph are affected. Following that insight, we show that a low-rank approximation of the graph, which uses only the top singular components for its reconstruction, can greatly reduce the effects of Nettack and boost the performance of GCN when facing adversarial attacks. Indicatively, on the CiteSeer dataset, our proposed defense mechanism is able to reduce the success rate of Nettack from 98% to 36%. Furthermore, we show that tensor-based node embeddings, which by default project the graph into a low-rank subspace, are robust against Nettack perturbations. Lastly, we propose LowBlow, a low-rank adversarial attack that is able to affect the classification performance of both GCN and tensor-based node embeddings, and we show that the low-rank attack is noticeable and that making it unnoticeable results in a high-rank attack.
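The low-rank defense described above reconstructs the graph from only its top singular components. A minimal sketch of that idea with a truncated SVD in NumPy follows. The toy graph, the added edge, the chosen rank, and the function name are assumptions for illustration. The abstract does not say how the rank is selected or how the reconstructed matrix is passed to GCN, so neither is shown.

```python
import numpy as np

def low_rank_approximation(adjacency: np.ndarray, rank: int) -> np.ndarray:
    """Rebuild a matrix from its `rank` largest singular components."""
    u, s, vt = np.linalg.svd(adjacency, full_matrices=False)
    # Scale the kept left singular vectors by their singular values,
    # then project back with the kept right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
adjacency = np.array(
    [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ],
    dtype=float,
)

# Hypothetical perturbation: one extra edge between nodes 0 and 5.
perturbed = adjacency.copy()
perturbed[0, 5] = perturbed[5, 0] = 1.0

# Keep only the top two singular components (the rank is an assumed choice).
denoised = low_rank_approximation(perturbed, rank=2)
print(np.round(denoised, 2))
```

The result is a dense, real-valued matrix rather than a 0/1 adjacency matrix, so any downstream model has to accept weighted edges or apply its own thresholding. That step is left out here because the abstract does not describe it.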
Citations: 206